The main objective of this project is to develop a robust classification model that distinguishes fraudulent vehicle insurance claims from genuine ones.
The dataset comprises information collected from various sources, including census bureaus, to analyze patterns in insurance claims. It consists of 38 features, including the target variable indicating the authenticity of the claim.
The project commenced with data collection from multiple sources, followed by extensive data validation and insertion into a SQLite database. The data was then segregated into valid and fraudulent categories based on predefined schema rules. Key steps in the project lifecycle include data preprocessing (addressing class imbalance with imbalanced-learn's RandomOverSampler), clustering with K-means, model selection (evaluating multiple models and choosing the top performers by accuracy and AUC score), model building (using XGBoost and SVC), hyperparameter optimization (using GridSearchCV), and finally deployment of the models on Google Cloud Platform (GCP), Amazon Web Services (AWS), or Heroku. API testing with Postman and logging at each stage keep the pipeline transparent and traceable. The codebase follows object-oriented programming principles.
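As a rough illustration of the validation, SQLite insertion, and oversampling steps described above, here is a minimal sketch using only the Python standard library. The schema, table name, column names, and sample rows are hypothetical and are not taken from the project. The `random_oversample` function is a hand-written stand-in for imbalanced-learn's RandomOverSampler, not the project's actual code.

```python
import random
import sqlite3
from collections import Counter

# Hypothetical schema (column name -> expected type); not the project's real schema.
SCHEMA = {"policy_number": int, "claim_amount": float, "fraud_reported": str}


def is_valid(row):
    """Return True if the row has exactly the schema's columns with the expected types."""
    return set(row) == set(SCHEMA) and all(
        isinstance(row[col], typ) for col, typ in SCHEMA.items()
    )


def load_valid_rows(rows, conn):
    """Insert rows that pass validation into SQLite and return the rejected rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims "
        "(policy_number INTEGER, claim_amount REAL, fraud_reported TEXT)"
    )
    good = [r for r in rows if is_valid(r)]
    bad = [r for r in rows if not is_valid(r)]
    conn.executemany(
        "INSERT INTO claims VALUES (:policy_number, :claim_amount, :fraud_reported)",
        good,
    )
    conn.commit()
    return bad


def random_oversample(rows, label_index, seed=0):
    """Duplicate randomly chosen minority-class rows until every class has equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_index], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the majority-class size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Made-up sample data: four genuine claims, one fraudulent claim, one malformed row.
raw_rows = [
    {"policy_number": 1, "claim_amount": 5200.0, "fraud_reported": "N"},
    {"policy_number": 2, "claim_amount": 810.5, "fraud_reported": "N"},
    {"policy_number": 3, "claim_amount": 64000.0, "fraud_reported": "Y"},
    {"policy_number": 4, "claim_amount": 1200.0, "fraud_reported": "N"},
    {"policy_number": 5, "claim_amount": 300.0, "fraud_reported": "N"},
    {"policy_number": 6, "claim_amount": "n/a", "fraud_reported": "N"},  # wrong type
]

conn = sqlite3.connect(":memory:")
rejected = load_valid_rows(raw_rows, conn)
stored = conn.execute(
    "SELECT policy_number, claim_amount, fraud_reported FROM claims"
).fetchall()
balanced = random_oversample(stored, label_index=2)
class_counts = Counter(row[2] for row in balanced)
print(len(rejected), len(stored), dict(class_counts))
```

In a real pipeline, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.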
- Made with Python
- Libraries/Frameworks:
  - Numpy
  - Pandas
  - Scikit-learn (sklearn)
  - Scipy
  - Seaborn
  - Matplotlib
- Development Tools:
  - PyCharm
  - Jupyter Notebook
  - Conda
- Other Tools:
  - SQLite
  - Flask
  - Postman
- Additional Resources:
  - Stack Overflow
  - HTML
The project requires Python 3.9 or above. If you don't have Python installed, you can download it from the official website. Users with an older Python version should install a newer release or create a new Conda environment with the required version; pip installs packages, not the Python interpreter itself. To install all necessary packages and libraries, follow these steps after cloning the repository:
conda create -n myenv python=3.9
conda activate myenv
pip install -r requirements.txt
Then run the application:
python main.py
Make sure your working directory is the root folder of the project before running these commands.