As people spend more time on mobile apps, it is becoming harder for developers and businesses to build a popular app.
In this project, I predict the number of installs (the target variable) from features of the app itself using machine learning models. The goal is to find out which kinds of apps are more popular and tend to stay longer on people's phones. The number of installs is divided into four groups, labeled 0, 1, 2, and 3, which makes this a classification problem. Four models are used: random forest, SVC, k-NN, and XGBoost.
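As a rough illustration of how a raw install count could be turned into one of the four class labels, here is a minimal sketch. The cut points in `INSTALL_EDGES` and the helper name `install_group` are hypothetical, since this README does not state the thresholds actually used in the project.

```python
from bisect import bisect_right

# Hypothetical bin edges (assumption): the real cut points are not given here.
INSTALL_EDGES = [10_000, 1_000_000, 100_000_000]


def install_group(installs: int) -> int:
    """Map a raw install count to a class label in {0, 1, 2, 3}.

    A count below the first edge gets label 0; a count equal to or above
    an edge falls into the next group.
    """
    return bisect_right(INSTALL_EDGES, installs)


counts = [500, 10_000, 5_000_000, 1_000_000_000]
labels = [install_group(c) for c in counts]
print(labels)
```

With labels of this kind, the task becomes a four-class classification problem that any of the listed models can be trained on.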
- Utility code, such as preprocessing code and model pipelines, is saved in the src folder
- Prediction results and EDA plots are provided as notebook files in the results folder
- Source data can be found in the data folder
- Project details and learnings can be found in the reports folder
- Plots can be found in the rfigures folder
This project requires the following package versions:
- ipython==7.6.1
- jupyter==1.0.0
- jupyter-client==5.3.1
- jupyter-console==6.0.0
- jupyter-core==4.5.0
- jupyterlab==1.0.2
- jupyterlab-server==1.0.0
- jupytext==1.2.4
- matplotlib==3.1.0
- numpy==1.16.4
- pandas==0.24.2
- scikit-learn==0.21.2
- scipy==1.3.0
- seaborn==0.9.0
- xgboost==0.90
- shap==0.32.1
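To check whether an environment matches these pins, something like the following sketch may help. It assumes Python 3.8 or later (for `importlib.metadata`), copies only a few of the pins for brevity, and uses a hypothetical helper name, `check_pins`, that is not part of this repository.

```python
from importlib import metadata

# A few pins copied from the list above; extend the dict as needed.
PINNED = {
    "numpy": "1.16.4",
    "pandas": "0.24.2",
    "scikit-learn": "0.21.2",
    "xgboost": "0.90",
}


def check_pins(pins):
    """Return {package: (wanted, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_pins(PINNED).items():
        print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at exactly the pinned version.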