This repository is for testing the concepts learned during the nanodegree.
In this project we are going to use the IBM telco churn dataset.
I work for a telco company, and this dataset is interesting because churn is a common problem in the telco environment. Aside from that, it is a good working ground where I can apply most of the concepts discussed in the nanodegree and follow a formal approach to a data science project.
To run this code, install the libraries listed in the requirements file in this repository using conda:
conda install --file requirements.txt
Here is the file list in the repository:
.
└── data_science_nanodegre
├── LICENSE
├── README.md
├── checklist.txt
├── data
│ ├── src : Contains the images that give context to the results
│ │ ├── competition.png
│ │ ├── img
│ │ │ ├── churn distribution.png
│ │ │ └── correlation.png
│ │ ├── offer E high churn.png
│ │ ├── output.png
│ │ └── payment ciclye.png
│ └── telco.csv : Dataset containing all the data used for the analysis
├── notebook
│ └── eda.ipynb : Notebook file containing all the analysis
└── requirements.txt : Requirements file listing all the libraries used in the project
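For a quick look at the dataset outside the notebook, the following is a minimal sketch that uses only the Python standard library. It reads a CSV file and reports the share of rows for each value in one column. The function name `value_shares` is hypothetical, and the column name `Churn Label` is an assumption about the header in `data/telco.csv`; adjust it to whatever the file actually contains. This is not the method used in `eda.ipynb`, only an illustration.

```python
import csv
from collections import Counter
from pathlib import Path


def value_shares(csv_path, column):
    """Return the share of rows for each distinct value found in `column`."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a helpful message if the column is not in the header.
        if column not in (reader.fieldnames or []):
            raise ValueError(
                f"Column {column!r} not found; available: {reader.fieldnames}"
            )
        counts = Counter(row[column] for row in reader)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


if __name__ == "__main__":
    # Path taken from the file list above; the column name is an assumption.
    data_file = Path("data/telco.csv")
    if data_file.exists():
        try:
            print(value_shares(data_file, "Churn Label"))
        except ValueError as error:
            print(error)
```

Run from the repository root, the script prints a dictionary of shares if the file and column exist; if the column is missing, it prints the list of available columns so the name can be corrected.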
The main findings of the analysis are described in the post available here.
Thank you, IBM, for providing such a good dataset.
@misc{ibm_team_2024, title={Telco customer churn (11.1.3+)}, url={https://www.kaggle.com/dsv/8360350}, DOI={10.34740/KAGGLE/DSV/8360350}, publisher={Kaggle}, author={IBM Team}, year={2024} }