Sparkify

This repository contains my capstone project for the Udacity Data Scientist Nanodegree Program. In this project, I analyze the Sparkify data to predict customer churn.

Sparkify is a simulated dataset for a subscription-based company that provides a music service like Spotify or Apple Music. Customer churn prediction is a common and challenging task for data scientists and analysts who want to improve a company's business. Processing and analyzing large amounts of data with Spark is also an essential skill in the data field.

🚀 Table of contents

  1. Prerequisites
  2. Project Motivation
  3. Instructions
  4. Results
  5. Acknowledgements

Prerequisites

This project uses the following library:

  • PySpark

Instructions

  1. Install PySpark
  2. Run the notebook Sparkify.ipynb
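
Before opening the notebook, it can help to confirm that PySpark is importable in the active environment. The snippet below is a minimal sketch, not part of the project itself: the helper name `pyspark_version` is hypothetical, and the `pip install pyspark` hint assumes a pip-based setup.

```python
import importlib.util
from importlib import metadata


def pyspark_version():
    """Return the installed PySpark version string, or None if PySpark is missing.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # find_spec returns None when the top-level package cannot be imported.
    if importlib.util.find_spec("pyspark") is None:
        return None
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        # Importable but not installed as a distribution
        # (for example, bundled with a standalone Spark download).
        return "unknown"


if __name__ == "__main__":
    version = pyspark_version()
    if version is None:
        # Assumption: a pip-based environment.
        print("PySpark not found; install it first, e.g. with `pip install pyspark`.")
    else:
        print(f"PySpark {version} is available; open Sparkify.ipynb next.")
```

If the check reports that PySpark is missing, complete step 1 and run it again before starting the notebook.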

Results

The findings of this project have been published here.

Acknowledgements

This project uses the simulated data from Sparkify.

The code is inspired by the Udacity Data Scientist Nanodegree Program.

🔨 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project.
  2. Create your Feature Branch (git checkout -b feature/Feature).
  3. Commit your Changes (git commit -m 'Add some feature').
  4. Push to the Branch (git push origin feature/Feature).
  5. Open a Pull Request.

📫 Contact