Project work as part of Udacity Bertelsmann Data Science Challenge
-
This is one of the most popular datasets on Kaggle. I chose it because many analyses of it are available online, which makes it a good starting point for beginners.
-
Our task is to do further analysis of the Titanic dataset and see what information we can draw from it.
-
Feel free to use Google to get ideas, and try to understand how the different analyses are derived from this dataset.
-
Contribute at least one meaningful analysis per person and provide the steps to reproduce it. (If possible, open a pull request, and I'll allow commits to my repository. You can also send me the link to your own git repository and I'll add it to the main README file.) If you are not comfortable with git yet, don't worry, we'll get there. Focus first on getting the .ipynb file to run.
-
We can publish a Medium post, "Initial analysis of the Titanic dataset", with all the visuals we come up with, and list all our names as contributors. That can count toward our social media presence! :)
- Anaconda (Link: https://www.anaconda.com/download/#linux)
- Install the Anaconda Python 3.6 version (available for all operating systems)
- Set up git and familiarize yourself with git commands (a great course by Udacity: https://classroom.udacity.com/courses/ud775-india )
We can agree on a convenient time during the weekend and go through everyone's contributions. That way we'll all be able to bring some insights to this dataset.
Once you are ready to begin, please fill in this Google Sheet with the analysis you plan to do: https://docs.google.com/spreadsheets/d/1N6VcEDzF2y9JLhu6W67wfqcOoRE6jCP9uteWKBy0RKM/edit?usp=sharing This way we can avoid two people working on the same analysis. Feel free to make a plot, derive insights, or perform calculations. Well, the field is yours! :)
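If you are unsure where to start, here is a minimal sketch of one possible calculation: the survival rate per group. It assumes the column names `Survived`, `Sex` and `Pclass` used in the Kaggle CSV; the helper `survival_rate_by` and the tiny sample are made up for illustration and are not real passenger data. It uses only the standard library so it runs anywhere; in a notebook you would more likely use pandas, which ships with Anaconda.

```python
import csv
import io
from collections import defaultdict


def survival_rate_by(rows, column):
    """Return {group value: fraction survived} for the given column.

    Hypothetical helper: expects dict-like rows with a 0/1 `Survived` field.
    """
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        survived[key] += int(row["Survived"])
    return {key: survived[key] / totals[key] for key in totals}


# Tiny made-up sample for illustration only; NOT real passenger data.
SAMPLE_CSV = """Pclass,Sex,Survived
1,female,1
1,male,0
2,female,1
2,male,1
3,female,0
3,male,0
3,male,0
3,female,1
"""

# csv.DictReader yields every value as a string, so group keys are strings.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates_by_sex = survival_rate_by(rows, "Sex")
rates_by_class = survival_rate_by(rows, "Pclass")
print(rates_by_sex)
print(rates_by_class)
```

To run this on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for your downloaded CSV (the file name depends on where you saved it).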
The more analyses we derive, the merrier! :) Let us learn together!