{{{credits}}}
L | T | P | C |
3 | 0 | 0 | 3 |
- To learn fundamentals of Data Science using Python
- To understand probability distributions and statistical inferences
- To be familiar with supervised and unsupervised methods in machine learning
- To explore the algorithms used for analysing massive data problems and social networks
- To learn about topic models and graphical models
{{{unit}}}
UNIT I | DATA SCIENCE AND PYTHON | 9 |
Introduction: Computational tools – Need for data science – Causality and experiments; Array Computing in Python: Vectors – Arrays – Advanced vectorization of functions – Higher-dimensional Arrays: Matrices and arrays; Dictionaries and Strings.
{{{unit}}}
UNIT II | PROBABILITY AND STATISTICS | 9 |
Randomness – Empirical Distributions – Testing Hypotheses – Estimation – Why the mean matters – Prediction – Inference for Regression.
{{{unit}}}
UNIT III | MACHINE LEARNING | 9 |
Perceptron algorithm – Kernel functions – Overfitting and uniform convergence – Regularization – Support Vector Machines – Strong and weak learning – Stochastic Gradient Descent.
{{{unit}}}
UNIT IV | DATA STREAMS AND CLUSTERING | 9 |
Algorithms for Massive Data Problems: Frequency moments of data streams – Matrix algorithms using sampling; Clustering: k-Means clustering – Spectral clustering – Community finding and graph partitioning.
{{{unit}}}
UNIT V | TOPIC MODELS AND GRAPHICAL MODELS | 9 |
Topic Models – Nonnegative matrix factorization – Latent Dirichlet allocation – Hidden Markov models – Bayesian Belief Networks – Markov Random Fields.
\hfill Total Periods: 45
After the completion of this course, students will be able to:
- Develop Python programs to perform analysis on data (K3)
- Understand various probability distributions and statistical inferences (K2)
- Develop applications to demonstrate machine learning algorithms in practice (K3)
- Understand the principles of handling data streams (K2)
- Discuss topic and graphical modeling techniques for real-world problems (K2)
- Ani Adhikari, John DeNero, “Computational and Inferential Thinking: The Foundations of Data Science”, GitBook, 2017. (Units I, II)
- Avrim Blum, John Hopcroft, Ravindran Kannan, “Foundations of Data Science”, preliminary version of a textbook, 2016. (Units III, IV, V)
- Hans Petter Langtangen, “A Primer on Scientific Programming with Python”, 4th Edition, Springer, 2016. (Unit I)
- Jonathan Dinu, “Foundations of Data Science: A Practical Introduction to Data Science with Python”, Addison-Wesley Data & Analytics Series, 2016.
- Jure Leskovec, Anand Rajaraman, Jeffrey Ullman, “Mining of Massive Datasets”, V2.1, Cambridge University Press, 2014.
- EMC Education Services, “Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data”, Wiley publishers, 2015.
- Cathy O’Neil, Rachel Schutt, “Doing Data Science: Straight Talk from the Frontline”, O’Reilly, 2014.
PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3 | ||
K3 | K6 | K6 | K6 | K6 | - | - | - | - | - | - | - | K6 | K5 | K6 | ||
CO1 | K3 | 3 | 2 | 2 | ||||||||||||
CO2 | K2 | 2 | 1 | 1 | ||||||||||||
CO3 | K3 | 3 | 2 | 2 | 2 | |||||||||||
CO4 | K2 | 2 | 1 | 1 | ||||||||||||
CO5 | K2 | 2 | 1 | 1 | ||||||||||||
Score | 12 | 7 | 2 | 7 | ||||||||||||
Course Mapping | 3 | 2 | 2 | 2 |