{{{credits}}}
L | T | P | C |
2 | 0 | 2 | 3 |
- To gain a basic knowledge of the concepts and tools of machine learning.
- To learn data pre-processing methods and be able to apply them to datasets.
- To understand the working of supervised and unsupervised algorithms.
- To learn model evaluation methods and apply them for validation.
{{{unit}}}
Unit I | Introduction to Machine Learning | 9 |
Why Machine Learning: Problems Machine Learning Can Solve – Task and Data; scikit-learn; Essential Libraries and Tools: Jupyter Notebook – NumPy – SciPy – matplotlib – pandas – mglearn; Classifying Iris Species: Data – Measuring Success – Training and Testing Data – Building a First Model: k-Nearest Neighbors – Making Predictions – Evaluating the Model.
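As an illustration of this unit's end-to-end workflow, a minimal scikit-learn sketch (the dataset loader, default split ratio, and n_neighbors=1 are illustrative choices, not prescribed by the syllabus):

```python
# Unit I workflow sketch: load Iris, split into training and test sets,
# fit a k-nearest neighbors model, and evaluate it on the test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)   # illustrative choice of k
knn.fit(X_train, y_train)

print("Test set accuracy: {:.2f}".format(knn.score(X_test, y_test)))
```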
{{{unit}}}
Unit II | Data Pre-processing | 9 |
Preprocessing and Scaling: Different Kinds of Preprocessing – Applying Data Transformations – Scaling Training and Test Data; Categorical Variables: One-Hot-Encoding – Numbers to Encode Categoricals; Binning, Discretization, Linear Models, and Trees; Interactions and Polynomials; Univariate Nonlinear Transformations; Automatic Feature Selection: Univariate Statistics – Model-Based Feature Selection – Iterative Feature Selection.
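A short sketch of two Unit II ideas, scaling and one-hot encoding; the breast cancer dataset and the toy categorical column are assumptions made only for illustration:

```python
# Unit II sketch: fit a scaler on the training data only and reuse it on the
# test data, then one-hot-encode a small categorical column with pandas.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

scaler = StandardScaler().fit(X_train)      # statistics learned from training data
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)    # same statistics applied to test data

df = pd.DataFrame({"city": ["Chennai", "Mumbai", "Chennai", "Delhi"]})
print(pd.get_dummies(df))                   # one indicator column per category
```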
{{{unit}}}
Unit III | Supervised Learning | 9 |
Classification and Regression; Generalization, Overfitting, and Underfitting; Relation of Model Complexity to Dataset Size; Supervised Machine Learning Algorithms: Sample Datasets – k-Nearest Neighbors – Linear Models – Naive Bayes Classifiers – Decision Trees – Ensembles of Decision Trees – Bagging – Random Forests – Boosting – Neural Networks (Deep Learning).
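To make the generalization discussion concrete, a small sketch comparing training and test accuracy for two of the listed model families (the dataset and hyperparameters are illustrative assumptions):

```python
# Unit III sketch: compare a linear model and a random forest, and contrast
# training vs. test accuracy to reason about under- and overfitting.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

for model in (LogisticRegression(max_iter=5000),
              RandomForestClassifier(n_estimators=100, random_state=0)):
    model.fit(X_train, y_train)
    print(type(model).__name__,
          "train: {:.2f}".format(model.score(X_train, y_train)),
          "test: {:.2f}".format(model.score(X_test, y_test)))
```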
{{{unit}}}
Unit IV | Unsupervised Learning | 9 |
Types of Unsupervised Learning; Challenges; Dimensionality Reduction: Principal Component Analysis (PCA); Clustering: k-Means Clustering – Agglomerative Clustering – Comparing and Evaluating Clustering Algorithms.
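A brief sketch of this unit's pipeline, PCA followed by two clustering algorithms compared with the adjusted Rand index; the Iris dataset and the choice of three clusters are illustrative assumptions:

```python
# Unit IV sketch: reduce Iris to two principal components, cluster with
# k-means and agglomerative clustering, and compare the resulting labelings.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

X, y = load_iris(return_X_y=True)
X_pca = PCA(n_components=2).fit_transform(X)            # dimensionality reduction

km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_pca)
agg_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X_pca)

# Adjusted Rand Index: 1.0 means identical groupings, ~0 means chance agreement
print("k-means vs. true species:      ", adjusted_rand_score(y, km_labels))
print("agglomerative vs. true species:", adjusted_rand_score(y, agg_labels))
```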
{{{unit}}}
Unit V | Model Evaluation and Improvement | 9 |
Cross-Validation: Cross-Validation in scikit-learn – Benefits of Cross-Validation – Stratified k-Fold Cross-Validation and Other Strategies; Grid Search: Simple Grid Search – The Danger of Overfitting the Parameters and the Validation Set – Grid Search with Cross-Validation; Evaluation Metrics and Scoring: Keep the End Goal in Mind – Metrics for Binary Classification – Metrics for Multiclass Classification – Regression Metrics – Using Evaluation Metrics in Model Selection.
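A compact sketch tying the unit together: cross-validation, grid search with cross-validation, and an evaluation report on a held-out test set (the SVC model, parameter grid, and dataset are illustrative assumptions):

```python
# Unit V sketch: cross-validation scores, a grid search over SVC parameters,
# and per-class metrics on data never touched during the search.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), random_state=0)

print("5-fold CV scores:", cross_val_score(SVC(), X_train, y_train, cv=5))

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
grid = GridSearchCV(SVC(), param_grid, cv=5)            # grid search with CV
grid.fit(X_train, y_train)

print("Best parameters:", grid.best_params_)
print(classification_report(y_test, grid.predict(X_test)))
```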
- Perceptron and Linear Regression (a starter sketch follows this list)
- Multi-layer Perceptron
- Support Vector Machine
- Decision Tree Algorithm
- k-Nearest Neighbors Algorithm
- k-Means Clustering
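A possible starting point for the first exercise above, using scikit-learn's Perceptron and LinearRegression estimators; the datasets and settings here are illustrative assumptions, not a prescribed solution:

```python
# Exercise starter sketch: a perceptron classifier on Iris and an ordinary
# least-squares linear regression on the diabetes dataset.
from sklearn.datasets import load_iris, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Perceptron, LinearRegression

# Perceptron (classification)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    *load_iris(return_X_y=True), random_state=0)
clf = Perceptron(max_iter=1000, random_state=0).fit(Xc_train, yc_train)
print("Perceptron test accuracy: {:.2f}".format(clf.score(Xc_test, yc_test)))

# Linear regression
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    *load_diabetes(return_X_y=True), random_state=0)
reg = LinearRegression().fit(Xr_train, yr_train)
print("Linear regression test R^2: {:.2f}".format(reg.score(Xr_test, yr_test)))
```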
\hfill Total: 60
After the completion of this course, students will be able to:
- Understand the basic concepts of machine learning (K2).
- Apply data preprocessing and feature selection techniques to datasets (K4).
- Apply supervised and unsupervised techniques for different applications (K4).
- Evaluate and analyse the models using appropriate metrics (K4).
- Andreas C. Müller and Sarah Guido, “Introduction to Machine Learning with Python”, O’Reilly Media, 2016.
- Aurélien Géron, “Hands-On Machine Learning with Scikit-Learn and TensorFlow”, O’Reilly Media, 2016.
- Sebastian Raschka and Vahid Mirjalili, “Python Machine Learning”, Second Edition, Packt Publishing, 2017.
- Stephen Marsland, “Machine Learning - An Algorithmic Perspective”, Second Edition, Chapman and Hall/CRC Machine Learning and Pattern Recognition Series, 2014.