PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3 | ||
K3 | K4 | K5 | K5 | K6 | - | - | - | - | - | - | - | K3 | K3 | K6 | ||
CO1 | K2 | 2 | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 1 |
CO2 | K3 | 3 | 2 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 1 |
CO3 | K4 | 3 | 3 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 2 |
CO4 | K3 | 3 | 2 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 1 |
CO5 | K3 | 3 | 2 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 1 |
Score | 14 | 11 | 9 | 9 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 14 | 7 | |
Course Mapping | 3 | 3 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 2 |
{{{credits}}}
L | T | P | C |
2 | 0 | 2 | 3 |
- To understand the competitive advantages of big data analytics
- To understand the distributed storage for big data
- To learn distributed method for processing of big data
- To understand how to represent unstructured data using NoSQL and processing
- To learn how statistical methods used for analyzing big data
{{{unit}}}
Unit I | Introduction to Big Data | 7 |
Introduction – Understanding Big Data – Big Data: Benifitting – Managing – Organizing and Analyzing Big Data: Learning and Analytics; Technology Challenges for Big Data.
{{{unit}}}
Unit II | HDFS | 9 |
Introduction – Distributed File System – Google File System – HDFS Design Goals – Using HDFS.
{{{unit}}}
Unit III | Data Processing using MapReduce | 10 |
Introduction – MapReduce Overview – Working of MapReduce – Programming – Writing and Testing MapReduce Programs
{{{unit}}}
Unit IV | NoSQL | 9 |
Introduction to NoSQL – Characteristics of NoSQL – NoSQL Storage Types – Advantages and Drawbacks - NoSQL Database Framework: Hive and HBase
{{{unit}}}
Unit V | Data Analysis | 10 |
Statistical Methods: Regression modelling – Multivariate analysis; Classification: SVM – Decision Trees; Linear Classifiers
Upon the completion of the course the students should be able to:
- Understand how to leverage the insights from big data analytics (K2)
- Understand and apply distributed computing for better storage of data (K3)
- Develop applications using Hadoop related tools(K4)
- Use database frameworks like Hive and Hbase for data analysis(K3)
- Solve applications using statistical and data analytic methods (K3)
- Parag Kulkarni, Sarang Joshi, “Big Data Anlytics”, PHI Learning Pvt. Ltd, 2016.
- Anil Maheshwari, “Big Data Essentials”, McGraw Hill, 2019
- Arshdeep Bahga, Vijay Madisetti, “Big Data Analytics: A Hands-On Approach”, Published by A Hands-on Approach Textbooks, 2016.
- Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, John Wiley & sons, 2012.
- Gaurav Vaish, “Getting Started with NoSQL”, Packt Publishing Ltd, 2013.
- E. Capriolo, D. Wampler, and J. Rutherglen, “Programming Hive”, O’Reilly, 2012.
- Lars George, “HBase: The Definitive Guide”, O’Reilly, 2011.
- Hadoop
- Applications using Map-Reduce programming (Examples: word count
/ frequency programs / matrix multiplication)
- R
- Linear and logistic Regression (Loan prediction using Credit approval dataset, Sales prediction using Bigmart dataset)
- SVM / Decision tree classification techniques (Flower type classification based on available attributes using Iris dataset, Passengers survival classification using titanic dataset)
- Clustering (Document categorization by multiclass techniques)
- Visualize data using any plotting framework
- Database
- Application that stores data in Hbase (Sentiment analysis using twitter dataset)
\hfill Total: 60