Covid-Data-Project


This project analyses data from a Covid dataset.

The goal is to analyse the daily Covid data received from the GOI website and transform it into the format below:


| Date | State | Confirmed | Recovered | Deceased | Total tested |
| --- | --- | --- | --- | --- | --- |

Technology stack: Hadoop, Hive and Spark


Data Pipeline:


[Figure: covid19 dp]


Mapping: The table below shows, for each report column, the source file and source field it is read from and the rule applied to it.


| Report Fields | Source file | Source field | Rule |
| --- | --- | --- | --- |
| Date | Raw Data 25,26,27,28 | Date Announced | Directly |
| State | Raw Data 25,26,27,28 | Detected State | Directly |
| Confirmed | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Recovered | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Deceased | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Total tested | Raw Data 25,26,27,28 | Total Tested | Convert cumulative to daily count |
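
The two non-trivial rules in the mapping ("Aggregated on state" and "Convert cumulative to daily count") can be sketched in plain Python as below. This is only an illustration of the logic, not the project's Hive or Spark code. The sample rows, the function names and the `status` column are all assumptions: the mapping does not say how Confirmed, Recovered and Deceased rows are told apart in the raw data, so a hypothetical status value is used here. The sketch also assumes the first cumulative value seen for a state counts in full as that day's tests.

```python
from collections import defaultdict

# Hypothetical raw rows: (date_announced, detected_state, status, num_cases).
# The `status` column is an assumption made for this sketch.
RAW_CASES = [
    ("2020-04-01", "Kerala", "Confirmed", 5),
    ("2020-04-01", "Kerala", "Confirmed", 3),
    ("2020-04-01", "Kerala", "Recovered", 2),
    ("2020-04-02", "Kerala", "Deceased", 1),
    ("2020-04-01", "Goa", "Confirmed", 4),
]

# Hypothetical cumulative test counts: (date, state, total_tested_so_far).
RAW_TESTED = [
    ("2020-04-01", "Kerala", 100),
    ("2020-04-02", "Kerala", 160),
    ("2020-04-01", "Goa", 40),
]

EMPTY_COUNTS = {"Confirmed": 0, "Recovered": 0, "Deceased": 0}


def aggregate_cases(rows):
    """Sum Num Cases per (date, state), separately for each status."""
    totals = defaultdict(lambda: dict(EMPTY_COUNTS))
    for date, state, status, num_cases in rows:
        totals[(date, state)][status] += num_cases
    return dict(totals)


def cumulative_to_daily(rows):
    """Turn per-state cumulative totals into daily counts."""
    daily = {}
    previous = {}  # last cumulative value seen for each state
    # Sort by state, then by ISO date string, so each state is walked in date order.
    for date, state, cumulative in sorted(rows, key=lambda r: (r[1], r[0])):
        daily[(date, state)] = cumulative - previous.get(state, 0)
        previous[state] = cumulative
    return daily


def build_report(case_rows, tested_rows):
    """Join both results into rows shaped like the report format:
    (Date, State, Confirmed, Recovered, Deceased, Total tested)."""
    cases = aggregate_cases(case_rows)
    tested = cumulative_to_daily(tested_rows)
    report = []
    for key in sorted(set(cases) | set(tested)):
        date, state = key
        counts = cases.get(key, EMPTY_COUNTS)
        report.append((date, state, counts["Confirmed"], counts["Recovered"],
                       counts["Deceased"], tested.get(key, 0)))
    return report


if __name__ == "__main__":
    for row in build_report(RAW_CASES, RAW_TESTED):
        print(row)
```

In Hive or Spark the same two steps would typically be a group-by sum and a per-state window over the previous day's value; the dictionary-based version above just keeps the example self-contained.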