Hello! This is Josh Dev. In this series, we created an end-to-end data project that can be a good start for a data portfolio, whether you are a beginner or an experienced data professional. The project covers the major phases of a data project: building data pipelines, visualizing and reporting data, and deriving deeper insights.
This project covers the end-to-end business intelligence cycle.
- The data engineering side covers the development of a two-step data pipeline:
  - The first step extracts data from online sources using Python and uploads the extracted files to an SFTP server
  - The second step downloads the uploaded files from the SFTP server and loads them into the data warehouse
- An API was developed that uses the data extracted from the web to support compliance requirements for AML screening
- On the data analysis side, an exploratory analysis was conducted on the car sales dataset, and the results were visualized using Power BI
- On the data science side, a simple linear regression model was built to analyze the mtcars dataset and derive relevant insights
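To give a feel for the data science part, here is a minimal sketch of a simple linear regression in plain Python. The overview does not say which mtcars variables the series regresses, so predicting `mpg` from `wt` (weight in 1000 lbs) is an assumption made here for illustration. The helper names `fit_simple_linear_regression` and `r_squared` are hypothetical, and only the first six rows of mtcars are embedded, so the fitted numbers will differ from a fit on the full dataset.

```python
from statistics import mean

# First six rows of the mtcars dataset: (model, mpg, wt in 1000 lbs).
# Assumption: mpg ~ wt is used purely for illustration.
MTCARS_SAMPLE = [
    ("Mazda RX4", 21.0, 2.620),
    ("Mazda RX4 Wag", 21.0, 2.875),
    ("Datsun 710", 22.8, 2.320),
    ("Hornet 4 Drive", 21.4, 3.215),
    ("Hornet Sportabout", 18.7, 3.440),
    ("Valiant", 18.1, 3.460),
]


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope is the sum of cross-deviations divided by the sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Return the share of variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


weights = [row[2] for row in MTCARS_SAMPLE]
mpgs = [row[1] for row in MTCARS_SAMPLE]

intercept, slope = fit_simple_linear_regression(weights, mpgs)
print(f"mpg = {intercept:.2f} + {slope:.2f} * wt")
print(f"R^2 = {r_squared(weights, mpgs, intercept, slope):.3f}")
```

The slope reads as the expected change in `mpg` for each additional 1000 lbs of weight, which is the kind of insight a simple regression is meant to surface. A library such as sklearn or statsmodels would normally replace the hand-written helpers on the full dataset.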
The series consists of the following parts:
- Introduction and project overview (recording)
- Version control and virtual environment essentials for data professionals (recording)
- Extracting data to FTP using Python (recording | project)
- Loading CSV files from FTP to PostgreSQL using SSIS (recording | project)
- Developing a screening API using FastAPI (recording | project)
- Data modeling and visualization using Power BI (recording)
- Supervised machine learning and regression analysis primer (recording)
- Creating a machine learning pipeline on a house price dataset using sklearn