Welcome to my Data Engineering final-semester portfolio repository! It is the central place where I manage and showcase the data engineering projects I complete during the semester.
In this repository, you'll find a collection of projects and assessments that demonstrate my proficiency in various aspects of data engineering. Each project is designed to challenge and build my skills, covering topics such as data pipelines, ETL processes, cloud platforms, and big data technologies.
- Project 1: postgres_docker_init - This project sets up and tests PostgreSQL infrastructure using Docker and Docker Compose. It involves creating a Dockerized PostgreSQL server, loading data from a CSV file, and writing Python scripts to interact with the database. Includes detailed README documentation.
- Project 2: py_gcs_bq - This Python project interacts with Google Cloud Storage (GCS) and BigQuery. It supports two flows: loading CSV files from a local machine into BigQuery, and fetching API data, storing it in GCS, and then loading it into BigQuery. The code is designed to be idempotent, reusable, and well-documented, with secrets managed via a .env file and constants defined in config.py. Includes detailed README documentation.
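
To illustrate what "idempotent" means for a load like the one in Project 2, here is a minimal sketch. It is not the project's actual code: it uses Python's built-in sqlite3 module as a stand-in for BigQuery so that it runs without cloud credentials, and the function name load_csv_idempotent, the table name trips, and the sample rows are all hypothetical. The idea is that each run replaces the table's contents inside one transaction, so running the load twice leaves the same rows as running it once.

```python
import csv
import io
import sqlite3


def load_csv_idempotent(conn, table, csv_text):
    """Replace the contents of `table` with the rows in `csv_text`.

    Hypothetical helper: sqlite3 stands in for the real warehouse.
    Re-running the load leaves the table in the same state.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)              # first line holds the column names
    rows = [tuple(r) for r in reader]  # remaining lines are data rows

    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)

    # One transaction: commit on success, roll back on any error.
    with conn:
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        conn.execute(f'DELETE FROM "{table}"')  # drop rows from earlier runs
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', rows
        )
    return len(rows)


# Made-up sample data, for illustration only.
SAMPLE_CSV = "id,city\n1,Lagos\n2,Abuja\n"

conn = sqlite3.connect(":memory:")
load_csv_idempotent(conn, "trips", SAMPLE_CSV)
load_csv_idempotent(conn, "trips", SAMPLE_CSV)  # second run adds no duplicates
row_count = conn.execute('SELECT COUNT(*) FROM "trips"').fetchone()[0]
print(row_count)
```

A plain append would double the row count on the second run. Replacing the contents in a single transaction is one simple way to avoid that; a real BigQuery load would more likely rely on a truncating write disposition or a merge on a key, depending on the data.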
Collaboration is key to success in data engineering. If you have any suggestions, enhancements, or additional projects to contribute, feel free to fork this repository, make your changes, and submit a pull request. Your contributions are highly appreciated!
If you have any questions, feedback, or concerns, don't hesitate to reach out:
- Instructor: Emmanuel Ogunwede
- Student: Victor Ezeh