This project focuses on harvesting data from YouTube using the YouTube API. The collected data is processed into a structured format, stored in a MySQL database, and visualized using a Streamlit application. The application provides an interactive interface to explore and analyze the data.
- YouTube API Integration: Extract detailed video and channel data directly from YouTube.
- Data Processing: Use `pandas` to clean, structure, and format data into a DataFrame for analysis.
- Database Management: Store the processed data in a MySQL database for efficient retrieval and storage.
- Interactive Visualization: Build a Streamlit app to visualize key insights and trends in the data.
- Programming Language: Python
- Libraries:
  - `pandas` for data manipulation and analysis
  - `googleapiclient` for interacting with the YouTube API
  - `sqlalchemy` or `mysql-connector` for database operations
  - `streamlit` for app development
- Database: MySQL
- Data Extraction:
  - Use the YouTube API to fetch video and channel details.
  - Extract data points such as title, views, likes, comments, and channel statistics.
- Data Processing:
  - Clean and organize the extracted data into a DataFrame.
  - Perform the transformations needed for database storage and visualization.
- Data Storage:
  - Insert the processed data into a MySQL database.
  - Maintain an efficient database structure for querying.
- Data Visualization:
  - Create an interactive Streamlit dashboard.
  - Display metrics such as most-viewed videos, channel performance, and trends.
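
The extraction and processing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`fetch_video_items`, `flatten_video_item`, `most_viewed`) and the output column names are hypothetical. The response fields (`snippet.title`, `statistics.viewCount`, and so on) follow the YouTube Data API v3 `videos.list` response, which returns counts as strings.

```python
from typing import Any, Dict, List


def fetch_video_items(api_key: str, video_ids: List[str]) -> List[Dict[str, Any]]:
    """Fetch raw video items from the YouTube Data API (makes a network call)."""
    # Imported lazily so the pure-Python helpers below work without the
    # third-party google-api-python-client package installed.
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=api_key)
    response = youtube.videos().list(
        part="snippet,statistics",
        id=",".join(video_ids),
    ).execute()
    return response.get("items", [])


def flatten_video_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one `videos.list` item into a flat row (hypothetical column names)."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    return {
        "video_id": item["id"],
        "channel_id": snippet.get("channelId"),
        "title": snippet.get("title"),
        "published_at": snippet.get("publishedAt"),
        # Counts arrive as strings, and a count can be absent (for example
        # when likes are hidden), so missing values default to 0 here.
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
        "comments": int(stats.get("commentCount", 0)),
    }


def most_viewed(rows: List[Dict[str, Any]], n: int = 5) -> List[Dict[str, Any]]:
    """Return the n rows with the highest view counts, highest first."""
    return sorted(rows, key=lambda row: row["views"], reverse=True)[:n]
```

A list of such rows can be passed to `pd.DataFrame(rows)` for further cleaning. From there, `df.to_sql("videos", engine, if_exists="append", index=False)` with a SQLAlchemy engine is one way to write to MySQL; the table name `videos` is an assumption.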
- Add advanced filtering and search capabilities in the Streamlit app.
- Integrate additional APIs for enriched data insights.
- Implement user authentication for secure app access.