Zomato-Data-Analysis

Food delivery app data analysis using Python, with integration of the cleaned data into a target Oracle database

Description

This project analyzes Zomato restaurant data and provides a pipeline that stores the cleaned data in an Oracle Database. The pipeline is designed to update the stored data in the database whenever the source file is modified.
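The description does not say how a modification of the source file is detected. One possible approach, shown here only as a sketch, is to keep a SHA-256 digest of the last loaded file and compare it on each run. The function names and the state file are hypothetical and are not taken from `pipeline_Auto.py`.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_changed(csv_path, state_path):
    """Return True if csv_path differs from the digest saved in state_path.

    state_path is a hypothetical small text file holding the digest of the
    last file that was loaded. It is updated whenever a change is seen.
    """
    current = file_digest(csv_path)
    state = Path(state_path)
    previous = state.read_text().strip() if state.exists() else None
    if current == previous:
        return False
    state.write_text(current)
    return True
```

A scheduler or wrapper script could call `source_changed("zomato.csv", ".zomato.sha256")` and rerun the load only when it returns `True`.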

Files in the Repository

The repository contains the following files:

zomato.csv
- This file contains the uncleaned data extracted from Zomato.
cleaning_script.py
- This script cleans and preprocesses the Zomato data.
pipeline_Auto.py
- The pipeline script that stores the cleaned data in the Oracle Database. It takes the connection details and the CSV file path as command-line arguments. Example:
```
python pipeline_Auto.py <oracle_user> <oracle_password> <oracle_host> <oracle_port> <oracle_sid> <csv_file_path>
```
runQuery.py
- This script reads a query from a .txt file and writes the result to a .csv file. Like the pipeline script, it takes its input as command-line arguments. Example:
```
python runQuery.py <oracle_user> <oracle_password> <oracle_host> <oracle_port> <oracle_sid> <query_file_path>
```

Installation

To use this project, you need the following dependencies:

Python 3.x: If you don't have Python 3.x installed, you can download it from the official Python website.
Oracle Database: Ensure you have access to an Oracle Database where you can store the cleaned data.
Python packages: You'll need to install the following Python packages:
- pandas: Used for data manipulation and analysis.
- cx_Oracle: Required for database connectivity with Oracle Database.

To install the Python packages, run the following commands:

```
# Install pandas
pip install pandas

# Install cx_Oracle
pip install cx_Oracle
```

Usage

Cleaning Script:

Run the cleaning script on the zomato.csv file to prepare the data for further analysis.
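The README does not list the cleaning steps, so the following is only a generic sketch of what a pandas cleaning pass might look like: normalising column names, trimming whitespace in text values, and dropping empty and duplicate rows. The function name and the column names in the usage note are hypothetical; the actual `cleaning_script.py` may do something different.

```python
import pandas as pd


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df (generic steps, not the project's own)."""
    cleaned = df.copy()
    # Normalise column names: trim, lower-case, spaces to underscores.
    cleaned.columns = [
        str(col).strip().lower().replace(" ", "_") for col in cleaned.columns
    ]
    # Trim surrounding whitespace in string values; leave other values alone.
    for col in cleaned.columns:
        cleaned[col] = cleaned[col].map(
            lambda value: value.strip() if isinstance(value, str) else value
        )
    # Drop rows that are entirely empty, then exact duplicates.
    cleaned = cleaned.dropna(how="all").drop_duplicates()
    return cleaned.reset_index(drop=True)
```

For example, `clean_frame(pd.read_csv("zomato.csv")).to_csv("zomato_clean.csv", index=False)` would write a cleaned copy; the output file name is an assumption.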

Pipeline Script:

Run the pipeline script, passing the Oracle database credentials and the CSV file path as command-line arguments, to store the cleaned data.

Example:

```
python pipeline_Auto.py user pass host port sid path/to/zomato.csv
```
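As an illustration of how a script with this command line could be structured, the sketch below parses the six positional arguments and bulk-inserts the CSV rows with `cx_Oracle`. It is not the code of `pipeline_Auto.py`. The table name `zomato_clean` and the helper names are hypothetical, and the sketch assumes the table already exists with columns matching the CSV header.

```python
import argparse

ARG_NAMES = (
    "oracle_user", "oracle_password", "oracle_host",
    "oracle_port", "oracle_sid", "csv_file_path",
)


def parse_args(argv=None):
    """Parse the six positional arguments shown in the README example."""
    parser = argparse.ArgumentParser(description="Load a cleaned CSV into Oracle")
    for name in ARG_NAMES:
        parser.add_argument(name)
    return parser.parse_args(argv)


def build_insert(table, columns):
    """Build an INSERT statement with positional bind variables (:1, :2, ...)."""
    placeholders = ", ".join(f":{i + 1}" for i in range(len(columns)))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def load_csv(args, table="zomato_clean"):
    """Insert every row of the CSV into `table` (hypothetical table name)."""
    # Imported here so the helpers above work without the packages installed.
    import cx_Oracle
    import pandas as pd

    df = pd.read_csv(args.csv_file_path)
    # Oracle expects None, not NaN, for missing values.
    df = df.astype(object).where(df.notna(), None)
    rows = list(df.itertuples(index=False, name=None))

    dsn = cx_Oracle.makedsn(args.oracle_host, int(args.oracle_port), sid=args.oracle_sid)
    conn = cx_Oracle.connect(user=args.oracle_user, password=args.oracle_password, dsn=dsn)
    try:
        cursor = conn.cursor()
        cursor.executemany(build_insert(table, list(df.columns)), rows)
        conn.commit()
    finally:
        conn.close()
```

Calling `load_csv(parse_args())` from the script's entry point would run the load. Bind variables are used for the values, but the table and column names are interpolated into the statement, so they should come only from trusted input.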

Query Execution:

Use runQuery.py to execute queries on the Oracle database. Provide the Oracle database credentials and the query file path; the result is written to a CSV file.

Example:

```
python runQuery.py user pass host port sid path/to/query.txt
```
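The read-query, execute, write-CSV flow can be sketched as follows. This is an assumption about the general shape of such a script, not the contents of `runQuery.py`; the function names and the default output path `query_result.csv` are hypothetical.

```python
import csv
from pathlib import Path


def read_query(path):
    """Read SQL from a text file and drop a trailing semicolon.

    Oracle rejects a statement passed to cursor.execute() if it ends in ';'.
    """
    return Path(path).read_text(encoding="utf-8").strip().rstrip(";").strip()


def write_result(headers, rows, out_path):
    """Write a header row followed by the result rows to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(headers)
        writer.writerows(rows)


def run_query(user, password, host, port, sid, query_path,
              out_path="query_result.csv"):
    """Execute the query in query_path and save the result to out_path."""
    # Imported here so the helpers above work without cx_Oracle installed.
    import cx_Oracle

    dsn = cx_Oracle.makedsn(host, int(port), sid=sid)
    conn = cx_Oracle.connect(user=user, password=password, dsn=dsn)
    try:
        cursor = conn.cursor()
        cursor.execute(read_query(query_path))
        # cursor.description holds one tuple per column; item 0 is the name.
        headers = [column[0] for column in cursor.description]
        write_result(headers, cursor.fetchall(), out_path)
    finally:
        conn.close()
```

`fetchall()` loads the whole result into memory, which is fine for small queries; for large results, iterating over the cursor and writing rows in batches would be a safer choice.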

Contributing

If you would like to contribute to this project, please follow these steps:

1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Make changes and commit them.
4. Push to your fork and submit a pull request.
