OPENCLASSROOMS - Data Scientist - Project 8
This repository contains the files for a Big Data project that featurizes images with a MobileNetV2 model, using PySpark on Google Cloud Platform (GCP).
The dataset used for this project is Fruits 360, which contains tens of thousands of fruit images (100x100 pixels each).
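As a rough illustration of the featurization step, here is a minimal sketch. It is not the code from the notebooks or from pyspark_script.py. It assumes Spark 3.x (for the `binaryFile` source and iterator-style pandas UDFs), a folder-per-class layout where the parent folder name is the fruit label, and ImageNet weights with a 224x224 input size. The function names (`label_from_path`, `build_featurizer`, `featurize`, `run`) and the `image_dir` argument are hypothetical.

```python
import posixpath


def label_from_path(path: str) -> str:
    """Return the parent folder name, assumed here to be the fruit label."""
    return posixpath.basename(posixpath.dirname(path))


def build_featurizer():
    """Build MobileNetV2 without its classification head (assumed setup)."""
    # Heavy imports are kept inside functions so they are resolved on the
    # Spark workers, and so the path helper works without TensorFlow.
    from tensorflow.keras.applications import MobileNetV2

    # include_top=False drops the ImageNet classifier; pooling="avg" gives
    # one 1280-dimensional feature vector per image.
    return MobileNetV2(weights="imagenet", include_top=False,
                       pooling="avg", input_shape=(224, 224, 3))


def featurize(model, image_bytes_list):
    """Decode raw image bytes, resize to 224x224, and return feature vectors."""
    import tensorflow as tf
    from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

    images = [
        tf.image.resize(
            tf.io.decode_image(b, channels=3, expand_animations=False),
            (224, 224),
        )
        for b in image_bytes_list
    ]
    # preprocess_input rescales pixel values to the [-1, 1] range.
    batch = preprocess_input(tf.stack(images))
    return model.predict(batch).tolist()


def run(spark, image_dir):
    """Read images under image_dir and return a DataFrame of (path, label, features)."""
    from typing import Iterator

    import pandas as pd
    from pyspark.sql.functions import col, pandas_udf, udf

    @pandas_udf("array<float>")
    def featurize_udf(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
        model = build_featurizer()  # loaded once per task, not once per batch
        for content in batches:
            yield pd.Series(featurize(model, list(content)))

    images = (
        spark.read.format("binaryFile")
        .option("pathGlobFilter", "*.jpg")
        .option("recursiveFileLookup", "true")
        .load(image_dir)
    )
    return images.select(
        col("path"),
        udf(label_from_path, "string")(col("path")).alias("label"),
        featurize_udf(col("content")).alias("features"),
    )
```

With an active `SparkSession` named `spark`, calling `run(spark, image_dir)` with a local folder or a Cloud Storage path would return the features DataFrame, which could then be written out, for example as Parquet.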
- notebook_local.ipynb: Image processing using PySpark (local instance)
- notebook_cloud.ipynb: Image processing using PySpark (notebook uploaded to GCP)
- pyspark_script.py: Image processing using PySpark (script run on GCP)
- projet8_presentation.pdf: Final presentation of the project
- Python 3.x
- Jupyter Notebook
- NumPy
- Pandas
- TensorFlow
- PySpark
- GCP: Cloud Storage, Dataproc