diff --git a/README.md b/README.md
index ec0df2c..192a0fa 100644
--- a/README.md
+++ b/README.md
@@ -2,39 +2,68 @@
## Overview
-This git repository contains code and configurations for implementing a Convolutional Neural Network to classify images containing cats or dogs. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper. Docker containers were used to deploy the application on an EC2 spot instances in order to scale up hardware and computation power.
+This git repository contains code and configurations for implementing a Convolutional Neural Network to classify images containing cats or dogs. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper.
-## Repo Contents
+Two models were trained to classify the images: an AlexNet8 model via Keras and a VGG16 model via PyTorch.
-* The __aws__ subdirectory contains batch and shell scripts for configuring ec2 spot instances and the deploying docker container remotely.
-* The __conda__ subdirectory contains batch and shell scripts for creating a local conda environment for the project.
-* The __data_prep__ subdirectory contains python utility scripts to data cleansing and processing for modelling.
-* The __kaggle__ subdirectory contains python scripts for downloading and unzipping competition data from Kaggle.
-* The __model__ subdirectory contains python scripts for initiating and training CNN models.
-* The __ref__ subdirectory contains previous analysis and kernals on dogs vs cats classification from Kaggle community members.
-* The __report__ subdirectory contains reportable images and plots generated by the application.
-* The __webscrapers__ subdirectory contains webscraping tools for downloading cats and dogs images from [freeimages.com](https://www.freeimages.com/).
+Docker containers were used to deploy the application on EC2 spot instances in order to scale up hardware and computation power.
-## Application Scripts
+![Workflow](doc/catclassifier.jpg)
-The main dog and cat image classification application is contained within the root scripts:
+## Analysis Results
-* The __01_prg_kaggle_data.py__ script downloads / unzips the cat vs dogs competition data.
-* The __02_prg_scrape_imgs.py__ script scrapes additional cat and dog images from [freeimages.com](https://www.freeimages.com/).
-* The __03_prg_keras_model.py__ script trains, fits and makes image predictions of the cat and dog images using a CNN model.
-* The __analysis_results.ipynb__ file contains a high level summary aof the analysis results.
-* The __cons.py__ script contains programme constants and configurations.
-* The __Dockerfile__ builds the application container for deployment on ec2.
-* The __exeDocker.bat__ executes the Docker build process locally on windows.
-* The __requirements.txt__ file contains the python package dependencies for the application.
+The images were further augmented using rotations, scaling, zooming, flipping and shearing prior to the model training phase.
-## Analysis Results
+![Generator Plot](report/torch/generator_plot.jpg)
+
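+As an illustration only, an augmentation pipeline of this kind could be written with torchvision transforms; the parameter values and image size below are placeholders rather than the project's actual settings.
+
+```
+from torchvision import transforms
+
+# illustrative augmentation pipeline; values are placeholders, not the project's settings
+train_transforms = transforms.Compose([
+    transforms.Resize((128, 128)),                        # resize to a uniform dimension
+    transforms.RandomRotation(degrees=15),                # random rotations
+    transforms.RandomResizedCrop(128, scale=(0.8, 1.0)),  # scaling / zooming
+    transforms.RandomHorizontalFlip(),                    # horizontal flips
+    transforms.RandomAffine(degrees=0, shear=10),         # shearing
+    transforms.ToTensor(),
+    transforms.Normalize(mean=[0.485, 0.456, 0.406],      # ImageNet channel statistics
+                         std=[0.229, 0.224, 0.225]),
+])
+```
+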
+Models were trained across 10 to 25 epochs using stochastic gradient descent and cross-entropy loss. Learning rate reduction on plateau and early stopping were implemented as part of the training procedure.
+
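+A condensed sketch of that training procedure in PyTorch is shown below; `model`, `train_loader` and `valid_loader` are assumed to exist, and the hyperparameter values are illustrative rather than the project's actual configuration.
+
+```
+import torch
+import torch.nn as nn
+
+criterion = nn.CrossEntropyLoss()
+optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
+scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=2)
+
+best_loss, patience, stale_epochs = float("inf"), 5, 0
+for epoch in range(25):
+    # training pass
+    model.train()
+    for images, labels in train_loader:
+        optimizer.zero_grad()
+        loss = criterion(model(images), labels)
+        loss.backward()
+        optimizer.step()
+    # validation pass drives the learning rate schedule and early stopping
+    model.eval()
+    with torch.no_grad():
+        valid_loss = sum(criterion(model(x), y).item() for x, y in valid_loader) / len(valid_loader)
+    scheduler.step(valid_loss)        # learning rate reduction on plateau
+    if valid_loss < best_loss:
+        best_loss, stale_epochs = valid_loss, 0
+    else:
+        stale_epochs += 1
+        if stale_epochs >= patience:  # early stopping
+            break
+```
+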
+![Predicted Images](report/torch/pred_images.jpg)
+
+See the analysis results notebook for further details on the analysis, including the CNN architectures and model performance; a minimal model-loading sketch follows the link below.
+
+* https://nbviewer.org/github/oislen/CatClassifier/blob/main/report/torch_analysis_results.ipynb
+
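+The trained Torch model can also be reloaded and inspected directly; the snippet below mirrors the loading cell from the torch analysis notebook and assumes it is run from the `report` subdirectory.
+
+```
+import sys
+
+sys.path.append("../model")
+import cons
+
+import torch
+from model.torch.VGG16_pretrained import VGG16_pretrained
+
+# device configuration
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+# load the trained torch model and print its architecture summary
+model = VGG16_pretrained(num_classes=2).to(device)
+model.load(input_fpath=cons.torch_model_pt_fpath)
+print(model)
+```
+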
+## Running the Application (Windows)
+
+### Anaconda
+
+Create a local conda environment for the Cat Classifier app using [anaconda](https://www.anaconda.com/):
+
+```
+conda create --name CatClassifier python=3.12 --yes
+conda activate CatClassifier
+pip install -r requirements.txt
+```
+
+Execute the webscrapers and the model training pipeline within the local conda environment using the following commands:
+
+```
+:: run webscrapers
+python webscrapers/prg_scrape_imgs.py --run_download_comp_data --run_webscraper
+:: run model training pipeline
+python model/prg_torch_model.py --run_model_training --run_testset_prediction
+```
+
+The model training and evaluation report can be opened with:
+
+```
+jupyter lab --ip=0.0.0.0 --allow-root "report/torch_analysis_results.ipynb"
+```
+
+### Docker
+
+The latest version of the Cat Classifier app is available as a [docker](https://www.docker.com/) image on Docker Hub here:
+
+* https://hub.docker.com/repository/docker/oislen/cat-classifier
-See the analysis results notebook for a summary of the project; including image processing, CNN architecture and model performance.
-* https://nbviewer.org/github/oislen/CatClassifier/blob/main/notebooks/torch_analysis_results.ipynb
+The image can be pulled from Docker Hub using the following command:
-## Docker Container
+```
+docker pull oislen/cat-classifier:latest
+```
-The application docker container is available on dockerhub here:
+The Cat Classifier app can then be started in a Jupyter Lab session by running the Docker image with the following command:
-https://hub.docker.com/repository/docker/oislen/cat-classifier
+```
+docker run --name cc --shm-size=512m --publish 8888:8888 -it oislen/cat-classifier:latest
+```
\ No newline at end of file
diff --git a/doc/catclassifier.drawio b/doc/catclassifier.drawio
new file mode 100644
index 0000000..85f692f
--- /dev/null
+++ b/doc/catclassifier.drawio
@@ -0,0 +1,94 @@
+
Model: \"AlexNet8\"\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1mModel: \"AlexNet8\"\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n", + "┃ Layer (type) ┃ Output Shape ┃ Param # ┃\n", + "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n", + "│ input_layer (InputLayer) │ (None, 128, 128, 3) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d (Conv2D) │ (None, 30, 30, 96) │ 34,944 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d (MaxPooling2D) │ (None, 14, 14, 96) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_1 (Conv2D) │ (None, 14, 14, 256) │ 614,656 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d_1 (MaxPooling2D) │ (None, 6, 6, 256) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_2 (Conv2D) │ (None, 6, 6, 384) │ 885,120 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_3 (Conv2D) │ (None, 6, 6, 384) │ 1,327,488 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_4 (Conv2D) │ (None, 6, 6, 256) │ 884,992 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d_2 (MaxPooling2D) │ (None, 2, 2, 256) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ flatten (Flatten) │ (None, 1024) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense (Dense) │ (None, 4096) │ 4,198,400 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dropout (Dropout) │ (None, 4096) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense_1 (Dense) │ (None, 4096) │ 16,781,312 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dropout_1 (Dropout) │ (None, 4096) │ 0 │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense_2 (Dense) │ (None, 2) │ 8,194 │\n", + "└─────────────────────────────────┴────────────────────────┴───────────────┘\n", + "\n" + ], + "text/plain": [ + "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n", + "┃\u001b[1m \u001b[0m\u001b[1mLayer (type) \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m Param #\u001b[0m\u001b[1m \u001b[0m┃\n", + "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n", + "│ input_layer (\u001b[38;5;33mInputLayer\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m128\u001b[0m, \u001b[38;5;34m128\u001b[0m, \u001b[38;5;34m3\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d (\u001b[38;5;33mConv2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m30\u001b[0m, \u001b[38;5;34m30\u001b[0m, \u001b[38;5;34m96\u001b[0m) │ \u001b[38;5;34m34,944\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d (\u001b[38;5;33mMaxPooling2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m14\u001b[0m, \u001b[38;5;34m14\u001b[0m, \u001b[38;5;34m96\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + 
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_1 (\u001b[38;5;33mConv2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m14\u001b[0m, \u001b[38;5;34m14\u001b[0m, \u001b[38;5;34m256\u001b[0m) │ \u001b[38;5;34m614,656\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d_1 (\u001b[38;5;33mMaxPooling2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m256\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_2 (\u001b[38;5;33mConv2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m384\u001b[0m) │ \u001b[38;5;34m885,120\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_3 (\u001b[38;5;33mConv2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m384\u001b[0m) │ \u001b[38;5;34m1,327,488\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ conv2d_4 (\u001b[38;5;33mConv2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m6\u001b[0m, \u001b[38;5;34m256\u001b[0m) │ \u001b[38;5;34m884,992\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ max_pooling2d_2 (\u001b[38;5;33mMaxPooling2D\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m2\u001b[0m, \u001b[38;5;34m2\u001b[0m, \u001b[38;5;34m256\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ flatten (\u001b[38;5;33mFlatten\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m1024\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m4096\u001b[0m) │ \u001b[38;5;34m4,198,400\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dropout (\u001b[38;5;33mDropout\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m4096\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense_1 (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m4096\u001b[0m) │ \u001b[38;5;34m16,781,312\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dropout_1 (\u001b[38;5;33mDropout\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m4096\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n", + "├─────────────────────────────────┼────────────────────────┼───────────────┤\n", + "│ dense_2 (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m2\u001b[0m) │ \u001b[38;5;34m8,194\u001b[0m │\n", + "└─────────────────────────────────┴────────────────────────┴───────────────┘\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
Total params: 24,735,108 (94.36 MB)\n", + "\n" + ], + "text/plain": [ + "\u001b[1m Total params: \u001b[0m\u001b[38;5;34m24,735,108\u001b[0m (94.36 MB)\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
Trainable params: 24,735,106 (94.36 MB)\n", + "\n" + ], + "text/plain": [ + "\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m24,735,106\u001b[0m (94.36 MB)\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
Non-trainable params: 0 (0.00 B)\n", + "\n" + ], + "text/plain": [ + "\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
Optimizer params: 2 (12.00 B)\n", + "\n" + ], + "text/plain": [ + "\u001b[1m Optimizer params: \u001b[0m\u001b[38;5;34m2\u001b[0m (12.00 B)\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# load trained keras model\n", + "model = keras.models.load_model(cons.keras_model_pickle_fpath)\n", + "# print model summary\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "id": "b81cdf44-b643-400e-8f14-556384ba9ad0", + "metadata": {}, + "source": [ + "## Model Performance" + ] + }, + { + "cell_type": "markdown", + "id": "f771965b-b74f-4363-aa95-242038dcd235", + "metadata": {}, + "source": [ + "The model was trained across 25 epochs. Learning rate reduction on plateau and early stopping were implemented as part of training procedure.The model accuracy and loss are plotted below across the training and validation sets." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "52070b1f-4162-4bb8-a082-534a7b858335", + "metadata": {}, + "source": [ + "![Model Accuaracy](../report/keras/model_accuracy.png)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "c9d58fe1-fce7-4754-a35a-10a7325b7bdd", + "metadata": {}, + "source": [ + "![Model Loss](../report/keras/model_loss.png)" + ] + }, + { + "cell_type": "markdown", + "id": "990e17f5-70dc-4542-8c24-83c52bcc3859", + "metadata": {}, + "source": [ + "## Model Image Predictions" + ] + }, + { + "cell_type": "markdown", + "id": "f307af91-0b9f-486f-bc8d-f8acb3845949", + "metadata": {}, + "source": [ + "The model predictions were made for the Kaggle test set, see below example model predictions." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "d8856811-a69b-4f4c-a2b6-9d528fcfb75e", + "metadata": {}, + "source": [ + "![Predicted Images](../report/keras/pred_images.jpg)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "catclass", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/report/qmarkdown/keras_analysis_results.qmd b/report/keras_analysis_results.qmd similarity index 57% rename from report/qmarkdown/keras_analysis_results.qmd rename to report/keras_analysis_results.qmd index 29aa3b2..4fbae08 100644 --- a/report/qmarkdown/keras_analysis_results.qmd +++ b/report/keras_analysis_results.qmd @@ -1,5 +1,5 @@ --- -title: "Analysis Results" +title: "Keras Analysis Results" format: html: toc: true @@ -14,6 +14,11 @@ jupyter: python3 # Cats vs Dogs Image Classification ```{python} +import sys + +sys.path.append("../model") +import cons + from tensorflow import keras ``` @@ -21,35 +26,37 @@ This project aims to create a model to classify cat and dog images. The data was ## Example Image -![Random Image](report/keras/random_image.jpg) +![Random Image](keras/random_image.jpg) ## Data Processing The images were further processed using rotations, scaling, zooming, flipping and shearing prior to the modelling training phase. See example image processing below. -![Generator Plot](report/keras/generator_plot.jpg) +![Generator Plot](keras/generator_plot.jpg) + +## AlexNet8 Model Architecture -## AlexNet8 Model Archecture +An AlexNet CNN model with 8 layers was trained using the processed images via Keras. 
See AlexNet diagram below, as well as Keras model summary. Stochastic gradient descent was implemented to optimize the training criterion function cross entropy loss. -An AlexNet CNN model with 8 layers was trained using the processed images via Keras. See AlexNet diagram below, as well as keras model summary. +![AlexNet Architecture](keras/AlexNet8_architecture.png) ```{python} # load trained keras model -model = keras.models.load_model('data/keras_model.h5') +model = keras.models.load_model(cons.keras_model_pickle_fpath) # print model summary model.summary() ``` ## Model Performance -The model was trained across 25 epochs. The model accuracy and loss are plotted below across the training and validation sets. +The model was trained across 25 epochs. Learning rate reduction on plateau and early stopping were implemented as part of training procedure.The model accuracy and loss are plotted below across the training and validation sets. -![Model Accuaracy](report/keras/model_accuracy.png) +![Model Accuracy](keras/model_accuracy.png) -![Model Loss](report/keras/model_loss.png) +![Model Loss](keras/model_loss.png) ## Model Image Predictions The model predictions were made for the Kaggle test set, see below example model predictions. -![Predicted Images](report/keras/pred_images.jpg) +![Predicted Images](keras/pred_images.jpg) diff --git a/report/notebooks/keras_analysis_results.ipynb b/report/notebooks/keras_analysis_results.ipynb deleted file mode 100644 index 545712d..0000000 --- a/report/notebooks/keras_analysis_results.ipynb +++ /dev/null @@ -1,244 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a9219296-311e-4135-8a3a-8c4320624286", - "metadata": {}, - "source": [ - "# Cats vs Dogs Image Classification" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "6424bfa9-adfd-48bc-a203-b6242008dd3a", - "metadata": {}, - "outputs": [], - "source": [ - "from tensorflow import keras" - ] - }, - { - "cell_type": "markdown", - "id": "05783f63-8fb2-47d0-b88a-8fc91842bd90", - "metadata": {}, - "source": [ - "This project aims to create a model to classify cat and dog images. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper. Docker containers were used to deploy the application on an EC2 spot instances in order to scale up hardware and computation power. " - ] - }, - { - "cell_type": "markdown", - "id": "1c739375-4438-4ee1-8889-3bec76b070a7", - "metadata": {}, - "source": [ - "## Example Image" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "e9d44483-13ac-4013-8d3e-1573872f005a", - "metadata": {}, - "source": [ - "![Random Image](../report/keras/random_image.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "93a586a0-abe8-4e87-8125-3a0761ecac49", - "metadata": {}, - "source": [ - "## Data Processing" - ] - }, - { - "cell_type": "markdown", - "id": "cb0ec205-434c-4fe5-a4f0-26f811f25761", - "metadata": {}, - "source": [ - "The images were further processed using rotations, scaling, zooming, flipping and shearing prior to the modelling training phase. See example image processing below. 
" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "7d53af71-c2ee-4ed1-8335-5565dce7951a", - "metadata": {}, - "source": [ - "![Generator Plot](../report/keras/generator_plot.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "9ac34849-f97f-4086-99ca-c8ab2ba5d77e", - "metadata": {}, - "source": [ - "## AlexNet8 Model Archecture" - ] - }, - { - "cell_type": "markdown", - "id": "a7b1dcff-9319-4973-9659-fda87cb85481", - "metadata": {}, - "source": [ - "An AlexNet CNN model with 8 layers was trained using the processed images via Keras. See AlexNet diagram below, as well as keras model summary." - ] - }, - { - "cell_type": "markdown", - "id": "f65e6c9c-c63f-41c2-90f0-2c1321881b81", - "metadata": {}, - "source": [ - "![AlexNet Architecture](../report/keras/AlexNet8_archecture.png)" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "6f790b84-1255-42fb-b07c-37065ab49c4d", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model: \"AlexNet8\"\n", - "_________________________________________________________________\n", - " Layer (type) Output Shape Param # \n", - "=================================================================\n", - " input_1 (InputLayer) [(None, 128, 128, 3)] 0 \n", - " \n", - " conv2d (Conv2D) (None, 30, 30, 96) 34944 \n", - " \n", - " max_pooling2d (MaxPooling2D (None, 14, 14, 96) 0 \n", - " ) \n", - " \n", - " conv2d_1 (Conv2D) (None, 14, 14, 256) 614656 \n", - " \n", - " max_pooling2d_1 (MaxPooling (None, 6, 6, 256) 0 \n", - " 2D) \n", - " \n", - " conv2d_2 (Conv2D) (None, 6, 6, 384) 885120 \n", - " \n", - " conv2d_3 (Conv2D) (None, 6, 6, 384) 1327488 \n", - " \n", - " conv2d_4 (Conv2D) (None, 6, 6, 256) 884992 \n", - " \n", - " max_pooling2d_2 (MaxPooling (None, 2, 2, 256) 0 \n", - " 2D) \n", - " \n", - " flatten (Flatten) (None, 1024) 0 \n", - " \n", - " dense (Dense) (None, 4096) 4198400 \n", - " \n", - " dropout (Dropout) (None, 4096) 0 \n", - " \n", - " dense_1 (Dense) (None, 4096) 16781312 \n", - " \n", - " dropout_1 (Dropout) (None, 4096) 0 \n", - " \n", - " dense_2 (Dense) (None, 2) 8194 \n", - " \n", - "=================================================================\n", - "Total params: 24,735,106\n", - "Trainable params: 24,735,106\n", - "Non-trainable params: 0\n", - "_________________________________________________________________\n" - ] - } - ], - "source": [ - "# load trained keras model\n", - "model = keras.models.load_model('../data/keras_model.h5')\n", - "# print model summary\n", - "model.summary()" - ] - }, - { - "cell_type": "markdown", - "id": "b81cdf44-b643-400e-8f14-556384ba9ad0", - "metadata": {}, - "source": [ - "## Model Performance" - ] - }, - { - "cell_type": "markdown", - "id": "f771965b-b74f-4363-aa95-242038dcd235", - "metadata": {}, - "source": [ - "The model was trained across 25 epochs. The model accuracy and loss are plotted below across the training and validation sets." 
- ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "52070b1f-4162-4bb8-a082-534a7b858335", - "metadata": {}, - "source": [ - "![Model Accuaracy](../report/keras/model_accuracy.png)" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "c9d58fe1-fce7-4754-a35a-10a7325b7bdd", - "metadata": {}, - "source": [ - "![Model Loss](../report/keras/model_loss.png)" - ] - }, - { - "cell_type": "markdown", - "id": "990e17f5-70dc-4542-8c24-83c52bcc3859", - "metadata": {}, - "source": [ - "## Model Image Predictions" - ] - }, - { - "cell_type": "markdown", - "id": "f307af91-0b9f-486f-bc8d-f8acb3845949", - "metadata": {}, - "source": [ - "The model predictions were made for the Kaggle test set, see below example model predictions." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "d8856811-a69b-4f4c-a2b6-9d528fcfb75e", - "metadata": {}, - "source": [ - "![Predicted Images](../report/keras/pred_images.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "ee25817b", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.3" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/report/notebooks/torch_analysis_results.ipynb b/report/notebooks/torch_analysis_results.ipynb deleted file mode 100644 index 196438b..0000000 --- a/report/notebooks/torch_analysis_results.ipynb +++ /dev/null @@ -1,260 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a9219296-311e-4135-8a3a-8c4320624286", - "metadata": {}, - "source": [ - "# Cats vs Dogs Image Classification" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "6424bfa9-adfd-48bc-a203-b6242008dd3a", - "metadata": {}, - "outputs": [], - "source": [ - "import sys\n", - "sys.path.append('..')\n", - "import cons\n", - "\n", - "import torch\n", - "from model.torch.VGG16_pretrained import VGG16_pretrained" - ] - }, - { - "cell_type": "markdown", - "id": "05783f63-8fb2-47d0-b88a-8fc91842bd90", - "metadata": {}, - "source": [ - "This project aims to create a model to classify cat and dog images. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper. Docker containers were used to deploy the application on an EC2 spot instances in order to scale up hardware and computation power. " - ] - }, - { - "cell_type": "markdown", - "id": "1c739375-4438-4ee1-8889-3bec76b070a7", - "metadata": {}, - "source": [ - "## Example Image" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "e9d44483-13ac-4013-8d3e-1573872f005a", - "metadata": {}, - "source": [ - "![Random Image](../report/torch/random_image.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "93a586a0-abe8-4e87-8125-3a0761ecac49", - "metadata": {}, - "source": [ - "## Data Processing" - ] - }, - { - "cell_type": "markdown", - "id": "cb0ec205-434c-4fe5-a4f0-26f811f25761", - "metadata": {}, - "source": [ - "The images were resized to a uniform dimension prior to the modelling training phase. See example image processing below. 
" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "7d53af71-c2ee-4ed1-8335-5565dce7951a", - "metadata": {}, - "source": [ - "![Generator Plot](../report/torch/generator_plot.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "9ac34849-f97f-4086-99ca-c8ab2ba5d77e", - "metadata": {}, - "source": [ - "## VGG16 Model Archecture" - ] - }, - { - "cell_type": "markdown", - "id": "a7b1dcff-9319-4973-9659-fda87cb85481", - "metadata": {}, - "source": [ - "A pretrained VGG CNN model with 16 layers was trained using the processed images via pytorch. See VGG16 diagram below, as well as torch model summary." - ] - }, - { - "cell_type": "markdown", - "id": "f65e6c9c-c63f-41c2-90f0-2c1321881b81", - "metadata": {}, - "source": [ - "![AlexNet Architecture](../report/torch/VGG16_archecture.png)" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "6f790b84-1255-42fb-b07c-37065ab49c4d", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "VGG16_pretrained(\n", - " (resnet): VGG(\n", - " (features): Sequential(\n", - " (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (1): ReLU(inplace=True)\n", - " (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (3): ReLU(inplace=True)\n", - " (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", - " (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (6): ReLU(inplace=True)\n", - " (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (8): ReLU(inplace=True)\n", - " (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", - " (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (11): ReLU(inplace=True)\n", - " (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (13): ReLU(inplace=True)\n", - " (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (15): ReLU(inplace=True)\n", - " (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", - " (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (18): ReLU(inplace=True)\n", - " (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (20): ReLU(inplace=True)\n", - " (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (22): ReLU(inplace=True)\n", - " (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", - " (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (25): ReLU(inplace=True)\n", - " (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (27): ReLU(inplace=True)\n", - " (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", - " (29): ReLU(inplace=True)\n", - " (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", - " )\n", - " (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))\n", - " (classifier): Sequential(\n", - " (0): Linear(in_features=25088, out_features=4096, bias=True)\n", - " (1): ReLU(inplace=True)\n", - " (2): Dropout(p=0.5, inplace=False)\n", - " (3): Linear(in_features=4096, out_features=4096, bias=True)\n", - " (4): ReLU(inplace=True)\n", - " (5): Dropout(p=0.5, inplace=False)\n", - " (6): Linear(in_features=4096, out_features=1000, bias=True)\n", - " )\n", - " )\n", - " (classifier): Sequential(\n", - " (0): 
Linear(in_features=1000, out_features=2, bias=True)\n", - " )\n", - ")\n" - ] - } - ], - "source": [ - "# device configuration\n", - "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", - "# load trained torch model\n", - "model = VGG16_pretrained(num_classes=2).to(device)\n", - "model.load(input_fpath=cons.torch_model_pt_fpath)\n", - "# print model summary\n", - "print(model)" - ] - }, - { - "cell_type": "markdown", - "id": "b81cdf44-b643-400e-8f14-556384ba9ad0", - "metadata": {}, - "source": [ - "## Model Performance" - ] - }, - { - "cell_type": "markdown", - "id": "f771965b-b74f-4363-aa95-242038dcd235", - "metadata": {}, - "source": [ - "The model was trained across 4 epochs. The model accuracy and loss are plotted below across the training and validation sets." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "52070b1f-4162-4bb8-a082-534a7b858335", - "metadata": {}, - "source": [ - "![Model Accuaracy](../report/torch/model_accuracy.png)" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "c9d58fe1-fce7-4754-a35a-10a7325b7bdd", - "metadata": {}, - "source": [ - "![Model Loss](../report/torch/model_loss.png)" - ] - }, - { - "cell_type": "markdown", - "id": "990e17f5-70dc-4542-8c24-83c52bcc3859", - "metadata": {}, - "source": [ - "## Model Image Predictions" - ] - }, - { - "cell_type": "markdown", - "id": "f307af91-0b9f-486f-bc8d-f8acb3845949", - "metadata": {}, - "source": [ - "The model predictions were made for the Kaggle test set, see below example model predictions." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "d8856811-a69b-4f4c-a2b6-9d528fcfb75e", - "metadata": {}, - "source": [ - "![Predicted Images](../report/torch/pred_images.jpg)" - ] - }, - { - "cell_type": "markdown", - "id": "ee25817b", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "catclass", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/report/torch/VGG16_archecture.png b/report/torch/VGG16_architecture.png similarity index 100% rename from report/torch/VGG16_archecture.png rename to report/torch/VGG16_architecture.png diff --git a/report/torch_analysis_results.ipynb b/report/torch_analysis_results.ipynb new file mode 100644 index 0000000..fca9ada --- /dev/null +++ b/report/torch_analysis_results.ipynb @@ -0,0 +1,172 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "3dd7e904", + "metadata": {}, + "source": [ + "# Cats vs Dogs Image Classification\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "9e032473", + "metadata": {}, + "outputs": [], + "source": [ + "import sys\n", + "\n", + "sys.path.append(\"../model\")\n", + "import cons\n", + "\n", + "import torch\n", + "from model.torch.VGG16_pretrained import VGG16_pretrained" + ] + }, + { + "cell_type": "markdown", + "id": "87a94848", + "metadata": {}, + "source": [ + "This project aims to create a model to classify cat and dog images. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper. 
Docker containers were used to deploy the application on an EC2 spot instances in order to scale up hardware and computation power. \n", + "\n", + "## Example Image\n", + "\n", + "![Random Image](torch/random_image.jpg)\n", + "\n", + "## Data Processing\n", + "\n", + "The images were resized to a uniform dimension and the colour channels normalised prior to the modelling training phase. See example image processing below. \n", + "\n", + "![Generator Plot](torch/generator_plot.jpg)\n", + "\n", + "## VGG16 Model Architecture\n", + "\n", + "A pre-trained VGG CNN model with 16 layers was trained using the processed images via PyTorch. See VGG16 diagram below, as well as torch model summary. Stochastic gradient descent was implemented to optimize the training criterion function cross entropy loss.\n", + "\n", + "![AlexNet Architecture](torch/VGG16_architecture.png)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "504c7d94", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "VGG16_pretrained(\n", + " (resnet): VGG(\n", + " (features): Sequential(\n", + " (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (1): ReLU(inplace=True)\n", + " (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (3): ReLU(inplace=True)\n", + " (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", + " (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (6): ReLU(inplace=True)\n", + " (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (8): ReLU(inplace=True)\n", + " (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", + " (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (11): ReLU(inplace=True)\n", + " (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (13): ReLU(inplace=True)\n", + " (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (15): ReLU(inplace=True)\n", + " (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", + " (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (18): ReLU(inplace=True)\n", + " (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (20): ReLU(inplace=True)\n", + " (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (22): ReLU(inplace=True)\n", + " (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", + " (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (25): ReLU(inplace=True)\n", + " (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (27): ReLU(inplace=True)\n", + " (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", + " (29): ReLU(inplace=True)\n", + " (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", + " )\n", + " (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))\n", + " (classifier): Sequential(\n", + " (0): Linear(in_features=25088, out_features=4096, bias=True)\n", + " (1): ReLU(inplace=True)\n", + " (2): Dropout(p=0.5, inplace=False)\n", + " (3): Linear(in_features=4096, out_features=4096, bias=True)\n", + " (4): ReLU(inplace=True)\n", + " (5): Dropout(p=0.5, inplace=False)\n", + " (6): Linear(in_features=4096, out_features=1000, bias=True)\n", + " )\n", + " )\n", + " (classifier): 
Sequential(\n", + " (0): Linear(in_features=1000, out_features=2, bias=True)\n", + " )\n", + ")\n" + ] + } + ], + "source": [ + "# device configuration\n", + "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", + "# load trained torch model\n", + "model = VGG16_pretrained(num_classes=2).to(device)\n", + "model.load(input_fpath=cons.torch_model_pt_fpath)\n", + "# print model summary\n", + "print(model)" + ] + }, + { + "cell_type": "markdown", + "id": "51886ec7", + "metadata": {}, + "source": [ + "## Model Performance\n", + "\n", + "The model was trained across 10 epochs. Learning rate reduction on plateau and early stopping were implemented as part of training procedure. The model accuracy and loss are plotted below across the training and validation sets.\n", + "\n", + "![Model Accuracy](torch/model_accuracy.png)\n", + "\n", + "![Model Loss](torch/model_loss.png)\n", + "\n", + "## Model Image Predictions\n", + "\n", + "The model predictions were made for the Kaggle test set, see below example model predictions.\n", + "\n", + "![Predicted Images](torch/pred_images.jpg)" + ] + }, + { + "cell_type": "markdown", + "id": "0c7419e8", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "catclass", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/report/torch_analysis_results.qmd b/report/torch_analysis_results.qmd new file mode 100644 index 0000000..28c8aab --- /dev/null +++ b/report/torch_analysis_results.qmd @@ -0,0 +1,66 @@ +--- +title: "Torch Analysis Results" +format: + html: + toc: true + toc-location: left + toc-depth: 2 + toc-title: Contents + code-fold: false + echo: false +jupyter: python3 +--- + +# Cats vs Dogs Image Classification + +```{python} +import sys + +sys.path.append("../model") +import cons + +import torch +from model.torch.VGG16_pretrained import VGG16_pretrained +``` + +This project aims to create a model to classify cat and dog images. The data was sourced from the [dogs-vs-cats](https://www.kaggle.com/competitions/dogs-vs-cats/overview) Kaggle competition, and also from [freeimages.com](https://www.freeimages.com/) using a web scraper. Docker containers were used to deploy the application on an EC2 spot instances in order to scale up hardware and computation power. + +## Example Image + +![Random Image](torch/random_image.jpg) + +## Data Processing + +The images were resized to a uniform dimension and the colour channels normalised prior to the modelling training phase. See example image processing below. + +![Generator Plot](torch/generator_plot.jpg) + +## VGG16 Model Architecture + +A pre-trained VGG CNN model with 16 layers was trained using the processed images via PyTorch. See VGG16 diagram below, as well as torch model summary. Stochastic gradient descent was implemented to optimize the training criterion function cross entropy loss. 
+ +![AlexNet Architecture](torch/VGG16_architecture.png) + +```{python} +# device configuration +device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') +# load trained torch model +model = VGG16_pretrained(num_classes=2).to(device) +model.load(input_fpath=cons.torch_model_pt_fpath) +# print model summary +print(model) +``` + +## Model Performance + +The model was trained across 10 epochs. Learning rate reduction on plateau and early stopping were implemented as part of training procedure. The model accuracy and loss are plotted below across the training and validation sets. + +![Model Accuracy](torch/model_accuracy.png) + +![Model Loss](torch/model_loss.png) + +## Model Image Predictions + +The model predictions were made for the Kaggle test set, see below example model predictions. + +![Predicted Images](torch/pred_images.jpg)