diff --git a/README.md b/README.md
index c26e222..17de7c5 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
# VisualFashionAttributePrediction
-Extraction of fashion product attributes based on images from the apparel industry
+Extraction of fashion product attributes based on their images.
![](MISC/header.png)
## About This Project
-This repository implements my solution of the Kaggle iMaterialist Challenge (Fashion) at FCVC5 . The competition's goal was to predict attributes of products from the apparel industry based on images of the products. The products are selected from a variety of domains e.g. shoes, jackets, necklaces and many more and the target features contain information about the product's category, material, appearance and more. For more information please refer to the competition website.
+This repository implements my solution to the Kaggle iMaterialist Challenge (Fashion) at FCVC5. The competition's goal was to predict attributes of products from the apparel industry based on images of the products. The products come from a variety of domains, e.g. shoes, jackets, necklaces and many more, and the target attributes contain information about the product's category, material, appearance and more. For more information, please refer to the competition website.
My personal goal for this competition was to solidify my PyTorch skills and to familiarize myself with PyTorch Lightning, an open-source Python library that provides a high-level interface for PyTorch. For this reason, I focused solely on training a single model (ResNet50) and did not put much effort into stacking several models together, as suggested by the winner of the Kaggle competition.
@@ -45,7 +45,13 @@ Predicting the attributes of a product based on its image allows for matching s
For more information please refer to the jupyter notebook `notebooks/product_matching.ipynb`.
-
+## Requirements
+- torch 1.4.0
+- pytorch_lightning 1.0.6
+- torchvision 0.5.0
+- Pillow 7.0.0
+- onnx 1.8.0
+- onnxruntime 1.5.2
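Assuming a pip-based environment, the pinned versions above can be installed in one step (note that on PyPI the `pytorch_lightning` package is published as `pytorch-lightning`):

```shell
# install the pinned dependencies listed above
pip install torch==1.4.0 pytorch-lightning==1.0.6 torchvision==0.5.0 Pillow==7.0.0 onnx==1.8.0 onnxruntime==1.5.2
```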
## QuickStart
The following steps will enable you to use a pretrained model to predict the attributes of a fashion product. You can either use example data provided in this repository or test the model on your own images.
@@ -53,11 +59,12 @@ The following steps will enable you to use a pretrained model to predict the att
- clone the repository
- download some example data by executing `download_iMaterialistValSet.py`. This will download the validation set of the iMaterialist dataset. If you want to work with more data, only small changes to `download_iMaterialistValSet.py` and a few additional files from the competition website are necessary.
- download the pretrained model weights from here (not included in repo due to quota constraints)
-- open the jupyter notebook `notebooks/score_model.ipynb`
+- download the pretrained model from here
+- open the jupyter notebook `notebooks/attribute_prediction.ipynb`
- follow the notebook instructions
## Model Architecture + Training
-The model architecture consists of a standard ResNet50 body that was pretrained on ImageNet and is provided by pytorch. I replaced the fully connected head by a fully connected output layer of shape (2048[output of resnet_body] x 228[number of different features]). As we have a situation in which multiple features can be right for the same samples, instead of a crossentropy loss, I use a 228-dimensional binary cross entropy loss.I chose to use a standard Adam optimizer with default parameters and leave it to pycharm lightning to take care of the learning rate schedule.
+The model architecture consists of a standard ResNet50 body that was pretrained on ImageNet and is provided by PyTorch. I replaced the fully connected head with a fully connected output layer of shape (2048 [output of the ResNet body] x 228 [number of different attributes]). Since multiple attributes can be correct for one sample, I use a 228-dimensional binary cross-entropy loss instead of a cross-entropy loss. I chose a standard Adam optimizer with default parameters and leave it to PyTorch Lightning to take care of the learning rate schedule.
I used random color jitter and random horizontal flipping as image augmentation techniques (provided by torchvision.transforms). The images from the iMaterialist dataset are augmented, normalized (mean: [0.6765, 0.6347, 0.6207], std: [0.3284, 0.3371, 0.3379]) and resized to 512x512.
@@ -70,4 +77,6 @@ Due to quota limitations on GitHub, I provide trained models on this official PyTorch documentation. Please make sure to use normalized RGB images as input for the model (mean: [0.6765, 0.6347, 0.6207], std: [0.3284, 0.3371, 0.3379]), otherwise the model performance may decrease drastically!
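Since the requirements include onnxruntime, scoring an exported model with the normalization described above might look like the following sketch (`model.onnx`, the file paths and the helper names are hypothetical, not names from this repository):

```python
import numpy as np
from PIL import Image

MEAN = np.array([0.6765, 0.6347, 0.6207], dtype=np.float32)
STD = np.array([0.3284, 0.3371, 0.3379], dtype=np.float32)

def preprocess(path):
    """Load an RGB image and normalize it the way the model expects."""
    img = Image.open(path).convert("RGB").resize((512, 512))
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - MEAN) / STD                                   # per-channel normalization
    return np.ascontiguousarray(x.transpose(2, 0, 1))[np.newaxis]  # NCHW: (1, 3, 512, 512)

def predict_attributes(image_path, model_path="model.onnx"):
    # imported lazily so the preprocessing alone works without onnxruntime installed
    import onnxruntime as ort
    session = ort.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    logits = session.run(None, {input_name: preprocess(image_path)})[0]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> per-attribute probabilities
```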