This repository contains the code to train a multi-label category classifier for Open Food Facts products.
It works within Robotoff and currently uses product_name and ingredients_tags as input features.
download.sh can help you download extra data.
experiments/Train.ipynb is the notebook used to train the model. Training can take several hours, depending on your machine.
The Threshold.ipynb notebook measures model performance to help decide where to set the threshold for automatic classification. Results are summarized in 2021-10-15-kulizhsy-category-classifier-performance.pdf.
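As an illustration of the kind of analysis involved in picking a threshold, here is a minimal sketch that computes micro-averaged precision and recall at several candidate thresholds and keeps the lowest one that reaches a target precision. It is not the code from Threshold.ipynb: the function names, the toy scores and labels, the candidate thresholds, and the 0.9 precision target are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def precision_recall_at(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    threshold: float,
) -> Tuple[float, float]:
    """Micro-averaged precision and recall for a multi-label classifier.

    scores[i][j] is the predicted score of category j for product i,
    labels[i][j] is 1 if product i really belongs to category j, else 0.
    """
    tp = fp = fn = 0
    for score_row, label_row in zip(scores, labels):
        for score, label in zip(score_row, label_row):
            predicted = score >= threshold
            if predicted and label:
                tp += 1
            elif predicted and not label:
                fp += 1
            elif label:
                fn += 1
    # Convention chosen for this sketch: precision is 1.0 when nothing is predicted.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lowest_threshold_for_precision(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    target_precision: float,
    candidates: Sequence[float],
) -> Optional[float]:
    """Return the smallest candidate threshold reaching the target precision."""
    for threshold in sorted(candidates):
        precision, _ = precision_recall_at(scores, labels, threshold)
        if precision >= target_precision:
            return threshold
    return None


# Toy data (hypothetical): 3 products, 3 categories.
scores: List[List[float]] = [
    [0.95, 0.40, 0.10],
    [0.80, 0.70, 0.05],
    [0.30, 0.90, 0.60],
]
labels: List[List[int]] = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 1],
]
candidates = [0.5, 0.6, 0.7, 0.8, 0.9]

for t in candidates:
    p, r = precision_recall_at(scores, labels, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")

chosen = lowest_threshold_for_precision(scores, labels, 0.9, candidates)
print("chosen threshold:", chosen)
```

Raising the threshold generally trades recall for precision, so choosing the lowest threshold that meets the precision target keeps as many automatic classifications as possible.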
A Data for Good project to add more features to the model has been initiated. You can find things to help with in the issue "What can I work on?".
On first install, after any change to the requirements or the Dockerfile, or in case of doubt, run:
docker-compose build
Then simply run:
docker-compose up
Look at the URL displayed in the console to open the notebook.
- The output of training should be published on Robotoff models as a release.
- The deployment from Robotoff models releases is already automated; see .github/workflows/container-deploy-ml.yml in the robotoff repository. You will have to add an ml-xxx tag to trigger the deployment.
If you want to develop, here is a sample installation using a virtual environment.
Install needed dependencies:
On Ubuntu:
sudo apt install python3-venv python3-dev build-essential
Create a virtual environment:
python3 -m venv .venv
Activate the virtual environment (you will have to activate every time you use the project):
. .venv/bin/activate
Install the packages from requirements.txt and, optionally, requirements-dev.txt:
pip install -r requirements.txt
pip install -r requirements-dev.txt
To launch Jupyter Notebook, run (after activating your virtual environment):
jupyter notebook
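Because the virtual environment has to be activated in every new shell, it can be useful to confirm that the running interpreter really is the one from .venv. The sketch below relies on the standard-library behaviour that sys.prefix differs from sys.base_prefix inside an environment created with venv; the helper name in_virtualenv is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if in_virtualenv():
    print("Virtual environment active:", sys.prefix)
else:
    print("No virtual environment active; run '. .venv/bin/activate' first.")
```

This can be run in a notebook cell or with the python3 command to check the environment before installing packages.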