Ingredients Spellcheck integration into Hunger Games #1052

Open
jeremyarancio opened this issue Oct 3, 2024 · 3 comments

Comments

@jeremyarancio

Problem

The Ingredients Spellcheck has been developed and is now operational in batch inference mode. So far, we have corrected 10,000 ingredient lists, which are stored in the Robotoff database as insights (insight type name: ingredient_spellcheck).

We are working on integrating the predictions into the contributor workflow, but there is no integration in Hunger Games yet.

Proposed solution

It would be great to have an integration in Hunger Games that lets users validate the Spellcheck predictions.

Mockups

Something like this, but with the possibility of validating or correcting the prediction.

[Image: mockup]

@jeremyarancio
Author

Before the official integration into Hunger Games, we released a Spellcheck annotation feature hosted on Hugging Face.
https://huggingface.co/spaces/openfoodfacts/ingredients-spellcheck-annotate

It has already received a lot of feedback from users:

  • Annotators should be able to validate, correct or skip the prediction. "Correct" means modifying the prediction text.
  • Differences between the original text and the model prediction should be highlighted to show deleted, added and replaced characters. The highlighting and the editable text should be merged into a single view for clarity.
  • Clear explanations of what is expected from users and the configuration of the feature:

    - You are given the original ingredient list text as stored in the Open Food Facts (OFF) database, the Spellcheck prediction, and optionally a picture of the product.
    - Your task, if you accept it 💣, is to review the Spellcheck prediction by either validating or correcting it.
    - The picture is only there as a reference to help you during the annotation. The language of the text and of the picture may differ. Keep calm and focus on the text.
    - The producer may have made a mistake on the product packaging. Since we parse the ingredient list to extract information from it, please fix such typos as well.
    - Deleted whitespaces are indicated as # and additional whitespaces are indicated as ^.

  • What if the producer made a mistake in an ingredient on the packaging: should the annotator correct it? => YES
  • A link to the product page would help with the annotation.
  • A language filter would be highly appreciated.
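
The highlighting request above can be sketched with Python's standard difflib. This is only an illustration, not the code used in the Hugging Face demo: the function name highlight_diff and the [-...-] / [+...+] markers are hypothetical, while the # and ^ whitespace markers follow the convention described in the feedback.

```python
from difflib import SequenceMatcher


def highlight_diff(original: str, corrected: str) -> str:
    """Return the corrected text with inline markers for each change.

    Hypothetical marker scheme (an assumption, not the demo's format):
      [-x-]  characters deleted from the original
      [+x+]  characters added by the Spellcheck
      #      deleted whitespace
      ^      additional whitespace
    """
    parts = []
    # autojunk=False keeps the character-level comparison exact on long texts.
    matcher = SequenceMatcher(a=original, b=corrected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = original[i1:i2], corrected[j1:j2]
        if tag == "equal":
            parts.append(new)
        elif tag == "delete":
            parts.append("#" if old.isspace() else f"[-{old}-]")
        elif tag == "insert":
            parts.append("^" if new.isspace() else f"[+{new}+]")
        else:  # "replace": show the old characters, then the new ones
            parts.append(f"[-{old}-][+{new}+]")
    return "".join(parts)


print(highlight_diff("su gar", "sugar"))       # su#gar
print(highlight_diff("saltsugar", "salt sugar"))  # salt^sugar
print(highlight_diff("suger", "sugar"))        # sug[-e-][+a+]r
```

In a real interface the markers would likely be rendered as colored spans inside the editable text field rather than as plain characters.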

@alexfauquette
Member

@jeremyarancio How do you pick the product to annotate in your Hugging Face interface?

I started an ingredient extraction game a few months ago but more or less dropped it, because it took around 30 s to fetch products that have pictures but are missing the text.

@jeremyarancio
Author

> @jeremyarancio How do you pick the product to annotate in your Hugging Face interface?
>
> I started an ingredient extraction game a few months ago but more or less dropped it, because it took around 30 s to fetch products that have pictures but are missing the text.

Hey Alex!

There are currently fewer than 5,000 products that were corrected by our Spellcheck and stored as insights in the Robotoff database.

I used api/insights/random from Robotoff to fetch a product insight, then used the Product Opener (PO) API with the product barcode to get the image URL.

I then used the Robotoff api/insights/annotate endpoint, together with the specific integration I built for Ingredients Spellcheck, to update the product's list of ingredients.

And voilà!

You'll find all the code in the scripts behind the demo:

https://huggingface.co/spaces/openfoodfacts/ingredients-spellcheck-annotate/tree/main
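
For readers who do not want to open the repository, the three calls described above could look roughly like the sketch below. It only builds the HTTP requests and does not send them. Everything beyond the two endpoint paths named in this comment is an assumption: the base URLs, the query parameters (type, count, fields), the image_ingredients_url field, the form fields (insight_id, annotation, update, data) and the meaning of the annotation values should all be checked against the Robotoff and Open Food Facts API documentation and the demo scripts.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URLs; not stated in this thread.
ROBOTOFF_API = "https://robotoff.openfoodfacts.org/api/v1"
PRODUCT_API = "https://world.openfoodfacts.org/api/v2"


def random_insight_request(insight_type: str = "ingredient_spellcheck",
                           count: int = 1) -> Request:
    """Build the request that asks Robotoff for random insights of one type."""
    query = urlencode({"type": insight_type, "count": count})  # assumed parameter names
    return Request(f"{ROBOTOFF_API}/insights/random?{query}", method="GET")


def product_image_request(barcode: str) -> Request:
    """Build the request that fetches the ingredients image URL of a product."""
    query = urlencode({"fields": "image_ingredients_url"})  # assumed field name
    return Request(f"{PRODUCT_API}/product/{barcode}?{query}", method="GET")


def annotate_request(insight_id: str,
                     corrected_text: Optional[str] = None) -> Request:
    """Build the request that validates or corrects a Spellcheck insight.

    Assumption: annotation=1 accepts the prediction as is, and annotation=2
    accepts it with user-provided data. Skipping sends no request at all.
    """
    form = {"insight_id": insight_id, "annotation": 1, "update": 1}
    if corrected_text is not None:
        form["annotation"] = 2
        form["data"] = json.dumps({"annotation": corrected_text})
    return Request(f"{ROBOTOFF_API}/insights/annotate",
                   data=urlencode(form).encode(), method="POST")


if __name__ == "__main__":
    for req in (random_insight_request(),
                product_image_request("0000000000000"),  # placeholder barcode
                annotate_request("some-insight-id", "sugar, salt")):
        print(req.get_method(), req.full_url)
```

Each Request can be passed to urllib.request.urlopen to actually perform the call; annotating on behalf of a user will most likely also require authentication, which is left out here.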

Projects
Status: To Discuss & Validate