label suggestions .. data visibility #314

Open
dobkeratops opened this issue Aug 1, 2024 · 0 comments
so there's a long tail of annotations that visitors won't see, because they're currently in non-productive labels.

would you have time every so often to add some of the most annotated label suggestions to the productive label list (the labels that everyone sees, e.g. on the stats screen)?

it may indeed take too long to validate all of them (20,000 suggestions vs 546 productive labels currently), but adding, say, another 10 every so often might help.
there are things like (man|woman)/(sitting|walking|running|...), left|right parts of other animals, tabletop, some combinations of material ("wooden tabletop"), new parts ("forearm"), and clothing types.

it might be possible to streamline this by looking for "/" combinations of already approved words. you might be able to just add a few more words and bulk enable a batch of suggestions this way.
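a rough sketch of that idea, assuming the suggestions and productive labels are available as plain lists of strings (the function names and sample data here are hypothetical, not taken from the project's code). the first function finds suggestions whose "/" parts are all already approved; the second counts which single unapproved word would unlock the most suggestions if it were added.

```python
from collections import Counter


def split_parts(label):
    """Split a label on "/" and normalise case and surrounding whitespace."""
    return [part.strip().lower() for part in label.split("/")]


def bulk_enable_candidates(suggestions, productive):
    """Return "/" suggestions whose parts are all productive labels already."""
    approved = {label.strip().lower() for label in productive}
    candidates = []
    for suggestion in suggestions:
        parts = split_parts(suggestion)
        if len(parts) > 1 and all(part in approved for part in parts):
            candidates.append(suggestion)
    return candidates


def words_that_unlock(suggestions, productive):
    """Count, per unapproved word, the "/" suggestions blocked only by it."""
    approved = {label.strip().lower() for label in productive}
    counts = Counter()
    for suggestion in suggestions:
        parts = split_parts(suggestion)
        missing = {part for part in parts if part not in approved}
        if len(parts) > 1 and len(missing) == 1:
            counts[missing.pop()] += 1
    return counts.most_common()


# Hypothetical sample data for illustration only.
productive = ["man", "woman", "sitting", "walking"]
suggestions = ["man/sitting", "woman/running", "man/running",
               "forearm", "woman/walking"]

print(bulk_enable_candidates(suggestions, productive))
# ['man/sitting', 'woman/walking']
print(words_that_unlock(suggestions, productive))
# [('running', 2)]
```

here approving "running" alone would make two more suggestions eligible, which is the "add a few more and bulk enable" step. anything flagged this way would still want a quick human check before going live.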

there are some mixtures of "_" and " ", and it might be possible to automatically make these synonyms ("wooden tabletop" = "wooden_tabletop"). the rationale for "_" was a grouping hint for when phrases are used in sentences or in "/" combinations (e.g. "a b/c" vs "a_b/c"; the latter is less ambiguous), but it hasn't been used consistently.
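as a minimal sketch of the synonym detection, assuming labels are plain strings: map every label to a canonical form where "_" and runs of whitespace become a single space, then group labels that share a canonical form. the names `canonical` and `synonym_groups` are made up for this example.

```python
from collections import defaultdict


def canonical(label):
    """Treat "_" and whitespace as the same separator; lowercase the result."""
    return " ".join(label.replace("_", " ").split()).lower()


def synonym_groups(labels):
    """Group labels that differ only in "_" vs " " (or in case/spacing)."""
    groups = defaultdict(set)
    for label in labels:
        groups[canonical(label)].add(label)
    # Keep only canonical forms that have more than one spelling.
    return {key: sorted(variants)
            for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample data for illustration only.
labels = ["wooden tabletop", "wooden_tabletop", "forearm", "a_b/c", "a b/c"]
for key, variants in synonym_groups(labels).items():
    print(key, "<-", variants)
```

note that this deliberately throws away the grouping hint, so "a_b/c" and "a b/c" end up in the same group. that is fine for finding likely duplicates, but the "_" spelling is probably the one to keep as the stored form, since it is the less ambiguous of the two.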

there are probably some good NLP neural nets around by now that could turn phrases into a usable embedding vector that could be trained on.
