We can extract tags from packages using RAKE (Rapid Automatic Keyword Extraction). The raw output will need tuning and filtering, with as much of the filtering automated as possible. The datalog can be useful.
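For reference, a minimal sketch of RAKE, not tied to any particular library. It assumes the input is plain English text such as a package description, and the stopword list is a tiny illustrative one. Candidate phrases are runs of words between stopwords and punctuation. Each word is scored as degree / frequency, and a phrase scores the sum of its word scores. The names `candidate_phrases` and `rake` are made up for this example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real use needs a fuller one).
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # summed length of the phrases the word occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in phrases
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


description = (
    "A parser for JSON documents. "
    "Fast streaming JSON parser with zero allocations."
)
for phrase, score in rake(description):
    print(f"{score:5.1f}  {phrase}")
```

Long phrases dominate the ranking, which is one reason the output needs filtering before it is usable as tags.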
This requires the following:
In terms of normalisation, we can learn a great deal from lib.rs:
I normalize keywords to kebab-case, except CJK and a few exceptions like "iOS" which looks silly. I had to manage synonyms mostly manually: https://gitlab.com/crates.rs/crates.rs/-/blob/main/data/tag-synonyms.csv Joining adjacent keywords into pairs helps ["data", "structures"] => ["data-structures"]. Each keyword has a weight, and for similarity search I add hidden keywords: https://gitlab.com/crates.rs/crates.rs/-/blob/main/crate_db/src/lib_crate_db.rs#L306 For keyword extraction I take markdown sections into account: https://gitlab.com/crates.rs/crates.rs/-/blob/main/feat_extractor/src/lib.rs#L44 and use only never-seen-before sentences.
https://mastodon.social/@kornel/109508654611639728
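A rough sketch of the normalisation steps described in the quote: kebab-case with exceptions, a synonym table, and joining adjacent keywords into pairs. This is not lib.rs's code. The tables `KEEP_AS_IS`, `SYNONYMS` and `KNOWN_PAIRS` are hypothetical placeholders, the CJK ranges are a rough approximation, and the quote does not say how lib.rs decides which pairs to join, so the allow-list used here is an assumption.

```python
import re

# Hypothetical tables for illustration; lib.rs keeps its real synonyms
# in the tag-synonyms.csv file linked above.
KEEP_AS_IS = {"iOS"}
SYNONYMS = {"datastructures": "data-structures"}
KNOWN_PAIRS = {"data-structures"}  # assumption: pairs are joined via an allow-list

# Rough check for kana, common CJK ideographs and Hangul (an approximation).
CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def normalise(keyword):
    """Kebab-case a keyword, leaving CJK and listed exceptions untouched."""
    if keyword in KEEP_AS_IS or CJK.search(keyword):
        return keyword
    # Insert a hyphen at camelCase boundaries, then collapse separators.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", keyword)
    kebab = re.sub(r"[\W_]+", "-", spaced.lower()).strip("-")
    return SYNONYMS.get(kebab, kebab)


def join_pairs(keywords):
    """Merge adjacent keywords when their hyphenated pair is a known tag."""
    out = []
    i = 0
    while i < len(keywords):
        pair = "-".join(keywords[i:i + 2])
        if i + 1 < len(keywords) and pair in KNOWN_PAIRS:
            out.append(pair)
            i += 2
        else:
            out.append(keywords[i])
            i += 1
    return out


print([normalise(k) for k in ["Data Structures", "iOS", "datastructures"]])
print(join_pairs(["data", "structures", "json"]))
```

Keyword weights and the hidden keywords used for similarity search are left out; they would sit on top of these normalised tags.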
@qw04 will take this
jappeace