Listing the common biological entities which Gilda fails to normalize #123

Open
Crispae opened this issue Nov 5, 2023 · 3 comments

Crispae commented Nov 5, 2023

| Raw | Gilda Normalized | Correct |
| --- | --- | --- |
| AD | ADIPOQ (hgnc:13633) | Alzheimer Disease (mesh:D000544) |

Crispae commented Nov 5, 2023

We also need to add a column with context.

bgyori commented Nov 5, 2023

Hi Crispae, thanks for reporting this. However, this looks like a misunderstanding. "AD" is a highly ambiguous string, and it is not possible to determine its sense without additional context.

Actually, Gilda does return both hgnc:13633 and mesh:D000544 in its list of matches for "AD", which is the expected behavior:

In [3]: gilda.ground('AD')
Out[3]: 
[ScoredMatch(Term(ad,AD,HGNC,13633,ADIPOQ,synonym,adeft,None,None,None),0.5555555555555556,Match(query=AD,ref=AD,exact=True,space_mismatch=False,dash_mismatches={},cap_combos=[])),
 ...
ScoredMatch(Term(ad,AD,MESH,D000544,Alzheimer Disease,synonym,adeft,None,None,None),0.5555555555555556,Match(query=AD,ref=AD,exact=True,space_mismatch=False,dash_mismatches={},cap_combos=[])),
 ...

If I provide meaningful additional context, Gilda can detect that the correct sense in that particular context is Alzheimer Disease:

In [7]: gilda.ground('AD', 'AD is a neurodegenrative disease and it is associated with Abeta accumulation.')
INFO: [2023-11-05 18:02:16] gilda.grounder - Running Adeft disambiguation for AD
Out[7]: 
[ScoredMatch(Term(ad,AD,MESH,D000544,Alzheimer Disease,synonym,adeft,None,None,None),0.5555485494219966,Match(query=AD,ref=AD,exact=True,space_mismatch=False,dash_mismatches={},cap_combos=[]),disambiguation={"type": "adeft", "score": 0.9999873889595938, "match": "grounded"}),
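
For anyone scripting this, here is a minimal sketch (not part of Gilda itself) of reducing the returned matches to a short summary with the best-scoring sense first. It assumes only that each match exposes `term.db`, `term.id`, `term.entry_name` and `score`, as the output above suggests. The helper names `summarize` and `ground_with_context` are hypothetical, and the stand-in objects and their scores are made up for illustration so the snippet runs without Gilda installed.

```python
from types import SimpleNamespace


def summarize(matches):
    """Reduce ScoredMatch-like objects to (db, id, name, score), best score first."""
    rows = [(m.term.db, m.term.id, m.term.entry_name, m.score) for m in matches]
    # Sorting is defensive; it keeps the helper correct for any input order.
    return sorted(rows, key=lambda row: row[3], reverse=True)


def ground_with_context(text, context=None):
    """Ground `text` with optional surrounding `context` and summarize the result."""
    import gilda  # imported lazily; requires Gilda to be installed

    return summarize(gilda.ground(text, context=context))


# Stand-in matches mimicking the two senses of "AD"; scores are illustrative only.
fake_matches = [
    SimpleNamespace(
        term=SimpleNamespace(db="HGNC", id="13633", entry_name="ADIPOQ"),
        score=0.1,
    ),
    SimpleNamespace(
        term=SimpleNamespace(db="MESH", id="D000544", entry_name="Alzheimer Disease"),
        score=0.9,
    ),
]

best = summarize(fake_matches)[0]
print(best)
```

With Gilda installed, `ground_with_context('AD', 'AD is a neurodegenerative disease ...')` should give the same kind of summary for real matches; without context, several senses can tie on score, so the first row is not meaningful on its own.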

Crispae commented Nov 6, 2023

Hi, thanks for the response. Yes, I checked that, and it can disambiguate. I was wondering how to add some more terms related to a specific use case. I went through the tutorial, but for each list of terms we need to build a different grounder. Is there any way to merge those terms with the default terms that already come with Gilda, so that it can ground entities against the whole term list rather than only a specific list of terms?

Thanks
