Merge pull request #3 from cmungall/nico-finalise
A bit of updates to the narrative
Showing 5 changed files with 103 additions and 43 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
cffi==1.15.0 | ||
errorhandler==2.0.1 | ||
git+https://github.com/manubot/manubot@d4242ffa4194e4a13a68c5f6466feff559d3f9d5 | ||
isbnlib==3.10.10 | ||
opentimestamps-client==0.7.1 | ||
opentimestamps==0.4.3 | ||
pandoc-eqnos==2.5.0 | ||
pandoc-fignos==2.4.0 | ||
pandoc-tablenos==2.3.0 | ||
pandoc-xnos==2.5.0 | ||
pandocfilters==1.5.0 | ||
panflute==2.2.3 | ||
psutil==5.9.4 | ||
pybase62==0.5.0 | ||
python-bitcoinlib==0.11.2 | ||
pyyaml==6.0 | ||
papermill | ||
plotly | ||
seaborn | ||
ontogpt==0.2.9 | ||
notebook==7.0.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,18 @@ | ||
## Abstract {.page_break_before} | ||
|
||
Mapping... | ||
Aligning terminological resources, including ontologies, controlled vocabularies and taxonomies, is a critical part of data integration in many domains such as healthcare, chemistry and biomedical research. | ||
|
||
Entity mapping is the process of determining correspondences between entities across these resources, such as gene identifiers, disease concepts or chemical entity identifiers. Many tools have been developed to compute such mappings based on common structural features and lexical information such as labels and synonyms. Lexical approaches in particular often provide very high recall, but low precision, due to lexical ambiguity. | ||
|
||
Large Language Models (LLMs), such as the ones employed by ChatGPT, have generalizable abilities to perform a wide range of tasks, including question answering and | ||
information extraction. | ||
|
||
Here we present *MapperGPT*, an LLM-based approach that refines and predicts mapping relationships | ||
as a post-processing step, working in | ||
concert with existing high-recall methods based on lexical and structural heuristics. | ||
|
||
We evaluated *MapperGPT* on a series of alignment tasks from different domains, including anatomy, developmental | ||
biology, and renal diseases. | ||
We devised a collection of tasks designed to be particularly challenging | ||
for lexical methods. We show that, when used in combination with high-recall methods, | ||
*MapperGPT* can provide a substantial improvement in accuracy, beating state-of-the-art methods such as LogMap. |