CORD-19 Search is built on Vespa Cloud:
- Use the Query API to search.
- Use the left frame to navigate, drill down, or refine the query. This is implemented using Vespa Grouping.
- From the article view, find similar/related articles - implemented using SCIBERT-NLI embeddings.
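As a rough illustration of the first two points, the sketch below builds a Query API request URL that combines a free-text query with a grouping expression. The endpoint URL and the `journal` field name are placeholders chosen for the example, not values taken from this application; substitute the real endpoint and a field from the actual schema.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the application's real search URL.
ENDPOINT = "https://example.invalid/search/"


def build_query_url(user_query, group_field=None, hits=10):
    """Build a Vespa Query API GET URL for a free-text query.

    group_field is an assumed attribute name (for example "journal");
    when given, a grouping expression counting hits per value is appended.
    """
    yql = "select * from sources * where userQuery()"
    if group_field:
        # Grouping: one bucket per distinct field value, with a hit count.
        yql += f" | all(group({group_field}) each(output(count())))"
    yql += ";"
    params = {"yql": yql, "query": user_query, "hits": hits}
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query_url("coronavirus temperature", group_field="journal"))
```

The resulting URL can be fetched with any HTTP client. The grouping results come back alongside the regular hits, which is the kind of data a drill-down frame can be built from.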
The application is implemented as a Vespa Cloud sample application. Refer to "experiment yourself" to try out different rank profiles, including ML models.
Find the frontend code in this repo in src/App.
We use the citations from the scite.ai dataset, adding them by matching each DOI to those in the CORD-19 dataset.
We also plan to merge these citations with the citations we find in the dataset itself by matching a bibliography reference's title to titles in the dataset. (We find roughly 3x more citations that way, but we have no way to extract the sentiment, i.e. supporting or contradicting.)
We also need some way to extract the citation passage, i.e. the sentence(s) containing the reference.
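The two matching strategies described above (by DOI and by title) could be sketched roughly as follows. This is a minimal illustration, not the code used by the application: every record layout and field name here (`id`, `doi`, `title`, `source_doi`, `target_doi`, `sentiment`, `source_id`) is an assumption made for the example, and the title normalization is deliberately simple.

```python
import re


def normalize_doi(doi):
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)


def normalize_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_citations(articles, scite_citations, bib_refs):
    """Return {(citing_id, cited_id): sentiment or None}.

    All input field names are hypothetical:
      articles:        dicts with "id", optional "doi", optional "title"
      scite_citations: dicts with "source_doi", "target_doi", "sentiment"
      bib_refs:        dicts with "source_id" and the reference "title"
    """
    by_doi = {normalize_doi(a["doi"]): a["id"] for a in articles if a.get("doi")}
    by_title = {normalize_title(a["title"]): a["id"] for a in articles if a.get("title")}

    edges = {}
    # DOI-matched citations carry a sentiment label.
    for c in scite_citations:
        source = by_doi.get(normalize_doi(c["source_doi"]))
        target = by_doi.get(normalize_doi(c["target_doi"]))
        if source and target:
            edges[(source, target)] = c.get("sentiment")
    # Title-matched citations have no sentiment; keep any existing label.
    for r in bib_refs:
        target = by_title.get(normalize_title(r["title"]))
        if target and target != r["source_id"]:
            edges.setdefault((r["source_id"], target), None)
    return edges
```

Using `setdefault` for the title-matched edges means a citation found both ways keeps its sentiment label, while citations found only by title are recorded with `None`.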