Validate ontology relationships using Ubergraph as the source of truth. Relationships in this context may be subClassOf axioms between named classes (e.g. 'lymphocyte' subClassOf 'cell') or existential restrictions (e.g. 'enterocyte' part_of some 'intestinal epithelium').
Ubergraph is an RDF triplestore that merges 39 OBO ontologies and includes a precomputed OWL classification and materialised class relationships derived from existential property restrictions. Validation therefore works for any directly asserted or inferred/indirect subClassOf relationship or existential restriction.
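To make the idea of checking a triple concrete, here is a minimal sketch that expands CURIEs to OBO PURL IRIs and builds a SPARQL ASK query for one (subject, relationship, object) triple. The helper names `curie_to_iri` and `build_ask_query` are hypothetical and are not part of this package; the sketch does not claim to reproduce the query the package actually sends to Ubergraph.

```python
# Illustrative only: these helpers are hypothetical, not the package's API.
OBO_BASE = "http://purl.obolibrary.org/obo/"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def curie_to_iri(curie: str) -> str:
    """Expand an OBO-style CURIE (e.g. BFO:0000050) to its PURL IRI."""
    if curie == "rdfs:subClassOf":
        return RDFS_SUBCLASS_OF
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_BASE}{prefix}_{local_id}"


def build_ask_query(subject: str, relationship: str, obj: str) -> str:
    """Build a SPARQL ASK query testing whether the triple exists."""
    s, p, o = (curie_to_iri(c) for c in (subject, relationship, obj))
    return f"ASK {{ <{s}> <{p}> <{o}> }}"


# 'lymphocyte' subClassOf 'cell'
query = build_ask_query("CL:0000542", "rdfs:subClassOf", "CL:0000000")
print(query)
```

Because Ubergraph materialises existential restrictions as direct triples, the same query shape would apply to a relationship such as part_of (BFO:0000050).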
This package depends on Graphviz and OBOGraphviz to represent the validation as a graph.
On macOS:
brew install graphviz
On Linux:
apt install graphviz
On other platforms, please follow the official Graphviz installation instructions.
Before installing OBOGraphviz, make sure you have Node.js version >= 14.16 installed. Please follow the official instructions to install Node.js and npm.
Then install the obographviz package globally:
npm install -g obographviz
pip install verificado
The config file defines the list of relationships the validation should run on. The order of the relationships is essential.
The YAML file must contain the keys relationships and filename. See the example below:
relationships:
  sub_class_of: rdfs:subClassOf
  part_of: BFO:0000050
  connected_to: RO:0001025
  has_soma_location: RO:0002100
  ...
filename: path/to/filename.csv
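As a quick sanity check before running the validation, the required keys can be verified once the YAML has been loaded into a dictionary. This is a minimal sketch: `missing_config_keys` is a hypothetical helper, not a function provided by this package, and the dictionary below stands in for the parsed config.

```python
# Hypothetical helper; the package may perform its own checks differently.
REQUIRED_KEYS = ("relationships", "filename")


def missing_config_keys(config: dict) -> list:
    """Return the required keys that are absent from the config."""
    return [key for key in REQUIRED_KEYS if key not in config]


# Stand-in for the parsed YAML file shown above.
config = {
    "relationships": {
        "sub_class_of": "rdfs:subClassOf",
        "part_of": "BFO:0000050",
    },
    "filename": "path/to/filename.csv",
}

print(missing_config_keys(config))    # []
# Python dictionaries keep insertion order, so the order of the
# relationships written in the file is preserved.
print(list(config["relationships"]))  # ['sub_class_of', 'part_of']
```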
The input file can be in TSV or CSV format. When using CSV, enclose a label in double quotes if it contains a comma. It's preferred to have the following columns:
s | slabel | user_slabel | o | olabel | user_olabel |
---|---|---|---|---|---|
the subject term ID | the label of the term in the column s | optional label for the term given by user | the object term ID | the label of the term in the column o | optional label for the term given by user |
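One way to produce a correctly quoted CSV file is to let Python's standard csv module handle the quoting. The sketch below is only an illustration: the term IDs and labels are made-up placeholders (the `EX:` prefix is not a real ontology), and the file is written to memory rather than to disk.

```python
import csv
import io

COLUMNS = ["s", "slabel", "user_slabel", "o", "olabel", "user_olabel"]

# Placeholder row; the IDs and labels are invented for illustration.
rows = [
    {
        "s": "EX:0000001",
        "slabel": "example cell, subtype A",  # label containing a comma
        "user_slabel": "",
        "o": "EX:0000002",
        "olabel": "example tissue",
        "user_olabel": "",
    },
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)  # fields containing commas are double-quoted

csv_text = buffer.getvalue()
print(csv_text)
```

With the default quoting mode, only the fields that need it (here, the label with a comma) are wrapped in double quotes.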
However, the package can also accept TSV or CSV files representing a hierarchy. You can specify any number of levels, each level defined by an ontology term ID and the label of the term. Please check an example in the tests directory.
Add to_be_parsed: true to the YAML file when using this type of file.
relationships:
  sub_class_of: rdfs:subClassOf
  part_of: BFO:0000050
  connected_to: RO:0001025
  has_soma_location: RO:0002100
  ...
filename: path/to/filename.csv
to_be_parsed: true
verificado validate --input path/to/config.yaml --output path/to/output.csv
The output.csv file will be in the same format as filename.csv. It lists the cases where a triple (subject, relationship, object), for any of the relationships listed in the YAML file, was not found in Ubergraph.
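The output can then be inspected with any CSV reader. The following sketch assumes the output uses the preferred columns described earlier (s, slabel, o, olabel, and the optional user labels); the sample content is a made-up placeholder held in memory, not real output from the tool.

```python
import csv
import io

# Placeholder standing in for the contents of output.csv.
sample_output = io.StringIO(
    "s,slabel,user_slabel,o,olabel,user_olabel\n"
    'EX:0000001,"example cell, subtype A",,EX:0000002,example tissue,\n'
)

# Collect the subject/object pairs that were not found in Ubergraph.
unverified = [(row["s"], row["o"]) for row in csv.DictReader(sample_output)]

print(f"{len(unverified)} pair(s) not found in Ubergraph")
for subject, obj in unverified:
    print(subject, "->", obj)
```

To read a real file, replace the in-memory sample with `open("path/to/output.csv", newline="")`.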
To find out which ontologies, and which versions of them, are available in Ubergraph, use the following command:
verificado ontologies_version --output filename.json