
Replace SPARQL queries and OWLTools by a ROBOT plugin #1174

Merged on Feb 15, 2025 (11 commits)


@gouttegd gouttegd commented Feb 3, 2025

This PR experiments with using a ROBOT plugin to replace both

(1) SPARQL queries (as suggested in #1169), and

(2) OWLTools (still used in standard workflows for two things: normalising an OBO source file, and creating subsets -- #622).

For now, this is using my own “experimental ROBOT plugin”. If we are happy with the experiment, we can then create a proper ODK plugin for ROBOT (and/or push some of the features upstream to ROBOT).

Install my experimental ROBOT plugin as a built-in plugin under the name
"odk". This is for experimentation only -- I use this plugin to trial
the use of pluggable commands in the ODK workflows.

If we go down that route, we will create a dedicated ODK plugin
later.

When preparing import modules, we do a few things:

(1) add a dc:source ontology annotation, derived from the version IRI of
the original ontology;
(2) remove all other ontology annotations, keeping only the newly added
dc:source;
(3) inject proper SubAnnotationPropertyOf axioms for properties
representing subsets and synonym types.

All those steps are currently performed by SPARQL queries. Here we
replace those queries by calls to the `odk:annotate` command, which
takes care of (1) and (2), and to the `odk:normalize` command, which
takes care of (3).
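As a rough illustration of steps (1) and (2), the following sketch models an ontology header as a plain dictionary. It is not the plugin's implementation: the `prepare_import_header` function, the dictionary layout, the example IRIs, and the behaviour when no version IRI is present are all assumptions made for the example, as is the expansion of `dc:source` to the Dublin Core elements namespace.

```python
# Assumed expansion of dc:source (Dublin Core elements namespace).
DC_SOURCE = "http://purl.org/dc/elements/1.1/source"


def prepare_import_header(header):
    """Return a copy of `header` carrying only a dc:source annotation.

    `header` is a hypothetical dict with the keys "iri", "version_iri"
    and "annotations" (a list of (property, value) pairs).
    """
    version_iri = header.get("version_iri")
    annotations = []
    if version_iri is not None:
        # Step (1): derive dc:source from the original version IRI.
        annotations.append((DC_SOURCE, version_iri))
    # Step (2): all pre-existing annotations are dropped, because the new
    # list is built from scratch instead of being copied from `header`.
    return {
        "iri": header.get("iri"),
        "version_iri": version_iri,
        "annotations": annotations,
    }


# Made-up header used only to exercise the function.
original = {
    "iri": "http://example.org/onto.owl",
    "version_iri": "http://example.org/releases/2025-01-01/onto.owl",
    "annotations": [
        ("http://purl.org/dc/terms/license", "https://example.org/license"),
        ("http://www.w3.org/2000/01/rdf-schema#comment", "some comment"),
    ],
}
prepared = prepare_import_header(original)
```

The input header is left untouched; only the returned copy has its annotations reduced to the single dc:source entry.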

Of note, because we no longer go through a SPARQL processing step, we
could end up with duplicated axioms that carry different sets of
annotations. Those were automatically merged as a side-effect of the
SPARQL processing (which involves dumping the output of the SPARQL
processing and re-parsing it into OWLAPI objects). Since we no longer
benefit from that side-effect, we must explicitly include a step in
which we merge duplicated axioms (theoretically this could be done with
`robot repair --merge-axiom-annotations`, but unfortunately this command
does not behave exactly as we would like [1]).

[1] ontodev/robot#1239
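To make "merging duplicated axioms" concrete, here is a minimal sketch of the idea, assuming axioms are represented as `(logical_part, annotations)` pairs. This is a toy model, not OWLAPI and not the code used by `odk:normalize`; the function name and the data layout are hypothetical.

```python
def merge_duplicated_axioms(axioms):
    """Merge axioms that are identical except for their annotations.

    `axioms` is an iterable of (logical_part, annotations) pairs, where
    `logical_part` is any hashable value standing for the axiom without
    its annotations, and `annotations` is an iterable of
    (property, value) pairs.
    """
    merged = {}
    for logical_part, annotations in axioms:
        # Axioms with the same logical part end up in the same bucket;
        # their annotation sets are unioned.
        merged.setdefault(logical_part, set()).update(annotations)
    # Dicts preserve insertion order, so the first occurrence of each
    # axiom decides its position in the output.
    return [(part, frozenset(annots)) for part, annots in merged.items()]


# Two copies of the same SubClassOf axiom, each with a different annotation.
axioms = [
    (("SubClassOf", "EX:0001", "EX:0002"), [("oboInOwl:source", "PMID:1")]),
    (("SubClassOf", "EX:0001", "EX:0002"), [("oboInOwl:source", "PMID:2")]),
    (("SubClassOf", "EX:0003", "EX:0002"), []),
]
result = merge_duplicated_axioms(axioms)
```

After merging, the first axiom appears once and carries both annotations, while the unannotated axiom is unchanged.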
We are still using OWLTools for two things:

(1) creating ontology subsets;
(2) merging duplicated axioms in the source file.

Those tasks can now be done by the `odk:subset` command and the
`odk:normalize` command, respectively.

The inject-subset-declaration.ru and inject-synonymtype-declaration.ru
SPARQL queries are no longer used in any standard workflows.

Now that the standard workflows no longer use OWLTools, there is no
longer any need for OWLTools to be present in ODKLite (whose purpose is
to contain all the tools needed by the standard workflows, and only
those tools). We thus move it to ODKFull.
@gouttegd gouttegd marked this pull request as draft February 3, 2025 20:10
@gouttegd gouttegd self-assigned this Feb 3, 2025
@gouttegd gouttegd added this to the 1.6 milestone Feb 3, 2025
By default, the `odk:subset` command does _not_ send the generated subset
down the ROBOT pipeline, unless the `--replace true` option is used.
gouttegd and others added 3 commits February 4, 2025 23:24
We add a new option in the 'robot_report' section called
'upper_ontology'. If set, it should be the (resolvable) IRI of an upper
ontology (such as http://purl.obolibrary.org/obo/cob.owl).

When set, a new report is added to the list of ROBOT reports, one that
tests whether all classes of the ontology are classified under one of
the classes of the upper ontology. The new report uses the same
parameters as the standard ROBOT reports regarding the file to perform
the check on ('report_on') and whether the check should be limited to
classes within the project's namespaces or not ('use_base_iris').

See #1175
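The alignment test described above can be pictured as a reachability check over the class hierarchy. The sketch below is only an illustration under simplifying assumptions: the hierarchy is given as a dictionary of asserted direct parents (the real check may rely on a reasoner), every class to be checked has an entry in that dictionary, and the function name, the prefix-based filter, and the class identifiers are all made up for the example.

```python
def find_unaligned_classes(subclass_of, upper_classes, base_iris=None):
    """Return the classes that are not under any upper ontology class.

    subclass_of   -- dict mapping each class to the set of its direct parents
    upper_classes -- set of classes belonging to the upper ontology
    base_iris     -- optional list of prefixes; when given, only classes
                     starting with one of them are checked
    """
    def is_aligned(cls):
        seen = set()
        stack = [cls]
        while stack:
            current = stack.pop()
            if current in upper_classes:
                return True
            if current in seen:
                continue  # guard against cycles in the hierarchy
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
        return False

    candidates = list(subclass_of)
    if base_iris:
        prefixes = tuple(base_iris)
        candidates = [c for c in candidates if c.startswith(prefixes)]
    return sorted(c for c in candidates if not is_aligned(c))


# Toy hierarchy: EX:neuron -> EX:cell -> COB:material_entity
hierarchy = {
    "EX:cell": {"COB:material_entity"},
    "EX:neuron": {"EX:cell"},
    "EX:orphan": set(),
    "OTHER:thing": set(),
}
upper = {"COB:material_entity"}

all_unaligned = find_unaligned_classes(hierarchy, upper)
own_unaligned = find_unaligned_classes(hierarchy, upper, base_iris=["EX:"])
```

The `base_iris` argument plays the role that 'use_base_iris' plays in the description: when it is given, classes outside the project's namespaces are not reported.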
This commit fixes several issues with the generated Makefile rule that
performs the alignment check:

* Only include the alignment report when an upper ontology is defined.
* When asked to perform the check on the -edit file, actually perform it
  on the $(SRCMERGED) file (for consistency with other reports).
* Use the reasoner defined in the project, if any.
* Fix formatting so that the generated rules are somewhat readable.

The status of the `project.robot_report.upper_ontology` field cannot
simply be tested with either 'is defined' or 'is not none', because it
will depend on whether a `robot_report` section exists at all:

* without a `robot_report` section, the field is *defined* but is None
  (default value);
* with a `robot_report` section but no `upper_ontology` field, the field
  is *not defined*.

So to cover both cases, we need to test both for the existence of the
field, and whether it is None or not.
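The two cases can be mimicked in plain Python by modelling the `robot_report` section as a dictionary. This is only an analogy and rests on an assumption about how the section reaches the template: a missing section is replaced by defaults in which `upper_ontology` is present but `None`, while a user-supplied section contains only the keys the user wrote. In a Jinja2 template, the combined test would presumably be written with the built-in `defined` and `none` tests, along the lines of `project.robot_report.upper_ontology is defined and project.robot_report.upper_ontology is not none`.

```python
def has_upper_ontology(robot_report):
    """Return True only if the field exists AND holds an actual value."""
    return (
        "upper_ontology" in robot_report               # "is defined"
        and robot_report["upper_ontology"] is not None  # "is not none"
    )


# Case 1: no robot_report section, so (assumed) defaults apply and the
# field is present but None.
default_report = {"upper_ontology": None, "use_base_iris": True}

# Case 2: a robot_report section without an upper_ontology field, so the
# key is missing altogether.
partial_report = {"use_base_iris": True}

# Case 3: the field is set to an upper ontology IRI.
configured_report = {"upper_ontology": "http://purl.obolibrary.org/obo/cob.owl"}

results = [
    has_upper_ontology(default_report),
    has_upper_ontology(partial_report),
    has_upper_ontology(configured_report),
]
```

Testing only for presence would wrongly accept case 1, and testing only for `None` would fail on case 2 (a `KeyError` here, an undefined value in a template), which is why both tests are combined.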
@gouttegd gouttegd marked this pull request as ready for review February 5, 2025 16:44
@gouttegd gouttegd requested a review from matentzn February 5, 2025 16:45

@matentzn matentzn left a comment


this is sooooo awesome. I have a few questions (in code) and 1 general concern.

Is there any way you would agree to move the ODK ROBOT plugin into the INCATools org? I would feel a bit better if a component that is shaping up to be a core part of the ODK build system lived a bit more visibly here.


gouttegd commented Feb 8, 2025

> Is there any way you would agree to move the ODK robot plugin into the INCATools org?

There is currently no ODK ROBOT plugin. This PR is using my experimental plugin (emphasis on experimental), which nobody should ever think about using seriously in a production pipeline. That plugin is basically my playground, where I have fun testing various ideas that may or may not turn out to be good ideas.

When there is a ROBOT plugin (using consolidated ideas from my experimental one), it could of course be hosted somewhere in the INCATools organisation. Its Java namespaces would likely still be somewhere under org.incenp.obofoundry, though, since the “INCATools organisation” is merely a GitHub thing that does not have its own domain name.


gouttegd commented Feb 8, 2025

> There is currently no ODK ROBOT plugin.

Well, this is no longer true. :) The “ODK ROBOT plugin” exists now. It is derived from my experimental plugin, but contains only the commands that are deemed useful for the ODK, not the more dubious experimental ones.

It is in my own account for now, but happy to transfer its ownership to the INCATools organisation if you prefer.

Replace my experimental ROBOT plugin by a proper ODK plugin.

This involves modifying the calls to some commands, as the consolidated
ODK plugin uses slightly different options compared to the original
experimental plugin.
@gouttegd gouttegd changed the title from "Experiment with a ODK plugin for ROBOT" to "Replace SPARQL queries and OWLTools by a ROBOT plugin" Feb 11, 2025
@gouttegd

@matentzn Any objection to this PR?

I tested it on both Uberon and CL without noticing any problems. (Didn’t test on FlyBase ontologies, since the most consequential change by far is in the way we produce subsets, and FlyBase ontologies do not define subsets.)


@matentzn matentzn left a comment


Fantastic! Looks great.

I am approving this, with the tiny reservation that the exact nature of the odk:validate command is still a bit in flux (i.e. we would label this feature as "experimental" if we were to make a release tomorrow).

@gouttegd

> the tiny reservation that the exact nature of the odk:validate command is still a bit in flux

Fine with me. Now that the skeleton for the validation workflow is there, it can of course be updated at any time.

@gouttegd gouttegd merged commit 167f9cc into master Feb 15, 2025
1 check passed
@gouttegd gouttegd deleted the use-odk-plugin branch February 15, 2025 11:22