robot diff always shows all axioms have changed for blank nodes #1243
Comments
@paulmillar can you provide sample input that shows the issue?
@balhoff Thanks for the quick reply. Here's a simple example that illustrates the problem:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<https://metadata.example.org/2025/test>
a owl:Ontology;
dcterms:creator [
foaf:name "Fred Bloggs";
];
rdfs:comment "An example of a problem.";
.
Here's an example of the resulting robot diff output:
paul@monkeywrench:~$ robot diff --left test.ttl --right test.ttl
2 axioms in left ontology but not in right ontology:
- Annotation(<http://purl.org/dc/terms/creator> _:genid2147483648)
- AnnotationAssertion(<http://xmlns.com/foaf/0.1/name> _:genid2147483648 "Fred Bloggs")
2 axioms in right ontology but not in left ontology:
+ Annotation(<http://purl.org/dc/terms/creator> _:genid2147483649)
+ AnnotationAssertion(<http://xmlns.com/foaf/0.1/name> _:genid2147483649 "Fred Bloggs")
paul@monkeywrench:~$
I realise that I forgot to give the version of robot I'm using. It's robot v1.8.1:
paul@monkeywrench:~$ robot --version
ROBOT version 1.8.1
paul@monkeywrench:~$
v1.8.1 is pretty old, so I downloaded the latest version, which is currently v1.9.7. I was able to reproduce the problem with that version:
paul@monkeywrench:~$ java -jar ~/Downloads/robot.jar --version
ROBOT version 1.9.7
paul@monkeywrench:~$ java -jar ~/Downloads/robot.jar diff --left test.ttl --right test.ttl
2 axioms in left ontology but not in right ontology:
- Annotation(<http://purl.org/dc/terms/creator> _:genid2147483648)
- AnnotationAssertion(<http://xmlns.com/foaf/0.1/name> _:genid2147483648 "Fred Bloggs"^^xsd:string)
2 axioms in right ontology but not in left ontology:
+ Annotation(<http://purl.org/dc/terms/creator> _:genid2147483649)
+ AnnotationAssertion(<http://xmlns.com/foaf/0.1/name> _:genid2147483649 "Fred Bloggs"^^xsd:string)
paul@monkeywrench:~$
Thanks @paulmillar, the example is helpful since I wanted to make sure you were dealing with an anonymous individual, and not blank nodes related to the RDF representation of class expressions. I think this is basically the same as #1032. As @jamesaoverton noted there, I guess it turns into a graph isomorphism problem, which is tricky. We could add an option to simply exclude axioms involving anonymous individuals from diffs, which isn't very satisfactory, or else try to come up with something more clever.
Here is how OWLAPI parses that ontology:
Prefix(:=<https://metadata.example.org/2025/test#>)
Prefix(owl:=<http://www.w3.org/2002/07/owl#>)
Prefix(rdf:=<http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
Prefix(xml:=<http://www.w3.org/XML/1998/namespace>)
Prefix(xsd:=<http://www.w3.org/2001/XMLSchema#>)
Prefix(foaf:=<http://xmlns.com/foaf/0.1/>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)
Prefix(dcterms:=<http://purl.org/dc/terms/>)
Ontology(<https://metadata.example.org/2025/test>
Annotation(dcterms:creator _:genid2147483648)
Annotation(rdfs:comment "An example of a problem.")
Declaration(AnnotationProperty(dcterms:creator))
Declaration(AnnotationProperty(foaf:name))
AnnotationAssertion(foaf:name _:genid2147483648 "Fred Bloggs")
)
Hi @balhoff,
As you might have already guessed, the example was made up: something simple that demonstrates the problem. For reference, the real file is here:
While thinking about this, one (perhaps obvious) idea was to try to make the algorithm for generating the blank nodes' IRIs more deterministic. If the IRI for a blank node were the same (across some change to the ontology that leaves the blank node unmodified) then
One possible way to be more deterministic might be to take all predicate-object pairs for axioms with the blank node as the subject, sort them, hash the result and use this hash to generate the blank node's IRI.
Naturally, if a blank node were to have an axiom with a blank node as the object, then generating the "parent" blank node's IRI would need to be deferred until the "child" blank node's IRI was generated. Since such blank nodes can't be referenced, they should form a simple graph, and the IRI generation should work following a simple depth-first algorithm.
There is the possibility of collisions: two blank nodes with the same set of axioms. This would need to be checked and accounted for (e.g., breaking the symmetry using document order, and adding a counter as a suffix to the IRI). That said, I guess most blank nodes will contain a unique set of axioms, so this outcome is unlikely.
While not perfect, it would be relatively simple and (I think) it would allow
That said, the output if a blank node has changed would be rather sub-optimal: the derived IRI would change, so
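A minimal sketch of the hashing idea described above, written in Python rather than against OWLAPI or ROBOT. It assumes triples are plain `(subject, predicate, object)` string tuples and that blank nodes are the terms starting with `_:`; the function names `canonical_labels` and `relabel` are hypothetical and exist only for this illustration.

```python
import hashlib
from collections import defaultdict


def is_blank(term):
    # Assumption: blank nodes are written "_:label"; literals are quoted strings.
    return term.startswith("_:")


def canonical_labels(triples):
    """Map each blank-node label to a label derived from its content."""
    by_subject = defaultdict(list)
    order = []  # blank nodes in document order of first appearance
    for s, p, o in triples:
        if is_blank(s):
            by_subject[s].append((p, o))
        for term in (s, o):
            if is_blank(term) and term not in order:
                order.append(term)

    hashes = {}
    visiting = set()

    def content_hash(node):
        # Depth-first: a "child" blank node is hashed before its "parent".
        if node in hashes:
            return hashes[node]
        if node in visiting:
            raise ValueError("cycle between blank nodes; this sketch cannot handle it")
        visiting.add(node)
        pairs = sorted(
            (p, content_hash(o) if is_blank(o) else o)
            for p, o in by_subject[node]
        )
        visiting.remove(node)
        hashes[node] = hashlib.sha256(repr(pairs).encode("utf-8")).hexdigest()[:16]
        return hashes[node]

    # Break collisions by document order, adding a counter as a suffix.
    seen = defaultdict(int)
    labels = {}
    for node in order:
        h = content_hash(node)
        labels[node] = f"_:h{h}" if seen[h] == 0 else f"_:h{h}-{seen[h]}"
        seen[h] += 1
    return labels


def relabel(triples):
    """Return the triples with blank nodes replaced by content-derived labels."""
    labels = canonical_labels(triples)
    return [tuple(labels.get(t, t) for t in triple) for triple in triples]
```

With this sketch, two parses of the same file that differ only in their generated `_:genid…` labels relabel to identical triples, so a set difference between them is empty. A blank node whose content changes still receives a new label, so every triple that mentions it shows up as both removed and added.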
I'd like to use robot diff to get a summary of the changes in a GitHub pull request, or during development.
For the PR, (in essence) the script runs robot on the input files from HEAD and HEAD^ to generate two outputs, and then runs robot diff on these two outputs.
Unfortunately, there are blank nodes, which are assigned random IRIs by the robot diff command. These blank node IRIs are different between the two versions, leading to a large number of "false positives", where robot diff has identified changed assertions that don't reflect changes in the input.
This looks a lot like #1032.
As a primitive work-around, I can filter out these generated IDs using grep; e.g.,
However, using grep results in confusing output: the robot diff command lists the number of axioms that are present in one side that are missing from the other:
Unfortunately, this axiom count doesn't match the number of axioms that are listed.
Also, filtering the output would result in the output missing any real changes involving a blank node.
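One way to avoid the mismatched counts is to filter the diff output with a small script that also rewrites the summary lines. The following is only a sketch: it assumes the plain-text output format shown in this issue (a header of the form "N axioms in left ontology but not in right ontology:" followed by lines starting with "- " or "+ "), and the function name `filter_diff` is hypothetical.

```python
import re

# Assumed header format, taken from the robot diff output quoted in this issue.
HEADER = re.compile(
    r"^(\d+)( axioms? in (?:left|right) ontology but not in (?:left|right) ontology:)$"
)
# Assumed form of the generated blank-node identifiers.
GENID = re.compile(r"_:genid\d+")


def filter_diff(text):
    """Drop axiom lines that mention a generated blank node and fix the counts."""
    items = []      # plain lines (str) and sections (dict), in output order
    current = None  # the section currently being filled, if any
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = {"suffix": match.group(2), "axioms": []}
            items.append(current)
        elif current is not None and line[:2] in ("- ", "+ "):
            if not GENID.search(line):
                current["axioms"].append(line)
        else:
            items.append(line)

    rendered = []
    for item in items:
        if isinstance(item, dict):
            # Re-count the axioms that survived the filter.
            rendered.append(f"{len(item['axioms'])}{item['suffix']}")
            rendered.extend(item["axioms"])
        else:
            rendered.append(item)
    return "\n".join(rendered)
```

This only makes the counts consistent with the listed axioms. It has the same limitation as the grep approach: any real change involving a blank node is hidden.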
Somewhat ironically, I'm actually getting better results from converting the ontology to ttl and using the diff command.