add section about 'unstar' mapping #115

pchampin · 2024-11-13T13:59:31Z

as per w3c/rdf-star-wg#129

Preview with the examples working
(as opposed to the automatic preview below)

Preview | Diff


afs · 2024-11-13T14:49:40Z

Is this proposed "as well as" the graph-to-graph algorithm or "instead of"?

w3c/rdf-star-wg#114 (comment)

niklasl · 2024-11-13T14:51:35Z

This approach requires graphs containing triple terms to be represented as datasets. That excludes cases where you need to put "unstarred" RDF 1.2 graphs into an RDF 1.1-based quad store and manage them within specific named graphs. Implementations supporting the default graph union mechanism would also treat the "triple term graphs" as asserted in that union graph.

niklasl · 2024-11-13T14:53:47Z

It has been mentioned (or opined) that the "star" name would eventually go away (it would be just RDF 1.2 with triple terms). If so, perhaps "unstar" is an unfortunate name for future reference?

rat10 · 2024-11-13T23:55:28Z

There seems to be an issue with the examples section: all examples say "Cannot GET /uploads/dcqFS6/spec/ex-unstar-output.trig".

gkellogg · 2024-11-14T00:55:08Z

There seems to be an issue with the examples section: all examples say "Cannot GET /uploads/dcqFS6/spec/ex-unstar-output.trig".

It's the general PR-preview issue of not being able to retrieve neighboring resources. They are fleshed out if you look at the GitHack version.

pchampin · 2024-11-14T15:34:11Z

Is this proposed "as well as" the graph-to-graph algorithm or "instead of"?

I personally don't think that we should have multiple such mappings, and I am more and more convinced that the graph-to-graph approach makes more sense.

The reasons I stuck to my initial graph-to-dataset approach in this PR are that

I am still not clear about the details of the graph-to-graph approach, and
I wanted to write down the design goals of the mapping, so that we can discuss them (and inform the responses to the previous point).

rat10 · 2024-11-14T16:32:13Z

What is the "graph-to-graph" algorithm? A mapping based on the RDF standard reification vocabulary?

niklasl · 2024-11-14T18:21:43Z

What is the "graph-to-graph" algorithm? A mapping based on the RDF standard reification vocabulary?

Or something isomorphic to it but using dedicated terms. There's an example of that in this recent wiki page (with links to w3c/rdf-semantics#49 and w3c/rdf-star-wg#114).

afs · 2024-11-22T15:28:20Z

Semantic Task Force 2024-11-22

We are looking at the "graph" flavor of unstar.

afs · 2024-11-22T15:41:47Z

Design goals: (content from the PR)

Information preserving
It must be possible to reconstruct the input dataset from the output dataset.
Note that, on the other hand, the algorithm is not designed to be semantics preserving:
the graphs in the produced dataset are not semantically equivalent to their corresponding graph in the input dataset.
Idempotent
Transforming a dataset that is already complying with RDF Classic (i.e. containing no triple term) must result in the same dataset.
Universal
It should be possible to transform any RDF Full dataset using this method.

- unstarring a graph now produces a graph (not a dataset) - it uses the reification vocabulary (with a distinctive type rdf:UnstarredTripleTerm)

pchampin · 2024-11-29T14:28:46Z

I just updated the PR; the new algorithm is "graph-to-graph", repurposing the reification vocabulary.
Note that I coined rdf:UnstarredTripleTerm to type the generated blank nodes.

Note that I deliberately chose a very specific name for this, to distinguish it from the type we will probably introduce as the class of all triple terms, for example, to describe the range of rdf:reifies (say, rdf:Triple). Indeed, I believe that there will be valid use-cases for that type (rdf:Triple) in Full RDF graphs, while rdf:UnstarredTripleTerm should really be considered as "reserved" and not to be used outside of the "unstar" algorithm.
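
For readers following along, here is a minimal Python sketch of the graph-to-graph idea described above. It is not the algorithm text from the PR. It assumes a toy data model in which a graph is a set of (subject, predicate, object) tuples, terms are plain strings, and a triple term is a nested tuple in object position. The function name `unstar` and the `_:genN` labels are illustrative only, and the sketch assumes those labels do not clash with blank nodes already in the input.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def unstar(graph):
    """Replace each triple term with a typed blank node (illustrative sketch)."""
    out = set()
    bnode_for = {}      # triple term -> blank node, so equal terms share one node
    fresh = count(1)    # assumption: "_:genN" labels are unused in the input

    def encode(term):
        if not isinstance(term, tuple):
            return term  # IRIs, literals and blank nodes pass through unchanged
        if term not in bnode_for:
            b = f"_:gen{next(fresh)}"
            bnode_for[term] = b
            s, p, o = term
            out.update({
                (b, RDF + "type", RDF + "UnstarredTripleTerm"),
                (b, RDF + "subject", s),
                (b, RDF + "predicate", p),
                (b, RDF + "object", encode(o)),  # handles nested triple terms
            })
        return bnode_for[term]

    for s, p, o in graph:
        out.add((s, p, encode(o)))
    return out

example = {("_:r", RDF + "reifies", (":s", ":p", ":o"))}
result = unstar(example)
```

In this sketch a graph without triple terms comes back unchanged, which is the "idempotent" design goal in miniature.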

afs · 2024-11-29T17:12:16Z

This is a suggestion related to presentation only.

RDF-concepts defines RDF and is also a readable document.

The algorithms of unstar/restar are good for defining the translation but algorithms do not communicate the broad intent so well.

Maybe: have all the written description, discussion and examples, then have the algorithms.

Put the overview "The general principle" at the beginning of section 8
Pull the examples into section 8.
Either push all algorithms to later sections within section 8 or put the algorithms as normative appendixes.

rat10 · 2024-12-05T15:38:54Z

There seems to be an issue with the examples section: all examples say "Cannot GET /uploads/dcqFS6/spec/ex-unstar-output.trig".

It's the general PR-preview issue of not being able to retrieve neighboring resources. They are fleshed out if you look at the GitHack version.

@gkellogg How did you create that link? I'm trying to read the new PR but again am unable to read the included examples.

gkellogg · 2024-12-05T15:45:27Z

There seems to be an issue with the examples section: all examples say "Cannot GET /uploads/dcqFS6/spec/ex-unstar-output.trig".

It's the general PR-preview issue of not being able to retrieve neighboring resources. They are fleshed out if you look at the GitHack version.

@gkellogg How did you create that link? I'm trying to read the new PR but again am unable to read the included examples.

Select spec/index.html in the branch you want to see and enter it in http://raw.githack.com/. It gives you a link to the rendered version.
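
As a rough illustration of that step, the sketch below builds such a link by hand. It assumes the rendered link has the form `https://raw.githack.com/<owner>/<repo>/<branch>/<path>`; the helper name and the owner, repository and branch values are placeholders, not the ones used for this PR.

```python
def githack_url(owner, repo, branch, path):
    """Build a raw.githack.com link for a file on a GitHub branch.

    Assumption: the service mirrors the owner/repo/branch/path layout
    of raw GitHub URLs.
    """
    return f"https://raw.githack.com/{owner}/{repo}/{branch}/{path}"

# Placeholder values, for illustration only.
link = githack_url("example-org", "example-repo", "some-branch", "spec/index.html")
```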

niklasl · 2024-12-05T16:23:07Z

I think this approach is good.

The use of a dedicated type for unstarred triple terms seems prudent. If used, an rdf:UnstarredTripleTerm rdfs:subClassOf rdf:TripleTerm axiom should be defined (possibly where w3c/rdf-semantics#49 is defined, if it will be). This since the unstarred form is useful as input to e.g. OWL reasoners without full RDF 1.2 support, but users of that should only rely on the rdf:TripleTerm type, not this special subclass. This is because when such tooling is updated to RDF 1.2, the type of triple terms will just be rdf:TripleTerm, and any OWL-based axioms should still work.

Whether the constituent triple term predicates should be reused from the reification vocabulary or not has been debated some (see e.g. w3c/rdf-semantics#49 (comment)). It depends on whether or not it makes sense to make rdf:TripleTerm a subclass of rdf:Statement.

I think the name rdf:TripleTerm has been used most recently, but it has perhaps not been finalized (I am somewhat in favor of rdf:Triple if there is room for debate).

But these details, about naming and which constituent triple term predicates to use, can probably be dealt with separately, to avoid blocking this PR.

The question about whether to use name "unstar" at all remains (as "RDF-star" is not mentioned as such in RDF 1.2; only in reference to the RDF-star WG).

rat10 · 2024-12-05T16:56:42Z

I just updated the PR; the new algorithm is "graph-to-graph", repurposing the reification vocabulary. Note that I coined rdf:UnstarredTripleTerm to type the generated blank nodes.

Note that I deliberately chose a very specific name for this, to distinguish it from the type we will probably introduce as the class of all triple terms, for example, to describe the range of rdf:reifies (say, rdf:Triple). Indeed, I believe that there will be valid use-cases for that type (rdf:Triple) in Full RDF graphs, while rdf:UnstarredTripleTerm should really be considered as "reserved" and not to be used outside of the "unstar" algorithm.

To me it seems like the discussions in the Semantics TF and in Github issues, e.g. rdf-semantics issue #49 and rdf-star-wg issue #130, moved away from re-using the RDF reification vocabulary. The reason is a bit intricate: the RDF standard reification describes an occurrence/instance/reification of a triple. The triple term describes a triple and only the reification, indicated by rdf:reifies, creates a reference to the instance/occurrence/reification. That means that in the following example _:r and _:s are semantically equivalent, but the triple term and the reification quad are not.

_:r rdf:reifies <<( :s :p :o )>> .
_:s a rdf:Statement ;
    rdf:subject :s ;
    rdf:predicate :p ;
    rdf:object :o .

Given this difference in meaning I think it's more prudent to not reuse the properties from the reification vocabulary but to mint new ones, like e.g. rdf:tripleTermSubject, etc.

W.r.t. other wordings:

how about rdf:unTripleTerm and rdf:reTripleTerm?
I favor rdf:TripleTerm over rdf:Triple to refer to a triple term, since the latter could be easily misunderstood to refer to regular, asserted RDF triples.

pchampin · 2024-12-06T09:42:19Z

Regarding the name of the algorithm,

I guess we could go for "classicize" (as it converts to RDF "classic").
"flatten" would also seem appropriate, but could create confusion with the algorithm of the same name in JSON-LD? The context (no pun intended) is quite different so I'm not convinced this would be a real issue.

pchampin · 2024-12-06T10:00:37Z

Regarding the vocabulary,

I think that duplicating the properties rdf:subject, rdf:predicate, rdf:object could also create a lot of confusion, so I would refrain from doing that unless repurposing them really breaks something badly. But since rdf:Statement is so loosely defined, I don't think that would be the case. I am happy to consider that, in retrospect, rdf:Statement can include both the "platonic triples" denoted by triple terms and the "occurrences" denoted by reifiers.

rat10 · 2024-12-06T12:50:14Z

Regarding the vocabulary,

I think that duplicating the properties rdf:subject, rdf:predicate, rdf:object could also create a lot of confusion, so I would refrain from doing that unless repurposing them really breaks something badly. But since rdf:Statement is so loosely defined, I don't think that would be the case. I am happy to consider that, in retrospect, rdf:Statement can include both the "platonic triples" denoted by triple terms and the "occurrences" denoted by reifiers.

The WG seems to tend towards allowing the usage of triple terms in a more general way than just as a source of reifiers. In that context it is probably prudent to differentiate the two concepts, and not mix and mingle them. Also, I don't see how rdf:Statement is loosely defined. The naming may be a bit vague, but the definition IMO is not.

rat10 · 2024-12-10T12:04:55Z

Regarding the vocabulary,
I think that duplicating the properties rdf:subject, rdf:predicate, rdf:object could also create a lot of confusion, so I would refrain from doing that unless repurposing them really breaks something badly. But since rdf:Statement is so loosely defined, I don't think that would be the case. I am happy to consider that, in retrospect, rdf:Statement can include both the "platonic triples" denoted by triple terms and the "occurrences" denoted by reifiers.

The WG seems to tend towards allowing the usage of triple terms in a more general way than just as a source of reifiers. In that context it is probably prudent to differentiate the two concepts, and not mix and mingle them. Also, I don't see how rdf:Statement is loosely defined. The naming may be a bit vague, but the definition IMO is not.

On second (or rather n-th) thought I'd like to add something.

Following recent discussions in the Semantics TF (as captured here and discussed there) we might settle for calling rdf:Statement the act of stating a triple (without of course saying if that stating actually happened, and where), and calling rdf:Proposition the abstract triple as described by an RDF-star triple term. Both are composed of subject, predicate and object, so I again tend to agree with you.

However, doesn't this jeopardize backwards compatibility? So far it's possible to infer that an entity is of type rdf:Statement if it's the subject of an rdf:subject|predicate|object statement. From anecdotal evidence I gather that it's common practice to omit the type declaration and reduce triple count by one when using the RDF standard reification vocabulary. That practice becomes unsound as soon as unstar-ed triple terms enter the mix.

To work around this problem, we could stress that any application of the unstar mapping should refrain from applying the same optimization. Also, an unassuming RDF 1.1 environment would not be led into a completely wrong direction if it assumed that the immediate subject of an unstar operation - _:gen1 in your proposed Example 3 - represents an RDF standard reification. It doesn't since it would miss multi-part reifications, but maybe it's close enough.

On the other hand, backwards compatibility is what the unstar mapping is all about, so why jeopardize it?
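
To make the concern concrete, here is a small Python sketch of the rdfs:domain inference mentioned above. It assumes the same toy model of triples as plain tuples of strings, uses only the fact that RDF Schema gives rdf:subject, rdf:predicate and rdf:object the domain rdf:Statement, and is in no way a full RDFS reasoner. The function and variable names are made up for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TYPE = RDF + "type"
STATEMENT = RDF + "Statement"

# rdfs:domain of the three reification properties, as declared in RDF Schema.
DOMAIN = {RDF + name: STATEMENT for name in ("subject", "predicate", "object")}

def infer_domain_types(graph):
    """Add (s, rdf:type, D) for every triple whose predicate has domain D."""
    return set(graph) | {(s, TYPE, DOMAIN[p]) for s, p, o in graph if p in DOMAIN}

# Old-style reification with the type triple omitted, as in the practice above.
old_style = {
    ("_:x", RDF + "subject", ":s"),
    ("_:x", RDF + "predicate", ":p"),
    ("_:x", RDF + "object", ":o"),
}
# A blank node as the unstar mapping (reification-vocabulary version) would emit it.
unstarred = {
    ("_:gen1", TYPE, RDF + "UnstarredTripleTerm"),
    ("_:gen1", RDF + "subject", ":s"),
    ("_:gen1", RDF + "predicate", ":p"),
    ("_:gen1", RDF + "object", ":o"),
}
inferred = infer_domain_types(old_style | unstarred)
```

After inference both `_:x` and `_:gen1` are typed rdf:Statement, so a consumer that does not know about rdf:UnstarredTripleTerm cannot tell them apart.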

pchampin · 2024-12-10T17:15:10Z

To work around this problem, we could stress that any application of the unstar mapping should refrain from applying the same optimization.

Applying the unstar mapping means following the algorithm. If you don't follow it exactly, in particular if you don't add the triple (b, rdf:type, rdf:UnstarredTripleTerm), then it is not the unstar mapping anymore, and the properties listed at the top of Section 8 ("information preserving", "idempotent", "universal") can no longer be guaranteed.
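
A minimal Python sketch of why the type triple matters for the reverse direction, assuming the toy model of triples as tuples of strings with triple terms as nested tuples. The function `restar` is a hypothetical illustration, not the PR's algorithm: it rebuilds a triple term only from blank nodes explicitly typed rdf:UnstarredTripleTerm, and it assumes every such node carries all three of rdf:subject, rdf:predicate and rdf:object.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TYPE, MARK = RDF + "type", RDF + "UnstarredTripleTerm"
SUBJ, PRED, OBJ = RDF + "subject", RDF + "predicate", RDF + "object"

def restar(graph):
    """Rebuild triple terms from nodes typed rdf:UnstarredTripleTerm (sketch)."""
    marked = {s for s, p, o in graph if p == TYPE and o == MARK}
    parts = {b: {} for b in marked}
    for s, p, o in graph:
        if s in marked and p in (SUBJ, PRED, OBJ):
            parts[s][p] = o

    def decode(term):
        if term in marked:
            d = parts[term]  # assumption: all three parts are present
            return (d[SUBJ], d[PRED], decode(d[OBJ]))
        return term

    # Drop the descriptions of marked nodes; decode objects everywhere else.
    return {(s, p, decode(o)) for s, p, o in graph if s not in marked}

encoded = {
    ("_:r", RDF + "reifies", "_:gen1"),
    ("_:gen1", TYPE, MARK),
    ("_:gen1", SUBJ, ":s"),
    ("_:gen1", PRED, ":p"),
    ("_:gen1", OBJ, ":o"),
}
round_trip = restar(encoded)
# Without the type triple nothing is recognised, so nothing is rebuilt.
without_type = encoded - {("_:gen1", TYPE, MARK)}
not_restored = restar(without_type)
```

With the type triple present, the triple term is recovered. With it removed, the graph is returned as-is and the original can no longer be reconstructed.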

However, doesn't this jeopardize backwards compatibility?

I am not suggesting changing the domain of rdf:subject and co. That would indeed jeopardize backwards compatibility.

So yes, it would mean that any bnode generated by the unstar mapping could be inferred to be of type rdf:Statement, but I don't think this is a problem. The important thing is that it is also of type rdf:UnstarredTripleTerm, which is what should matter in that situation.

Regarding the notion of rdf:Proposition discussed by the Semantics TF: I don't think that the rdf:UnstarredTripleTerm class (used to encode triple terms) needs to be related to rdf:Proposition (used to define the semantics of triple terms). But even if I did: according to WordReference, several definitions of the term (and all of those related to logic or mathematics) define it as "a statement ...". This makes "proposition" a subclass of "statement", which is fine by me, and allows us to use rdf:subject and friends to describe instances of rdf:Proposition.

rat10 · 2024-12-10T21:21:42Z

To work around this problem, we could stress that any application of the unstar mapping should refrain from applying the same optimization.

Applying the unstar mapping means following the algorithm. If you don't follow it exactly, in particular if you don't add the triple (b, rdf:type, rdf:UnstarredTripleTerm), then it is not the unstar mapping anymore, and the properties listed at the top of Section 8 ("information preserving", "idempotent", "universal") can no longer be guaranteed.

Fair enough.

However, doesn't this jeopardize backwards compatibility?

I am not suggesting changing the domain of rdf:subject and co. That would indeed jeopardize backwards compatibility.

So yes, it would mean that any bnode generated by the unstar mapping could be inferred to be of type rdf:Statement, but I don't think this is a problem. The important thing is that it is also of type rdf:UnstarredTripleTerm, which is what should matter in that situation.

I discussed the "unassuming RDF 1.1 environment", and that is what matters for backwards compatibility. In such an environment, rdf:subject etc. would be assumed to refer to an RDF standard reification, which is exactly what we do not want. Such an environment would probably not even be aware of the existence of the type rdf:UnstarredTripleTerm, let alone check for it.

Regarding the notion of rdf:Proposition discussed by the Semantics TF: I don't think that the rdf:UnstarredTripleTerm class (used to encode triple terms) needs to be related to rdf:Proposition (used to define the semantics of triple terms). But even if I did: according to WordReference, several definitions of the term (and all of those related to logic or mathematics) define it as "a statement ...". This makes "proposition" a subclass of "statement", which is fine by me, and allows us to use rdf:subject and friends to describe instances of rdf:Proposition.

I disagree. No matter what the term "statement" means, the term rdf:Statement has a very specific meaning, and an RDF-star triple term is not a specialization of it, but arguably rather the other way round.

pchampin · 2024-12-11T06:58:54Z

No matter what the term "statement" means, the term rdf:Statement has a very specific meaning, and an RDF-star triple term is not a specialization of it,

I stand corrected; re-reading the related sections in RDF Semantics, it says "The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object" (emphasis is mine). I will change my PR accordingly.

also, the algo 'quote-triple-term' was renamed, because it was not actually "quoting" the triple.

@afs

following @afs, the two transformations ('classicize' and revert) are now primarily described in prose, the algorithm being secondary. We now also describe the reverse transformation.

rat10 · 2024-12-12T13:08:29Z

W.r.t. vocabulary I’m still not convinced: defining a new classicize namespace creates its own issues. I find this more confusing than defining specialized properties like rdf:tripleTermSubject, etc. Also, doesn't this introduce not only 3 new predicates, but also a classicize:TripleTerm in addition to an rdf:TripleTerm?


TallTed · 2024-12-20T18:33:52Z

spec/index.html

+    <dd>It should be possible to transform any [=Full=] graph (resp. dataset) to a [=Classic=] graph (resp. dataset) using this method.
+      There is actually <a href="#section-classicize-caveat">a minor caveat</a> to this property.


Suggested change

<dd>It should be possible to transform any [=Full=] graph (resp. dataset) to a [=Classic=] graph (resp. dataset) using this method.

There is actually <a href="#section-classicize-caveat">a minor caveat</a> to this property.

<dd>It should be possible (with <a href="#section-classicize-caveat">a minor caveat</a>) to transform any [=Full=] graph (resp. dataset) to a [=Classic=] graph (resp. dataset) using this method.

I'd rather not merge these two sentences: the first one describes the design goal (in the abstract). The second one is about the proposed solution.


TallTed

I've reviewed spec/index.html up to the Algorithms. I'll come back for that. There are a number of requested changes above. I didn't think I'd find so many, nor that I'd have this much time to do so, or I'd have bundled them into a review... Sorry for the extra clicks these will take to apply!

rat10 · 2025-01-09T15:53:34Z

@pchampin

Questions w.r.t. naming (the properties, basic vs classic) IMHO simply arose because the PR strays from agreed upon terminology and should be rolled back until properly discussed (e.g. by raising an issue about basic vs classic).

As I explained above, the current version of the spec, on which this PR is based, defined "Full conformance" and "Classic conformance", so the PR sticks to this and refers to "Classic conformance". Using "Basic" in section 8 while section 2 says "Classic" would have been inconsistent and confusing.

Note that I personally don't have a strong preference between "basic" and "classic". It is not that I refuse to make the change, but I don't think that this PR should "casually" change a normatively defined term, when its purpose is somewhere else.

I understand - although I differ w.r.t. "casually" and "normatively". However, I'd rather say that the mistake should not be made worse by repeating it but rather pointed out where it was made. Also, we had the discussion a year ago and I don't want to repeat it unnecessarily, but I do indeed like the pair "Basic" and "Full" much more than "Classic" and "Full". On an unrelated note I meanwhile would argue to not have that distinction at all, but it's probably too late for that.

rat10 · 2025-01-09T15:59:15Z

@pchampin

My comment w.r.t. the most important issue - how triple terms relate to RDF standard reification - hasn't been properly discussed. So: no, I'm still not okay with merging this PR.

I just added a note about why we introduce another vocabulary rather than reuse the old reification vocabulary.

Thank you, that addresses my respective concern.

For more general discussion about the relation between triple terms and old-style reification, as I responded earlier: I agree that it is needed, but this should be in rdf-semantics and/or rdf-primer. And anyway, this is orthogonal to this PR: the relationship between triple terms and old-style reification is about triple terms in general, not just triple terms being "classicized".

So, do you propose to open a new issue on that topic?

niklasl · 2025-01-09T16:46:48Z

@rat10

So, do you propose to open a new issue on that topic?

We have w3c/rdf-semantics#61 .

rat10 · 2025-01-09T17:02:26Z

@niklasl

@rat10

So, do you propose to open a new issue on that topic?

We have w3c/rdf-semantics#61 .

Okay, I wasn't aware. The tag says 'editorial'. What does that mean?

Co-authored-by: Ted Thibodeau Jr

this makes Respec unhappy, because 'appears in' is currently not defined, but including the definition will be for another PR. This keeps this PR more "localized".

add section about 'unstar' mapping

8e07ca3

as per w3c/rdf-star-wg#129

pchampin added 2 commits November 29, 2024 15:11

change the unstar algorithm so that

d42fabc

- unstarring a graph now produces a graph (not a dataset) - it uses the reification vocabulary (with a distinctive type rdf:UnstarredTripleTerm)

add note on the (absence of) interference with old style reification

f5d2deb

pchampin added 3 commits December 11, 2024 08:51

rename the 'unstar' algo to 'classicize'

ec44491

also, the algo 'quote-triple-term' was renamed, because it was not actually "quoting" the triple.

use a dedicated vocab rather than repurposing the reification vocab

10065ac

prioritize prose over algo, and describe revert transformation

91bff0f

following @afs, the two transformations ('classicize' and revert) are now primarily described in prose, the algorithm being secondary. We now also describe the reverse transformation.

TallTed reviewed Dec 20, 2024
