Proposal: Check if resource has rdf:type #87

Open
edroege opened this issue Apr 1, 2014 · 12 comments

Comments

@edroege
Member

edroege commented Apr 1, 2014

The validator should check whether individuals (e.g. http://data.dm2e.eu/data/item/mpiwg/harriot/MPIWG_0HE26A22_00108 ) have an rdf:type (e.g. http://www.europeana.eu/schemas/edm/ProvidedCHO ).

This should be checked for all classes: CHO and Aggregation but also for contextual classes like Agents, Places, Timespans etc.

If no class is indicated, give a warning.
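A minimal sketch of the proposed check, assuming the graph is already available as plain (subject, predicate, object) tuples. The function name `untyped_subjects` and the example.org URIs are made up for illustration and are not part of the validator.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EDM = "http://www.europeana.eu/schemas/edm/"


def untyped_subjects(triples):
    """Return every subject that has no rdf:type statement of its own."""
    subjects = {s for s, _, _ in triples}
    typed = {s for s, p, _ in triples if p == RDF_TYPE}
    return sorted(subjects - typed)


# Hypothetical sample data: one typed CHO and one untyped contextual resource.
triples = [
    ("http://example.org/cho1", RDF_TYPE, EDM + "ProvidedCHO"),
    ("http://example.org/cho1", "http://purl.org/dc/elements/1.1/title", "A title"),
    ("http://example.org/agent1", "http://www.w3.org/2004/02/skos/core#prefLabel", "Someone"),
]

for subject in untyped_subjects(triples):
    print(f"WARNING: <{subject}> has no rdf:type")
```

The check is class-agnostic, so it covers CHOs, Aggregations and contextual resources alike.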

@ksdm2e

ksdm2e commented Apr 1, 2014

how is it possible to have no rdf:type? Could you give an example?

@kba
Member

kba commented Apr 1, 2014

<http://agg1> edm:aggregatedCHO <http://cho1>

implies that

  • http://agg1 is an ore:Aggregation
  • http://cho1 is an edm:ProvidedCHO

without making it explicit. What @edroege proposes means that the graph should contain those statements:

<http://agg1> rdf:type ore:Aggregation .
<http://cho1> rdf:type edm:ProvidedCHO .
<http://agg1> edm:aggregatedCHO <http://cho1> .

Many providers provide data such as:

<http://agg1> edm:isShownBy <http://webresource> .

without asserting anything about http://webresource or omitting the fact that it is an edm:WebResource.
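To make the "implied but not explicit" point concrete, here is a rough sketch that lists the rdf:type statements a graph implies but does not contain. It assumes prefixed names as plain strings and a tiny hand-written table (`SCHEMA`, a hypothetical name) holding only the domain and range mentioned in this thread; the domain of edm:isShownBy is left out because it is not stated here.

```python
RDF_TYPE = "rdf:type"

# property -> (domain class, range class); None means "not stated in this thread".
SCHEMA = {
    "edm:aggregatedCHO": ("ore:Aggregation", "edm:ProvidedCHO"),
    "edm:isShownBy": (None, "edm:WebResource"),
}


def implied_types(triples):
    """Return (resource, class) pairs that are implied by SCHEMA but not asserted."""
    explicit = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    implied = set()
    for s, p, o in triples:
        domain, range_ = SCHEMA.get(p, (None, None))
        if domain:
            implied.add((s, domain))
        if range_:
            implied.add((o, range_))
    return sorted(implied - explicit)


triples = [
    ("http://agg1", RDF_TYPE, "ore:Aggregation"),
    ("http://agg1", "edm:aggregatedCHO", "http://cho1"),
    ("http://agg1", "edm:isShownBy", "http://webresource"),
]

for resource, cls in implied_types(triples):
    print(f"<{resource}> is implicitly a {cls} but has no explicit rdf:type")
```

Here only http://agg1 is typed explicitly, so http://cho1 and http://webresource are reported.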

@edroege
Member Author

edroege commented Apr 1, 2014

@kba: Thanks for clarifying!

I think it has to be explicit. In the case of Web resources, Aggregations and CHOs we could derive this information afterwards, because edm:isShownBy has only a web resource as its range, and so on. But in cases where the range is broader, like edm:Agent (with subclasses foaf:Person and foaf:Organization), we cannot tell whether the class edm:Agent, foaf:Person or foaf:Organization was meant.

@d0rg0ld
Member

d0rg0ld commented Apr 1, 2014

@ksdm2e

<rdf:Description rdf:about="http://en.wikipedia.org/wiki/Oxford">
    <dc:title>Oxford</dc:title>
    <dc:coverage>Oxfordshire</dc:coverage>
    <dc:publisher>Wikipedia</dc:publisher>
    <region:population>10000</region:population>
    <region:principaltown rdf:resource="http://www.country-regions.fake/oxford"/>
</rdf:Description>

Valid RDF -> You have no idea what type "http://en.wikipedia.org/wiki/Oxford" has
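As an illustration only, the following sketch finds such untyped rdf:Description nodes with Python's standard XML parser. It is a simplification: it only looks for an rdf:type child element inside each rdf:Description, so a type asserted as an attribute or in a separate description of the same URI would be missed. The sample document and the example.org URIs are made up.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def untyped_descriptions(rdf_xml):
    """Return the rdf:about value of each rdf:Description lacking an rdf:type child."""
    root = ET.fromstring(rdf_xml)
    untyped = []
    for node in root.iter(f"{{{RDF_NS}}}Description"):
        if node.find(f"{{{RDF_NS}}}type") is None:
            untyped.append(node.get(f"{{{RDF_NS}}}about"))
    return untyped


# Hypothetical sample: one untyped node, one typed via rdf:type,
# and one typed node element whose element name is the class itself.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:edm="http://www.europeana.eu/schemas/edm/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Oxford">
    <dc:title>Oxford</dc:title>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/cho1">
    <rdf:type rdf:resource="http://www.europeana.eu/schemas/edm/ProvidedCHO"/>
  </rdf:Description>
  <edm:ProvidedCHO rdf:about="http://example.org/cho2"/>
</rdf:RDF>"""

print(untyped_descriptions(SAMPLE))
```

The third node uses the class as its element name, so it cannot be written without a type.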

@ksdm2e

ksdm2e commented Apr 1, 2014

Unfortunately, I don't get it. It seems to me to be a syntactical issue. When I look at the Turtle representation, the RDF/XML or even the HTML representation of the above-mentioned sample, there is always a clear declaration of the class, e.g.:

<http://data.dm2e.eu/data/item/bbaw/dta/16157>
      a       edm:ProvidedCHO ; ...

@d0rg0ld @edroege Where do you find such "orphaned", untyped triples?

@kba kba removed the dta label Apr 1, 2014
@kba
Member

kba commented Apr 1, 2014

@ksdm2e
I don't think that this issue is pertinent to DTA data, http://data.dm2e.eu/data/item/bbaw/dta/16157 is indeed not relevant for this issue. I'll change the URL to a fitting example. You are using the variant of RDF/XML that enforces an rdf:type statement by syntax. You might be susceptible to referring to things without asserting anything about them, though.

One place I do notice this issue is in MPIWG/Harriot: http://data.dm2e.eu/data/rdf/resourcemap/mpiwg/harriot/MPIWG_0HE26A22/20140306195409535?output=ttl. Resolving one random page: http://data.dm2e.eu/data/rdf/resourcemap/mpiwg/harriot/MPIWG_0HE26A22_00104/20140306195409535?output=ttl There are no rdf:type statements in there.

@kba kba added the harriot label Apr 1, 2014
@d0rg0ld
Member

d0rg0ld commented Apr 1, 2014

@kba yeah, it's because they use the XML <rdf:Description rdf:about= way of representing RDF/XML

@kba
Member

kba commented Apr 1, 2014

@d0rg0ld Actually, MPIWG delivers N-Triples, but the pitfalls are the same.

@d0rg0ld
Member

d0rg0ld commented Apr 1, 2014

@kba ok I was referring to an old sample I got months ago ...

@edroege
Member Author

edroege commented Apr 7, 2014

@ksdm2e I just took some example URIs to make my point clearer. Your data is fine - sorry if you thought that something was wrong. I was suggesting adding a warning or something similar to the validator, not correcting ingestions.

We will make it clearer in the next revision of the model specification that the mappings should contain classes. Can the validator check during the ingestion if e.g. the class edm:ProvidedCHO occurs as often as the property edm:aggregatedCHO? Or is this too complicated?
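The counting idea could look roughly like this, assuming prefixed names as plain strings and in-memory tuples; `cho_counts` is a hypothetical helper, not validator code. It counts distinct resources rather than raw statements so that repeated triples do not skew the comparison.

```python
def cho_counts(triples):
    """Return (resources typed edm:ProvidedCHO, resources used as edm:aggregatedCHO object)."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == "edm:ProvidedCHO"}
    aggregated = {o for s, p, o in triples if p == "edm:aggregatedCHO"}
    return len(typed), len(aggregated)


triples = [
    ("http://agg1", "edm:aggregatedCHO", "http://cho1"),
    ("http://agg2", "edm:aggregatedCHO", "http://cho2"),
    ("http://cho1", "rdf:type", "edm:ProvidedCHO"),
]

typed_count, aggregated_count = cho_counts(triples)
if typed_count != aggregated_count:
    print(f"WARNING: {typed_count} edm:ProvidedCHO vs {aggregated_count} edm:aggregatedCHO objects")
```

Equal counts do not prove that the same resources are involved, so this is only a coarse signal.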

@kba
Member

kba commented Apr 7, 2014

I will implement the wanted behavior in the validator: check that every object in a '?s edm:aggregatedCHO ?object' triple is an edm:ProvidedCHO (i.e. has that rdf:type) and that every object in a triple with a WebResource-related predicate (edm:hasView, edm:isShownAt, edm:isShownBy ...) is an edm:WebResource (i.e. has that rdf:type). Will notify once deployed.
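A sketch of what such a per-predicate check might look like. This is not the validator's actual code; the table `EXPECTED_OBJECT_TYPE` and the function name are assumptions, and the table lists only the predicates named in this comment, written with their EDM names edm:isShownAt and edm:isShownBy.

```python
# Predicate -> class its object is expected to be typed with (assumed table).
EXPECTED_OBJECT_TYPE = {
    "edm:aggregatedCHO": "edm:ProvidedCHO",
    "edm:hasView": "edm:WebResource",
    "edm:isShownAt": "edm:WebResource",
    "edm:isShownBy": "edm:WebResource",
}


def missing_object_types(triples):
    """Return (object, predicate, expected class) for each object lacking the expected rdf:type."""
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    problems = []
    for s, p, o in triples:
        expected = EXPECTED_OBJECT_TYPE.get(p)
        if expected and (o, expected) not in types:
            problems.append((o, p, expected))
    return problems


triples = [
    ("http://agg1", "edm:aggregatedCHO", "http://cho1"),
    ("http://cho1", "rdf:type", "edm:ProvidedCHO"),
    ("http://agg1", "edm:isShownBy", "http://webresource"),
]

for obj, predicate, expected in missing_object_types(triples):
    print(f"WARNING: <{obj}> is the object of {predicate} but is not typed {expected}")
```

Unlike a pure count comparison, this pinpoints the exact resource that lacks its type.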

kba added a commit to DM2E/dm2e-ontologies that referenced this issue Apr 7, 2014
@kba
Member

kba commented Apr 7, 2014

OK, the validator will now emit a WARNING for every subject in a file that has no 'rdf:type' statements. Deployed since build 'Mon Apr 7 23:12:14 CEST 2014', please re-download.
