Support rdfs:domain and rdfs:range in generated schema by import-rdfs #154

Open
jo-fra opened this issue Jan 7, 2025 · 9 comments
Labels
enhancement New feature or request

Comments

@jo-fra

jo-fra commented Jan 7, 2025

When generating a LinkML schema using schemauto import-rdfs the resulting LinkML schema does not incorporate the rdfs:domain and rdfs:range definitions.

E.g. the following excerpt from FOAF:

###  http://xmlns.com/foaf/0.1/knows
foaf:knows rdf:type owl:ObjectProperty ;
           rdfs:domain foaf:Person ;
           rdfs:range foaf:Person ;
           rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties)." ;
           rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
           rdfs:label "knows" .

###  http://xmlns.com/foaf/0.1/Person
foaf:Person rdf:type owl:Class ;
            rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> ,
                            foaf:Agent ;
            owl:disjointWith foaf:Project ;
            rdfs:comment "A person." ;
            rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
            rdfs:label "Person" .

results in the following LinkML schema:

slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows

classes:
  Person:
    comments:
    - A person.
    is_a: Agent
    class_uri: foaf:Person

I would expect the rdfs:domain and rdfs:range of the foaf:knows property to be incorporated like this:

slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows
    range: Person

classes:
  Person:
    comments:
    - A person.
    is_a: Agent
    class_uri: foaf:Person
    slots: 
    - knows
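The expected mapping can be sketched in a few lines of Python. This is only an illustration under assumptions: the schema is treated as a plain dict shaped like the YAML above, and `apply_domain_and_range` is a hypothetical helper, not a function from schema-automator or LinkML. It records rdfs:range as the slot's `range` and rdfs:domain as an entry in the domain class's `slots` list.

```python
# Hypothetical helper; the schema is a plain dict shaped like the YAML above.
def apply_domain_and_range(schema, slot_name, domain_class, range_class):
    """Record rdfs:range on the slot and rdfs:domain as class-level slot usage."""
    # rdfs:range becomes the slot's range.
    schema["slots"].setdefault(slot_name, {})["range"] = range_class
    # rdfs:domain becomes an entry in the domain class's slots list.
    class_slots = schema["classes"].setdefault(domain_class, {}).setdefault("slots", [])
    if slot_name not in class_slots:
        class_slots.append(slot_name)
    return schema


schema = {
    "slots": {"knows": {"slot_uri": "foaf:knows"}},
    "classes": {"Person": {"is_a": "Agent", "class_uri": "foaf:Person"}},
}
apply_domain_and_range(schema, "knows", domain_class="Person", range_class="Person")
```

After the call, the `knows` slot carries `range: Person` and the `Person` class lists `knows` under `slots`, which matches the expected output above.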
@jo-fra added the enhancement (New feature or request) label on Jan 7, 2025
@multimeric

I believe I've fixed this in #152. Can you please test that branch to let me know if it resolves your issue?

@jo-fra
Author

jo-fra commented Jan 24, 2025

@multimeric
I tested the latest commit (cfe7e15) of #152 and did not get the expected results. Looking at the diff, it appears that some changes in schema_automator/importers/rdfs_import_engine.py were reverted when merging from #151, e.g.:

cfe7e15#diff-b6464c40227100611b000caf1086c351935db40bcf0f78ca606ca68d94e5ca3fL37-L43:

<<<<<<< HEAD
    "domain_of": [HTTP_SDO.domainIncludes, SDO.domainIncludes, RDFS.domain],
    "range": [HTTP_SDO.rangeIncludes, SDO.rangeIncludes, RDFS.range],
=======
    "domain_of": [HTTP_SDO.domainIncludes, SDO.domainIncludes],
    "rangeIncludes": [HTTP_SDO.rangeIncludes, SDO.rangeIncludes],
>>>>>>> cleanup-deps

When I test with commit 65869ba, from before the merge, rdfs:domain and rdfs:range are incorporated as expected!

Was the revert of those changes unintended?
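To show why the two sides of the conflict behave differently, here is a minimal stdlib-only sketch. It is not the code in rdfs_import_engine.py: triples are modelled as plain string tuples, and `SLOT_PREDICATES` and `collect_slot_metadata` are hypothetical names that only loosely mirror the mapping in the diff.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) string tuples.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SDO = "https://schema.org/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Metadata key -> predicates that may supply it (loosely mirrors the HEAD side).
SLOT_PREDICATES = {
    "domain_of": [SDO + "domainIncludes", RDFS + "domain"],
    "range": [SDO + "rangeIncludes", RDFS + "range"],
}


def collect_slot_metadata(triples, slot_uri, predicates=SLOT_PREDICATES):
    """Return {key: [object URIs]} for every mapped predicate found on slot_uri."""
    found = {}
    for subject, predicate, obj in triples:
        if subject != slot_uri:
            continue
        for key, candidates in predicates.items():
            if predicate in candidates:
                found.setdefault(key, []).append(obj)
    return found


triples = [
    (FOAF + "knows", RDFS + "domain", FOAF + "Person"),
    (FOAF + "knows", RDFS + "range", FOAF + "Person"),
]

# With the RDFS predicates in the mapping, both statements are picked up.
with_rdfs = collect_slot_metadata(triples, FOAF + "knows")

# With only the schema.org predicates (the reverted side), nothing matches.
sdo_only = {
    key: [p for p in preds if not p.startswith(RDFS)]
    for key, preds in SLOT_PREDICATES.items()
}
without_rdfs = collect_slot_metadata(triples, FOAF + "knows", sdo_only)
```

In this sketch `without_rdfs` ends up empty for FOAF, since FOAF only uses rdfs:domain and rdfs:range, which is consistent with the behaviour I see on cfe7e15.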

Just two caveats:

  1. the generated schema has default_prefix: example but it is not defined in prefixes:

    prefixes:
      linkml: https://w3id.org/linkml/
      dc: http://purl.org/dc/elements/1.1/
      vs: http://www.w3.org/2003/06/sw-vocab-status/ns#
      owl: http://www.w3.org/2002/07/owl#
      wot: http://xmlns.com/wot/0.1/
      foaf: http://xmlns.com/foaf/0.1/
      rdfs: http://www.w3.org/2000/01/rdf-schema#
    default_prefix: example
  2. All datatype properties with rdfs:range rdfs:Literal are generated with range: Literal e.g.

    slots:
      jabberID:
        comments:
        - A jabber ID for something.
        slot_uri: foaf:jabberID
        range: Literal

    However Literal is unrecognized and I am getting this error when trying to run gen-python with this schema:

    gen-python foaf_schema.yaml
    ValueError: File "foaf_schema.yaml", line 21, col 12 slot: jabberID - unrecognized range (Literal)
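One possible workaround, sketched here as an assumption and not as what the importer does or should do: post-process the generated schema and replace `Literal` with a type that linkml:types defines, such as `string`. The schema is handled as a plain dict (as it would look after loading the YAML); `RANGE_FALLBACKS` and `patch_literal_ranges` are hypothetical names.

```python
# Hypothetical post-processing step; not part of schema-automator or LinkML.
RANGE_FALLBACKS = {"Literal": "string"}


def patch_literal_ranges(schema):
    """Return a copy of the schema dict with unrecognized ranges replaced."""
    patched = dict(schema)
    patched_slots = {}
    for name, slot in schema.get("slots", {}).items():
        new_slot = dict(slot)
        # Only touch ranges we have an explicit fallback for.
        if new_slot.get("range") in RANGE_FALLBACKS:
            new_slot["range"] = RANGE_FALLBACKS[new_slot["range"]]
        patched_slots[name] = new_slot
    patched["slots"] = patched_slots
    return patched


schema = {
    "slots": {
        "jabberID": {"slot_uri": "foaf:jabberID", "range": "Literal"},
        "knows": {"slot_uri": "foaf:knows", "range": "Person"},
    }
}
patched = patch_literal_ranges(schema)
```

Here `jabberID` gets `range: string` while `knows` keeps `range: Person`, and the input dict is left unchanged.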

@multimeric

Thanks for the report. It probably was just a faulty merge. I'll likely fix it early next week.

@multimeric

Okay, I've rebased and hopefully fixed the underlying issue.

@jo-fra
Author

jo-fra commented Jan 31, 2025

@multimeric Thanks, I tried it with the latest commit and it now includes rdfs:domain and rdfs:range.

Only these two issues still persist:

  1. the generated schema has default_prefix: example but it is not defined in prefixes:

    prefixes:
      linkml: https://w3id.org/linkml/
      dc: http://purl.org/dc/elements/1.1/
      vs: http://www.w3.org/2003/06/sw-vocab-status/ns#
      owl: http://www.w3.org/2002/07/owl#
      wot: http://xmlns.com/wot/0.1/
      foaf: http://xmlns.com/foaf/0.1/
      rdfs: http://www.w3.org/2000/01/rdf-schema#
    default_prefix: example
    
  2. All datatype properties with rdfs:range rdfs:Literal are generated with range: Literal e.g.

    slots:
      jabberID:
        comments:
        - A jabber ID for something.
        slot_uri: foaf:jabberID
        range: Literal
    

    However Literal is unrecognized and I am getting this error when trying to run gen-python with this schema:

    gen-python foaf_schema.yaml
    ValueError: File "foaf_schema.yaml", line 21, col 12 slot: jabberID - unrecognized range (Literal)
    

@multimeric

Hmm, I can't replicate this Literal issue. If I run schemauto import-rdfs on the following Turtle file:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

foaf:knows rdf:type owl:ObjectProperty ;
           rdfs:domain foaf:Person ;
           rdfs:range foaf:Person ;
           rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties)." ;
           rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
           rdfs:label "knows" .

foaf:Person rdf:type owl:Class ;
            rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> ,
                            foaf:Agent ;
            owl:disjointWith foaf:Project ;
            rdfs:comment "A person." ;
            rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> ;
            rdfs:label "Person" .

I get:

name: example
id: http://example.org/example
imports:
- linkml:types
prefixes:
  linkml: https://w3id.org/linkml/
  foaf: http://xmlns.com/foaf/0.1/
default_prefix: example
default_range: string
slots:
  knows:
    comments:
    - A person known by this person (indicating some level of reciprocated interaction
      between the parties).
    slot_uri: foaf:knows
    range: Person
classes:
  Agent:
    class_uri: foaf:Agent
  SpatialThing:
    class_uri: ns1:SpatialThing
  Person:
    comments:
    - A person.
    is_a: Agent
    slots:
    - knows
    class_uri: foaf:Person

@multimeric

You're right that the default prefix is messed up, and I think I need some input from the maintainers on what to do about that, but to be honest you should always pass in a name and model_uri. The schema won't make much sense otherwise. So for foaf you would do something like:

poetry run schemauto import-rdfs --format xml http://xmlns.com/foaf/spec/index.rdf --schema-name foaf --model-uri http://xmlns.com/foaf/0.1/

@sierra-moxon
Member

@multimeric @jo-fra - agree the default 'example' is confusing and we may just want to get rid of that in the automated step. But agree with @multimeric that passing in an actual value here is a great standard practice. We tend to think of schema-automator as a bootstrapping tool, that users will interact with to get them most of the way towards a working schema, but that they will have to edit to add finishing touches.
Schema-automator is getting so much better with these fixes; thank you!

@multimeric

Okay I've just pushed a new change. Firstly, it removes the custom example default in the RDFS importer in favour of letting the SchemaBuilder handle it. Secondly, it tries to infer the schema metadata from RDF. Basically if the name is not provided explicitly, the most common prefix it finds becomes the name. If the id is not explicitly provided, then the corresponding URI becomes the ID. So for FOAF it would determine that foaf is used a ton in the document and therefore schema.name = "foaf" and schema.id = http://xmlns.com/foaf/0.1/.
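Roughly, the inference works like the sketch below. This is a simplified stdlib illustration and not the code that was pushed: `infer_schema_metadata` is a hypothetical name, and URIs and prefixes are passed in as plain strings and a dict.

```python
from collections import Counter


def infer_schema_metadata(uris, prefixes, name=None, schema_id=None):
    """Use the most-used prefix as the name and its namespace as the id, unless given."""
    counts = Counter()
    for uri in uris:
        for prefix, namespace in prefixes.items():
            if uri.startswith(namespace):
                counts[prefix] += 1
                break
    # Explicit values always win; inference only fills the gaps.
    if name is None and counts:
        name = counts.most_common(1)[0][0]
    if schema_id is None and name in prefixes:
        schema_id = prefixes[name]
    return name, schema_id


prefixes = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
uris = [
    "http://xmlns.com/foaf/0.1/knows",
    "http://xmlns.com/foaf/0.1/Person",
    "http://xmlns.com/foaf/0.1/Agent",
    "http://www.w3.org/2000/01/rdf-schema#label",
]
name, schema_id = infer_schema_metadata(uris, prefixes)
```

With these inputs, `name` is `foaf` and `schema_id` is `http://xmlns.com/foaf/0.1/`. Passing `name` or `schema_id` explicitly bypasses the inference for that field.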
