Harvest attribution roles #3149

Open
wants to merge 28 commits into
base: master
from

maudetes
Contributor

@maudetes maudetes commented Sep 18, 2024

Part of datagouv/data.gouv.fr#1588

  • contact_points is now a list of contacts
    • migration included
  • a contact point has a role (contact, publisher, etc.)
  • harvest dct:publisher, dct:creators, dct:contributors.
  • only expose one dct:publisher, so if there's a contact point with the role publisher, the owner should be prov:qualifiedAttribution instead
  • expose dct:publisher, dct:creators, dct:contributors in RDF (with correct properties depending on role)
  • do we want a dedicated attribution field instead of contact point? -> not for now
  • make udata-front resilient with contact points list
  • update and add tests
  • play around to add a suggest API on these contacts
  • add a changelog
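
As a rough illustration of the "correct properties depending on role" item above, here is a minimal sketch of a role-to-predicate lookup. The mapping name, the helper, and the fallback to `dcat:contactPoint` are assumptions made for the example and are not taken from the udata code; the predicates are written as plain CURIE strings, using the singular DCMI terms `dct:creator` and `dct:contributor`.

```python
# Hypothetical mapping from a contact point role to the RDF predicate
# used when exposing it. Illustrative only, not udata's actual code.
ROLE_TO_PREDICATE = {
    "contact": "dcat:contactPoint",
    "publisher": "dct:publisher",
    "creator": "dct:creator",
    "contributor": "dct:contributor",
}


def predicate_for_role(role: str) -> str:
    """Return the predicate CURIE for a role.

    Falling back to dcat:contactPoint for unknown roles is an assumption
    of this sketch.
    """
    return ROLE_TO_PREDICATE.get(role, "dcat:contactPoint")
```

Keeping the mapping in one table means harvesting and RDF export could share it, so a new role would only need one new entry.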

Support FOAF.name for dct:publisher, dct:creators, dct:contributors
@maudetes maudetes marked this pull request as draft September 18, 2024 13:47
maudetes added a commit to datagouv/udata-front that referenced this pull request Sep 18, 2024
@maudetes maudetes changed the title Harvest responsible roles in extras Harvest attribution roles Dec 10, 2024
@magopian magopian force-pushed the feat/add-roles-in-harvest-extras branch from 609150c to e200393 Compare December 16, 2024 15:39
…services when generating the fixtures

Also auto-set a role on the contact point if there is none
- if email or contact_form present: `contact`
- else `creator`
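
The default-role rule in this commit message can be sketched as follows. The `ContactPoint` dataclass and the `ensure_role` helper are hypothetical stand-ins for the example; the real model lives in `udata/core/contact_point/models.py` and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContactPoint:
    # Hypothetical, simplified stand-in for the real contact point model.
    name: str
    email: Optional[str] = None
    contact_form: Optional[str] = None
    role: Optional[str] = None


def ensure_role(contact_point: ContactPoint) -> ContactPoint:
    """Set a default role only when none is present.

    A contact point with an email or a contact form becomes "contact";
    any other contact point becomes "creator".
    """
    if contact_point.role is None:
        reachable = bool(contact_point.email or contact_point.contact_form)
        contact_point.role = "contact" if reachable else "creator"
    return contact_point
```

An existing role is left untouched, so the rule only fills in missing values.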
Contributor Author

@maudetes maudetes left a comment

Thank you for all the update work and added tests, fixtures & co!

udata/harvest/tests/dcat/evian.json (outdated)
udata/commands/fixtures.py
udata/commands/fixtures.py (outdated)
udata/commands/fixtures.py (outdated)
udata/commands/fixtures.py (outdated)
udata/core/contact_point/models.py (outdated)
if contact_point.role == "publisher"
]:
# There's already a publisher, so the owner should instead be a qualified attribution.
owner_role = PROV.qualified_attribution
Contributor Author

qualifiedAttribution points towards an Attribution and not directly an agent I think.
See maybe this section in W3C DCAT as well as the DCAT-AP one and finally the GeoDCAT-AP one.

I am not 100% clear on the best predicate to use for an organization if we have a dedicated publisher contact point.

Contributor

Yeah, having gone through the links you provided, here's a non-exhaustive list of possible roles:

What do you think? Did I miss any? Do you have a favorite?

Contributor Author

Let's go for geodcat:distributor for now? Since it's the organization distributing the data on data.gouv.fr, I would say it's the closest role.
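
To make the proposal concrete, here is a minimal sketch of how the owner's predicate could be chosen. The `ContactPoint` dataclass and the `owner_predicate` helper are hypothetical, and predicates are plain CURIE strings rather than namespace objects.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ContactPoint:
    # Hypothetical, simplified stand-in for the real contact point model.
    name: str
    role: str


def owner_predicate(contact_points: Iterable[ContactPoint]) -> str:
    """Pick the predicate linking a dataset to its owner.

    Only one dct:publisher should be exposed, so when a contact point
    already has the "publisher" role, the owner is exposed as
    geodcat:distributor instead.
    """
    has_publisher = any(cp.role == "publisher" for cp in contact_points)
    return "geodcat:distributor" if has_publisher else "dct:publisher"
```

With no publisher contact point, the owner stays the single `dct:publisher`.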

udata/core/dataset/rdf.py (outdated)
udata/tests/test_rdf.py (outdated)
udata/tests/test_rdf.py (outdated)
@magopian
Contributor

Thank you very much @maudetes for the detailed review. I believe I've addressed all your concerns, either with changes or with comments. Would you mind having another look to check that my changes are in line with what you had in mind, and letting me know what you think about the points that are still to be decided?

@magopian magopian marked this pull request as ready for review January 13, 2025 15:05
Contributor Author

@maudetes maudetes left a comment

👏

The CI is failing for now, which prevents me from deploying on our test environment :)

udata/rdf.py (outdated)
udata/rdf.py (outdated)
@@ -43,7 +43,11 @@ def dataservice_from_rdf(
dataservice.base_api_url = url_from_rdf(d, DCAT.endpointURL)
dataservice.endpoint_description_url = url_from_rdf(d, DCAT.endpointDescription)

dataservice.contact_point = contact_point_from_rdf(d, dataservice) or dataservice.contact_point
# TODO: what are the type of contact points supported on dataservices?
Contributor Author

I think we can go with this indeed :) Not sure we need the list(), I think a generator would do?
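
For context on the `list()` question, here is a small sketch of the general Python trade-off. The `harvested_contacts` function is hypothetical and does not mirror the code under review. A generator is enough when the consumer iterates exactly once, but it cannot be replayed; whether the model field accepts an arbitrary iterable would still need to be checked.

```python
from typing import Iterable, Iterator


def harvested_contacts(names: Iterable[str]) -> Iterator[dict]:
    # Hypothetical generator yielding one contact per harvested name.
    for name in names:
        yield {"name": name, "role": "contact"}


contacts = harvested_contacts(["Alice", "Bob"])

# The first iteration consumes the generator.
first_pass = [contact["name"] for contact in contacts]

# A second iteration yields nothing because the generator is exhausted.
second_pass = [contact["name"] for contact in contacts]

# Wrapping in list() materializes the items so they can be reused.
reusable = list(harvested_contacts(["Alice", "Bob"]))
```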

magopian added a commit to opendatateam/udata-fixtures that referenced this pull request Jan 14, 2025
magopian added a commit to opendatateam/udata-fixtures that referenced this pull request Jan 14, 2025
* `contact_point` is now renamed to `contact_points`, a list

* Also import "rpg" dataset fixture

* Update the fixtures without nesting the dataset in reuses, discussions, resources and dataservices

* Update the results.json with the latest changes in opendatateam/udata#3149
3 participants