W3C CSV on the web recommendations #984

Closed
timrobertson100 opened this issue Oct 14, 2024 · 5 comments

@timrobertson100

A good addition to datapackage.org might be a summary of the differences between its approach and W3C CSVW recommendations, along with any other relevant information to help guide decisions on which is best to adopt.

We are looking to evolve our CSV guidelines within the Darwin Core standard and are keen to align with an established framework for table schemas. We’re exploring the differences between the guidelines but would appreciate any thoughts on things like governance, future roadmap, maintenance and the state of tool development that might also be relevant to a decision. We presume others may be in a similar position, so we thought it worth suggesting that this be documented on the site.

People from Darwin Core have been involved in both activities; we provided a use case for CSVW, and @peterdesmet is an active maintainer of both Darwin Core and datapackage.

Thank you very much.

@sapetti9
Contributor

Thanks for the suggestion @timrobertson100. We are actually drafting an overview of the differences between CSVW and Data Package, and we will add it to the documentation very soon! We'll include the points you suggested on governance, future roadmap, maintenance and the state of tool development. I'll ping you here (and close this issue) once we have the overview up on the website. Great to know about your future plans for table schemas.

@sapetti9
Contributor

Hi @timrobertson100, @peterdesmet pushed a PR outlining the differences between DP and CSVW. You can preview the page here: https://csvw.datapackage-6gp.pages.dev/guides/data-package-csvw/

Here is some additional information on the roadmap and adoption:

Technical roadmap
Data Package (v2) was released on June 26, 2024 (this iteration was funded by the European Commission via NLnet). After this important update, the technical roadmap of the project is focusing on general maintenance. Feature-wise, the standard adopted a voting mechanism for the promotion of new additions, meaning that new features may be considered by the Working Group based on user demand. In general, one of the core values of the standard is stability: a new version may be released once every year or two.

The main focus of the project for the next few years will be on implementations, integrations, and extensions. As the standard already has a mature foundation in its core specifications and decent software implementations, it will focus on working with key data repositories like Zenodo or CKAN to adopt the standard natively, as well as on improving software implementations and adding new ones, especially visual ones such as Open Data Editor, a fully featured Data Package-based editor developed by the Open Knowledge Foundation. Facilitating Data Package extensions is another important direction for the project, as there are already a few very prominent ones like Camera Trap Data Package and Gapminder DDF. The project is going to simplify both the creation of extensions and their use by end users. The project will also work with different working groups to support the implementation of domain-specific extensions.

Perceived adoption
We are currently discussing with the InvenioRDM team to add a Data Package serializer to Zenodo and other generalist repositories powered by Invenio: inveniosoftware/invenio-rdm-records#1742 (comment)

Among the projects that adopted the Frictionless Data Package standard, here are some notable ones:

  1. BCO-DMO: https://blog.bco-dmo.org/2020/02/09/frictionless-data-pipelines-for-ocean-science
  2. data.world: https://frictionlessdata.io/blog/2017/04/11/dataworld/
  3. Our World in Data: https://github.com/owid/owid-datasets?tab=readme-ov-file#owid-dataset-collection
  4. My Society: https://frictionlessdata.io/blog/2022/09/20/mysociety-workflow/#how-we-re-handling-common-data-analysis-and-data-publishing-tasks
  5. Gapminder: https://open-numbers.github.io/ddf.html
  6. The French administration: https://frictionlessdata.io/blog/2020/05/22/etalab-case-study-schemas-data-gouv-fr/#what-s-a-schema
  7. Dryad: https://www.youtube.com/watch?v=IHVUjWGh2oY
    Note that Dryad does not use Data Package per se, but has adopted frictionless-py, which is based on Data Package rules, for data validation.

Hope this helps. Let us know if you have other questions and what your plans end up being regarding Table Schema.

@timrobertson100
Author

Wow, that was fast. Thank you very much.

We'll start digesting this. Under the adoption section, you are very welcome to list our repository software; perhaps linking to our news item.

@roll
Member

roll commented Oct 28, 2024

Thanks to @peterdesmet's huge effort, the CSVW comparison has just been published: https://datapackage.org/guides/csvw-data-package/ 🎉

@roll closed this as completed Oct 28, 2024
@roll added this to the v2.1 milestone Oct 28, 2024
@timrobertson100
Author

Thank you all very much for putting this together. It's impressive indeed.

@roll removed this from the v2.1 milestone Oct 28, 2024