Hierarchical (or simply linked/inherited) metadata #9

Open
MichaelClerx opened this issue Jun 5, 2023 · 2 comments

Comments

@MichaelClerx

Hi! Thanks for the cool site. A few years ago, at a standards-unification conference for biological data, we looked at CSV on the Web as something we should all adopt. A lot of our data came from repeated experiments, so we started talking about a hierarchical version, i.e. you would:

  1. Create a directory (on a disk, at a URL, in a zip file, etc.) with a metadata file indicating that it is a special "resource"
  2. Have a metadata file for this directory (e.g. saying "lab = ...") and further metadata files for subdirectories ("experiment type = ...", "cell_type = ...", "temperature = ..."), so that each subdirectory can either add to or overwrite the parent directory's metadata fields
  3. Finally have the CSV metadata, which "inherits" all the metadata from the subdirectory it's stored in
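
The three steps above can be sketched in Python. This is only an illustration of the idea, not anything CSVW defines: the per-directory file name `metadata.json`, the function name `inherited_metadata`, and the shallow key-by-key merge are all assumptions. The only borrowed convention is the `-metadata.json` suffix for the CSV's own metadata file.

```python
import json
from pathlib import Path

# Hypothetical convention: each directory may hold a "metadata.json" file.
META_NAME = "metadata.json"


def inherited_metadata(csv_path, root):
    """Merge metadata from `root` down to the directory holding `csv_path`.

    Deeper directories add to or overwrite fields set by their parents
    (a shallow, key-by-key merge; an assumption, not a CSVW rule).
    """
    csv_path = Path(csv_path).resolve()
    root = Path(root).resolve()

    # Directories from the root down to the CSV's own directory.
    dirs = [root]
    for part in csv_path.parent.relative_to(root).parts:
        dirs.append(dirs[-1] / part)

    merged = {}
    for directory in dirs:
        meta_file = directory / META_NAME
        if meta_file.is_file():
            merged.update(json.loads(meta_file.read_text(encoding="utf-8")))

    # Finally, the CSV's own metadata file overrides everything above it.
    own = csv_path.with_name(csv_path.name + "-metadata.json")
    if own.is_file():
        merged.update(json.loads(own.read_text(encoding="utf-8")))
    return merged
```

With `lab` set at the root and `temperature` set in both the root and a subdirectory, the subdirectory's `temperature` wins while `lab` is carried through unchanged.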

Do you know if there have been any efforts like this? Or some other mechanism to achieve similar goals? (I.e. a field in the json that says "please also include all of the stuff at this URI")?

Thanks in advance, sorry for abusing the issue system.

MichaelClerx changed the title from "Hierarchical metadata" to "Hierarchical (or simply linked/inherited) metadata" on Jun 6, 2023
@MichaelClerx
Author

> Or some other mechanism to achieve similar goals? (I.e. a field in the json that says "please also include all of the stuff at this URI")?

To make it a more general question: is there any mechanism to import metadata from another document? (So not necessarily a tree structure.)

@Robsteranium
Contributor

Hello!

The spec defines object properties, whose value can be either an object or a URL reference to a document where the object definition may be found. This allows you to re-use metadata across tables, for example:

experiment-1.csv.json:

{
  "url": "experiment-1.csv",
  "tableSchema": "experiment-schema.json"
}

experiment-2.csv.json:

{
  "url": "experiment-2.csv",
  "tableSchema": "experiment-schema.json"
}

Or equivalently as a table group:

experiments.json:

{
  "tableSchema": "experiment-schema.json",
  "tables": [
    { "url": "experiment-1.csv" },
    { "url": "experiment-2.csv" }
  ]
}
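
To illustrate what a consumer does with such a reference, here is a rough Python sketch that loads a metadata file and inlines any `tableSchema` given as a string. It assumes the reference is a relative path to a local file next to the metadata document; a real CSVW processor resolves it as a URL against the document's base URL. The function name `load_with_schema` is made up for this example.

```python
import json
from pathlib import Path


def load_with_schema(metadata_path):
    """Load a metadata file and inline any `tableSchema` given as a reference."""
    metadata_path = Path(metadata_path)
    doc = json.loads(metadata_path.read_text(encoding="utf-8"))

    def inline(node):
        ref = node.get("tableSchema")
        if isinstance(ref, str):
            # Assumption: the reference is a local path relative to the
            # metadata file, not a remote URL.
            schema_file = metadata_path.parent / ref
            node["tableSchema"] = json.loads(
                schema_file.read_text(encoding="utf-8")
            )

    inline(doc)  # schema set on the table or table group itself
    for table in doc.get("tables", []):
        inline(table)  # schema set on an individual table in a group
    return doc
```

Calling `load_with_schema("experiments.json")` on the table group above would return the same document with `tableSchema` replaced by the contents of `experiment-schema.json`.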

The spec doesn't define how (or whether) metadata should be merged when the user provides overriding metadata. There are, however, inherited properties. These let you set column-specification defaults at e.g. the table group level and override them with values for a specific table.
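
As a rough sketch of how inherited properties behave, the lookup below returns a property from the most specific description that sets it, falling back from column to schema to table to table group. `null`, `lang` and `datatype` are inherited properties in the CSVW vocabulary; the helper name `inherited_value` and the example values are assumptions for illustration.

```python
def inherited_value(name, column, schema, table, group=None, default=None):
    """Return `name` from the most specific description that sets it."""
    for description in (column, schema, table, group):
        if description is not None and name in description:
            return description[name]
    return default


# Hypothetical descriptions, most general first.
group = {"null": "NA", "lang": "en"}
table = {"url": "experiment-2.csv", "null": "-"}
schema = {"columns": [{"name": "temperature", "datatype": "decimal"}]}
column = schema["columns"][0]

print(inherited_value("null", column, schema, table, group))  # -
print(inherited_value("lang", column, schema, table, group))  # en
```

Here the table overrides the group's `null` marker, while `lang` is not set any lower down and so comes from the group.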

Judging by your examples, you may be thinking about your own metadata properties and not ones from the CSVW metadata vocabulary. In that case, you might be able to use provisions from the JSON-LD spec to achieve what you want. Just bear in mind that, despite the syntactic overlap, CSVW processors only need to support a subset of JSON-LD; crucially, this means the context is fixed. You may need bespoke processing if you go down this route.

You might also think about a pre-processing tool that generates CSVW annotations. This is what we did with Swirrl/table2qb: we use a registry of columns to generate a table schema suitable for a given CSV table. This lets us avoid repetition in the "source of truth" and still have spec-compliant outputs.
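
To give a flavour of the pre-processing idea, here is a minimal Python sketch. It is not how table2qb is implemented; the registry contents, the `generate_metadata` name, and the choice to fail on unknown columns are all assumptions made for this example.

```python
import csv
import io

# Hypothetical registry: one column description per known CSV header title.
COLUMN_REGISTRY = {
    "time": {"name": "time", "titles": "time", "datatype": "decimal"},
    "temperature": {
        "name": "temperature",
        "titles": "temperature",
        "datatype": "decimal",
    },
    "cell_type": {"name": "cell_type", "titles": "cell_type", "datatype": "string"},
}


def generate_metadata(csv_url, csv_text, registry=COLUMN_REGISTRY):
    """Build table metadata for a CSV by looking up its header in the registry."""
    header = next(csv.reader(io.StringIO(csv_text)))
    unknown = [title for title in header if title not in registry]
    if unknown:
        raise KeyError(f"columns missing from registry: {unknown}")
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        # Copy each entry so callers cannot mutate the registry by accident.
        "tableSchema": {"columns": [dict(registry[title]) for title in header]},
    }
```

Each column is described once in the registry, and the per-table metadata is regenerated from the CSV header whenever needed, so the generated files never have to be edited by hand.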

FWIW, I don't think this is really an abuse of the issues system. If you come up with a CSVW solution, it'd be great to hear about it on this thread and maybe even see it become a guide for csvw.org so others can learn from the example.
