Use persistent identifiers #131

Open
jschaeff opened this issue Feb 1, 2022 · 6 comments

@jschaeff
Contributor

jschaeff commented Feb 1, 2022

The way agencies are referenced in the stationXML is cumbersome and implies a lot of repetition.
Today we would use a unique identifier to reference them, such as the ROR (I see that FDSN already shows the ROR of the institutes on the network detail pages).

The suggestion is to find a way to reference an agency by its ROR.

What about:

<Agency>
    <Identifier type="ROR">035a68863</Identifier>
</Agency>

And make all the rest optional?

@crotwell
Collaborator

crotwell commented Feb 1, 2022

Do you have a link for the definition of ROR?

Currently Agency is just text, like: <Agency>USGS</Agency>. Can you explain what you mean by cumbersome? I'm not sure I understand. Did you mean contact info?

Adding the ROR identifier seems like a good idea, similar to how DOI is used in Network. The current use of text would have to change, since the inner text and the new element would be mixed together, i.e. this kind of looks bad:

<Agency>
    <Identifier type="ROR">035a68863</Identifier>
    USGS
</Agency>
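
To illustrate why the mixed form is awkward to consume, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The helper name `agency_name` is hypothetical, and the `<Identifier>` child inside `<Agency>` is only the proposal from this thread, not part of the current schema. With mixed content, the agency name no longer sits in the element's own text; it lands in the `tail` of the child element:

```python
import xml.etree.ElementTree as ET

# Mixed-content form from the example above (a proposal, not current stationXML).
MIXED = """<Agency>
    <Identifier type="ROR">035a68863</Identifier>
    USGS
</Agency>"""


def agency_name(agency: ET.Element) -> str:
    """Collect the agency's own text, skipping text inside child elements."""
    parts = [agency.text or ""]
    # Text that follows a child element is stored on that child's .tail
    parts.extend(child.tail or "" for child in agency)
    return "".join(parts).strip()


agency = ET.fromstring(MIXED)
identifier = agency.find("Identifier")

print(repr(agency.text))      # whitespace only, not the agency name
print(repr(identifier.tail))  # the agency name ends up here
print(agency_name(agency))
```

A reader that only looks at `agency.text`, as it can today with `<Agency>USGS</Agency>`, would silently get an empty name.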

@fabienengels

I would suggest a slightly different approach:

<Agency source="ROR" source_id="035a68863">USGS</Agency>

It's a common pattern found in various CS standards.
My two cents :)

@jschaeff
Contributor Author

jschaeff commented Feb 2, 2022

@crotwell about being cumbersome: yes, it's the contact info that seems too detailed. It involves a lot of repetition, and maintenance when information gets outdated. Contacts could also have their own persistent identifiers?

<Author pid="https://orcid.org/0000-0003-2125-060X"/>

Or also, in order to be even more explicit:

<Operator>
  <Agency pid="https://ror.org/035a68863">USGS</Agency>
  <Contact pid="https://orcid.org/0000-0003-2125-060X">
      <!-- Everything inside contact is made optional if pid already has the information 
             But it should still be possible to give all the details in the stationXML
      -->
  </Contact>
</Operator>

@fabienengels do you know why the form source="ror" source_id="123456" is better than using a full URL like href=https://ror.org/035a68863, or a prefixed form like pid=ror:123456, pid=orcid:123456 or pid=doi:123456?

Using a full PID or PID URL could also be an option to replace the <Identifier type="DOI"/> markup.
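
Whichever spelling is chosen, the three forms carry the same information and can be converted into one another. The following Python sketch shows this under stated assumptions: the function name `to_pid_url` and the `RESOLVERS` table are hypothetical, none of these attributes exist in the schema today, and the only outside facts used are the public resolver prefixes for ROR, ORCID and DOI.

```python
from typing import Optional

# Resolver prefixes for the identifier schemes mentioned in this thread.
RESOLVERS = {
    "ror": "https://ror.org/",
    "orcid": "https://orcid.org/",
    "doi": "https://doi.org/",
}


def to_pid_url(value: str, source: Optional[str] = None) -> str:
    """Return a full PID URL for any of the three spellings discussed here.

    - full URL:            to_pid_url("https://ror.org/035a68863")
    - prefixed identifier: to_pid_url("ror:035a68863")
    - source + source_id:  to_pid_url("035a68863", source="ROR")
    """
    if value.startswith(("http://", "https://")):
        return value  # already a full URL
    if source is None:
        # Prefixed form: split "scheme:identifier" at the first colon.
        source, sep, value = value.partition(":")
        if not sep:
            raise ValueError("identifier has no scheme prefix")
    prefix = RESOLVERS.get(source.lower())
    if prefix is None:
        raise ValueError(f"unknown identifier scheme: {source}")
    return prefix + value


print(to_pid_url("035a68863", source="ROR"))
print(to_pid_url("ror:035a68863"))
print(to_pid_url("orcid:0000-0003-2125-060X"))
```

The full URL form needs no lookup table on the reading side, while the two shorter forms require every reader to know the scheme names.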

jschaeff changed the title from "Modernize Agency references" to "Use persistent identifiers" on Feb 2, 2022
@fabienengels

fabienengels commented Feb 2, 2022

It avoids multiplying tags and, in the present case, mixing text with tags as in the example given by @crotwell.

XML doesn't forbid it, but it's not considered best practice, as it can make the XML more difficult to read and some parsers can fail to parse it (even if that's quite uncommon these days).

Another idea behind it, I think, is that the identifier is not an object in itself but rather metadata, an attribute of the Agency object.

Most of the standards I know store the identifier as an attribute:

<!-- html -->
<div id="foo"></div>

<!-- quakeml -->
<magnitude publicID="smi:franceseisme.fr/magnitude/1292911">

<!-- SVG -->
<rect id="smallRect" x="10" y="10" width="100" height="100" />

Edit: fixed the SVG example.
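
As a rough sketch of how the attribute form reads from code, assuming the hypothetical `source` and `source_id` attributes suggested earlier in this thread (they are not in the schema) and a made-up helper `read_agency`, Python's standard `xml.etree.ElementTree` keeps the name in the element text and the identifier in the attributes:

```python
import xml.etree.ElementTree as ET


def read_agency(xml_text: str) -> dict:
    """Read an Agency element whose identifier is stored as attributes."""
    agency = ET.fromstring(xml_text)
    return {
        "name": (agency.text or "").strip(),
        # .get() returns None when the attribute is absent
        "source": agency.get("source"),
        "source_id": agency.get("source_id"),
    }


record = read_agency('<Agency source="ROR" source_id="035a68863">USGS</Agency>')
legacy = read_agency("<Agency>USGS</Agency>")

print(record)
print(legacy)
```

The same function handles today's plain `<Agency>USGS</Agency>`, so existing documents would keep parsing unchanged.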

@fabienengels

@jschaeff to be clear, my point was about storing the identifier as an attribute instead of a tag :)

I have no issue with using a URL as the identifier :)

@jschaeff
Contributor Author

Also, RDA proposes PIDs for instruments: https://datascience.codata.org/articles/10.5334/dsj-2020-018/

It's still very fresh, but we could provide the ability in stationXML to specify a PID for instruments.
