For discussion - add identifier scheme for pids #299
Comments
I don't believe we need to have such a field. At HZB, I use the convention to always add a scheme prefix, separated by a colon, to the identifier itself:

>>> query = Query(client, "DataPublication",
...               conditions={"pid": "= 'DOI:10.5442/ND000006'"},
...               includes=["fundingReferences.funding", "relatedItems", "users.affiliations"])
>>> client.assertedSearch(query)[0]
(dataPublication){
# …
description = "…"
fundingReferences[] =
(dataPublicationFunding){
# …
funding =
(fundingReference){
# …
awardNumber = "ExNet-0042-Phase-2-3"
funderIdentifier = "Crossref Funder ID:10.13039/501100001656"
funderName = "Helmholtz Association"
}
},
(dataPublicationFunding){
# …
funding =
(fundingReference){
# …
awardNumber = ":unas"
funderName = "Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS)"
}
},
(dataPublicationFunding){
# …
funding =
(fundingReference){
# …
awardNumber = "0324247"
funderIdentifier = "Crossref Funder ID:10.13039/501100006360"
funderName = "Federal Ministry for Economic Affairs and Energy"
}
},
pid = "DOI:10.5442/ND000006"
publicationDate = 2021-06-28 00:00:00+02:00
relatedItems[] =
(relatedItem){
# …
fullReference = "Cariou, Romain et al. III–V-on-silicon solar cells reaching 33% photoconversion efficiency in two-terminal configuration. Nat Energy 3, 326–333 (2018). https://doi.org/10.1038/s41560-018-0125-0"
identifier = "DOI:10.1038/s41560-018-0125-0"
relatedItemType = "JournalArticle"
relationType = "Cites"
title = "III–V-on-silicon solar cells reaching 33% photoconversion efficiency in two-terminal configuration"
},
(relatedItem){
# …
fullReference = "Bläsi, Benedikt et al. Photonic structures for III-V//Si multijunction solar cells with efficiency >33%. Proc. SPIE 10688, Photonics for Solar Energy Systems VII, 1068803 (2018). https://doi.org/10.1117/12.2307831"
identifier = "DOI:10.1117/12.2307831"
relatedItemType = "JournalArticle"
relationType = "Cites"
title = "Photonic structures for III-V//Si multijunction solar cells with efficiency >33%"
},
(relatedItem){
# …
fullReference = "Tillmann, Peter et al (2021): Optimizing metal grating back reflectors for III-V-on-silicon multijunction solar cells. Optics Express. https://doi.org/10.1364/OE.426761"
identifier = "DOI:10.1364/OE.426761"
relatedItemType = "JournalArticle"
relationType = "IsSupplementTo"
title = "Optimizing metal grating back reflectors for III-V-on-silicon multijunction solar cells"
},
(relatedItem){
# …
fullReference = "Tillmann, Peter et al (2021): Optimizing metal grating back reflectors for III-V-on-silicon multijunction solar cells. Zenodo. https://doi.org/10.5281/zenodo.5013230"
identifier = "DOI:10.5281/zenodo.5013230"
relatedItemType = "Software"
relationType = "IsReferencedBy"
title = "Optimizing metal grating back reflectors for III-V-on-silicon multijunction solar cells"
},
subject = "multi-junction solar cell; optical simulations; finite element method; light trapping; light management; nanotextures; metal grating"
title = "Optimizing metal grating back reflectors for III-V-on-silicon multijunction solar cells"
users[] =
# …
(dataPublicationUser){
# …
affiliations[] =
(affiliation){
# …
fullReference = "JCMwave GmbH, Bolivarallee 22, 14050 Berlin"
name = "01: JCMwave"
},
(affiliation){
# …
fullReference = "Computational Nano Optics, Zuse Institute Berlin, Takustraße 7, 14195 Berlin"
name = "02: ZIB"
pid = "ROR:02eva5865"
},
contributorType = "Creator"
familyName = "Hammerschmidt"
fullName = "Hammerschmidt, Martin"
givenName = "Martin"
orderKey = "004"
},
(dataPublicationUser){
# …
affiliations[] =
(affiliation){
# …
fullReference = "Optics for Solar Energy, Helmholtz-Zentrum Berlin für Materialien und Energie, Albert-Einstein-Straße 16, 12489 Berlin"
name = "01: HZB"
pid = "ROR:02aj13c28"
},
(affiliation){
# …
fullReference = "Computational Nano Optics, Zuse Institute Berlin, Takustraße 7, 14195 Berlin"
name = "02: ZIB"
pid = "ROR:02eva5865"
},
contributorType = "Creator"
familyName = "Tillmann"
fullName = "Tillmann, Peter"
givenName = "Peter"
orderKey = "001"
},
(dataPublicationUser){
# …
affiliations[] =
(affiliation){
# …
fullReference = "Fraunhofer Institute for Solar Energy Systems ISE, Heidenhofstr. 2, 79110 Freiburg, Germany"
name = "01: Fraunhofer ISE"
pid = "ROR:02kfzvh91"
},
contributorType = "Creator"
familyName = "Bläsi"
fullName = "Bläsi, Benedikt"
givenName = "Benedikt"
orderKey = "002"
},
(dataPublicationUser){
# …
affiliations[] =
(affiliation){
# …
fullReference = "JCMwave GmbH, Bolivarallee 22, 14050 Berlin"
name = "01: JCMwave"
},
(affiliation){
# …
fullReference = "Computational Nano Optics, Zuse Institute Berlin, Takustraße 7, 14195 Berlin"
name = "02: ZIB"
pid = "ROR:02eva5865"
},
contributorType = "Creator"
familyName = "Burger"
fullName = "Burger, Sven"
givenName = "Sven"
orderKey = "003"
},
}

As you can see, I have multiple different types of PIDs in the data: DOIs, Crossref Funder IDs, and RORs in this case. Note that the Crossref Funder IDs are actually DOIs, but still handled separately. The script that generates the landing pages has a helper class to deal with that:

class PID:
    """Generalization of a persistent identifier.
    """

    # Maps each known scheme prefix to the base URI used to resolve it.
    SchemeURI = {
        "DOI": "https://doi.org/",
        "arXiv": "https://arxiv.org/abs/",
        "ORCID": "https://orcid.org/",
        "ROR": "https://ror.org/",
        "Crossref Funder ID": "https://doi.org/",
        "PaNET": "http://purl.org/pan-science/PaNET/",
        "URL": "",
    }

    def __init__(self, identifier, scheme=None):
        # Unless the scheme is overridden, this code assumes the
        # identifier to be scheme and id separated by a colon and that
        # the scheme part does not contain a colon.
        if scheme:
            self._type, self._id = scheme, identifier
        else:
            self._type, self._id = identifier.split(':', maxsplit=1)
        if self._type not in self.SchemeURI:
            raise ValueError("%s: unknown identifier type" % identifier)

    @property
    def identifierType(self):
        return self._type

    @property
    def identifier(self):
        return self._id

    @property
    def schemeURI(self):
        # The "URL" scheme has an empty base URI; report that as None.
        return self.SchemeURI[self._type] or None

    @property
    def url(self):
        return self.SchemeURI[self._type] + self._id

This helper is able to deal properly with all different types and cases:

>>> p = PID("Crossref Funder ID:10.13039/501100001656")
>>> p.identifierType
'Crossref Funder ID'
>>> p.identifier
'10.13039/501100001656'
>>> p.schemeURI
'https://doi.org/'
>>> p.url
'https://doi.org/10.13039/501100001656'
>>> p = PID("DOI:10.5442/ND000006")
>>> p.identifierType
'DOI'
>>> p.identifier
'10.5442/ND000006'
>>> p.schemeURI
'https://doi.org/'
>>> p.url
'https://doi.org/10.5442/ND000006'
>>> p = PID("ROR:02eva5865")
>>> p.identifierType
'ROR'
>>> p.identifier
'02eva5865'
>>> p.schemeURI
'https://ror.org/'
>>> p.url
'https://ror.org/02eva5865'

E.g. the snippet for adding the relatedIdentifiers element to the DataCite metadata:

if self.relatedItems:
    relatedIds = etree.SubElement(datacite, "relatedIdentifiers")
    for r in self.relatedItems:
        pid = PID(r['identifier'])
        rId = etree.SubElement(relatedIds, "relatedIdentifier")
        rId.set("relatedIdentifierType", pid.identifierType)
        rId.set("relationType", r['relationType'])
        rId.text = pid.identifier

It works the same for any PID type.
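
A self-contained sketch of the same idea, for readers who want to try it outside the landing page script. It assumes the standard library's xml.etree.ElementTree in place of whatever `etree` the script imports, and it uses a hypothetical `split_pid` function as a trimmed-down stand-in for the PID class. The two input items are taken from the query result above.

```python
import xml.etree.ElementTree as ET


def split_pid(value):
    """Hypothetical stand-in for the PID helper: split on the first colon only."""
    scheme, sep, identifier = value.partition(":")
    if not sep:
        raise ValueError("%s: missing scheme prefix" % value)
    return scheme, identifier


def related_identifiers(related_items):
    """Build a relatedIdentifiers element from scheme-prefixed identifiers."""
    root = ET.Element("relatedIdentifiers")
    for item in related_items:
        scheme, identifier = split_pid(item["identifier"])
        r_id = ET.SubElement(root, "relatedIdentifier")
        # The scheme prefix becomes the type attribute, the rest the text.
        r_id.set("relatedIdentifierType", scheme)
        r_id.set("relationType", item["relationType"])
        r_id.text = identifier
    return root


items = [
    {"identifier": "DOI:10.1364/OE.426761", "relationType": "IsSupplementTo"},
    {"identifier": "DOI:10.5281/zenodo.5013230", "relationType": "IsReferencedBy"},
]
print(ET.tostring(related_identifiers(items), encoding="unicode"))
```

Nothing in the loop depends on the scheme, so an identifier with a different prefix would pass through the same code path unchanged.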
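We now have `pid` fields for the relevant entities. These `pid`s may be from different schemes; for example, for affiliations we may have ROR or ISNI identifiers. If facilities rely on more than one scheme, it would be useful to include a field for the `pid_scheme` being used.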
s may be from different schemes - for example, for affiliations we may have ROR or ISNI identifiers. If facilities rely on more than one scheme, it would be useful to include a field for the ``pid_scheme``` being used.The text was updated successfully, but these errors were encountered: