Clarify D in req/core/identifier #182

Open
amilan17 opened this issue Feb 12, 2024 · 17 comments

@amilan17
Member

|D |The +id+ property shall include a local identifier as defined by the data publisher. The local identifier shall not have spaces or special or accented characters.

The question is what are "special" characters?

@amilan17
Member Author

@tomkralidis

@tomkralidis
Collaborator

Perhaps we can further qualify with:

  • no spaces
  • no accents
  • no colons (given they are URN separators)
  • none of the following: `~!@#$%^&*()=][{}|'";,.?/+

cc @josusky
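
A minimal sketch of how the proposal above could be checked in code. The punctuation set is copied from the list; the function name is hypothetical, and "no accents" is approximated here as "no non-ASCII characters", which is an assumption rather than anything agreed in this thread.

```python
# Characters excluded by the proposal above. The punctuation is copied from
# the last bullet; space and colon come from the first and third bullets.
FORBIDDEN = set("`~!@#$%^&*()=][{}|'\";,.?/+") | {" ", ":"}


def rejected_characters(local_id: str) -> list[str]:
    """Return the characters the proposal would reject, in order of first use."""
    seen: list[str] = []
    for ch in local_id:
        # Assumption: "no accents" is approximated as "no non-ASCII characters".
        if (ch in FORBIDDEN or not ch.isascii()) and ch not in seen:
            seen.append(ch)
    return seen


print(rejected_characters("surface-based-observations"))  # []
print(rejected_characters("m\u00e9t\u00e9o obs/2024"))     # ['é', ' ', '/']
print(rejected_characters("icon-eps.ALL"))                # ['.']
```

Note that under this reading a dot is rejected, so an identifier such as `icon-eps.ALL` would not pass.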

@josusky
Contributor

josusky commented Feb 13, 2024

This is too restrictive. The regular expression that you have provided is correct only for the "namespace identifier" (NID) part, but the NID is fixed to wmo in our case. The rest of the URN is the "Namespace Specific String" (NSS), and its validation is more permissive. The original description is in https://www.rfc-editor.org/rfc/rfc2141.html (section 2.2) and is slightly modified (extended) by a newer RFC (https://www.rfc-editor.org/rfc/rfc8141). An example of a valid URN is:
urn:example:a123,z456?+abc
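
To make the NID/NSS split concrete, here is a rough sketch based on one reading of the RFC 8141 grammar. It is deliberately simplified: only the optional r-component (`?+`) is handled, the q-component and f-component are omitted, and the names `PCHAR`, `URN_RE` and `split_urn` are hypothetical.

```python
import re

# pchar as used by RFC 8141: unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

URN_RE = re.compile(
    r"urn:(?P<nid>[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]):"
    rf"(?P<nss>{PCHAR}(?:{PCHAR}|/)*)"
    rf"(?:\?\+(?P<r>{PCHAR}(?:{PCHAR}|[/?])*))?",
    re.IGNORECASE,
)


def split_urn(urn: str):
    """Return (NID, NSS, r-component) or None if the string does not match."""
    m = URN_RE.fullmatch(urn)
    if m is None:
        return None
    return m.group("nid"), m.group("nss"), m.group("r")


print(split_urn("urn:example:a123,z456?+abc"))
# ('example', 'a123,z456', 'abc')
print(split_urn("urn:wmo:md:ca-eccc-msc:foo/bar"))
# ('wmo', 'md:ca-eccc-msc:foo/bar', None)
print(split_urn("urn:wmo:md:ca-eccc-msc:foo bar"))
# None
```

In this reading everything after `urn:wmo:` belongs to the NSS, where commas, dots, colons and slashes are all accepted but a space is not.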

@josusky
Contributor

josusky commented Feb 13, 2024

I am not dead set against a rule that is stricter than the actual URN specification. I looked up the specification because I spotted the innocent dot (.) in Tom's list - that "lifted me off the chair" :-)
I can hardly imagine anyone putting ~ or ] into a metadata ID, but a dot (.) or slash (/) seems quite OK to me.

@tomkralidis
Collaborator

Having a slash (/) in the ID introduces URLs like the following in the GDC:

https://example.org/collections/foo/items/foo%2Fbar

While we can relax the regex set mentioned previously, the above would be error prone.
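
The encoding issue can be reproduced with the Python standard library. This is only an illustration: `item_url` is a hypothetical helper, and the URL layout is taken from the example above rather than from any GDC implementation.

```python
from urllib.parse import quote, unquote


def item_url(base: str, collection: str, item_id: str) -> str:
    # safe="" forces "/" to be percent-encoded; by default quote() leaves it alone.
    return (
        f"{base}/collections/{quote(collection, safe='')}"
        f"/items/{quote(item_id, safe='')}"
    )


url = item_url("https://example.org", "foo", "foo/bar")
print(url)           # https://example.org/collections/foo/items/foo%2Fbar
print(unquote(url))  # https://example.org/collections/foo/items/foo/bar
print(quote("foo/bar"))  # foo/bar  (default safe="/" does not encode the slash)
```

If any client, proxy or server decodes `%2F` before routing, or a client forgets to encode it, the identifier turns into an extra path segment, which is where the errors would come from.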

@tomkralidis tomkralidis mentioned this issue Feb 14, 2024
@amilan17
Member Author

amilan17 commented Oct 22, 2024

The definition as approved in PR #183: "The id property SHALL include a local identifier as defined by the data publisher. The local identifier SHALL NOT have spaces or accented characters."

@tomkralidis
Collaborator

TT-WISMD 2024-10-22:

  • can we use the ISO charset
  • WTH uses IRA T.50, can we reuse
  • LSP reworded the requirement into "no space" or "no accented characters"

@josusky
Contributor

josusky commented Oct 28, 2024

Specifying a character set that does not have accented characters and other things that can complicate the usage of this identifier is a good idea. IRA T.50 is an appropriate choice. Apart from that (and the space), did you discuss some more restrictions during TT-WISMD 2024-10-22?

@amilan17
Member Author

The text that will be published is currently:
The id property SHALL include a local identifier as defined by the data publisher. The local identifier SHALL NOT have spaces or accented characters.
Please note that changes to this text will need to go through an amendment process...

@tomkralidis
Collaborator

2024-11-26

@josusky
Contributor

josusky commented Nov 27, 2024

@tomkralidis I am ready to discuss. @amilan17 I understand your concern: adding a "restriction" to the specification looks like a breaking change. Perhaps we can run a search query on a Global Catalogue to verify whether "special" characters are used. If not, that would be proof that the change/restriction will have no impact on the current metadata records; it will only make things clearer for the future.

@tomkralidis
Collaborator

FYI, below is the current state based on the Canada GDC:

$ curl -s "https://wis2-gdc.weather.gc.ca/collections/wis2-discovery-metadata/items?limit=10000" | jq '.features[].id' | sort
"urn:wmo:md:br-inmet:cap"
"urn:wmo:md:ca-eccc-msc:00e10fa6-565f-5af7-95e4-b8946f4150a0"
"urn:wmo:md:ca-eccc-msc:08fd00e4-59db-597f-b0a7-177d305607dc"
"urn:wmo:md:ca-eccc-msc:0a7a49df-e464-5649-a094-33965c0345e5"
"urn:wmo:md:ca-eccc-msc:0e73142b-c4a7-4bf0-8d16-72ac2df60fc1"
"urn:wmo:md:ca-eccc-msc:0fe2f63a-0268-5d22-9228-e53d9f0350b3"
"urn:wmo:md:ca-eccc-msc:13e27861-bf00-599b-9b24-9a50dbfed7ed"
"urn:wmo:md:ca-eccc-msc:1ee9e14d-0814-5201-a3be-705809d8ee0e"
"urn:wmo:md:ca-eccc-msc:1f11ed9f-b13d-497b-853a-997b991195a1"
"urn:wmo:md:ca-eccc-msc:1f864766-7f7f-4be7-8292-295065c65c78"
"urn:wmo:md:ca-eccc-msc:1fb5ad1e-aa5b-468a-912f-950ca4f2c105"
"urn:wmo:md:ca-eccc-msc:2097f245-2285-58ec-9343-286586c1a715"
"urn:wmo:md:ca-eccc-msc:214499e5-99c6-401f-9d7e-c16611680719"
"urn:wmo:md:ca-eccc-msc:2280b47c-fb7c-4bab-a632-33f59eb82385"
"urn:wmo:md:ca-eccc-msc:22f783d3-9cc0-5d2f-bbcb-7cef872ee6f6"
"urn:wmo:md:ca-eccc-msc:28936e1b-681f-4c73-b04a-e86d4b3917c6"
"urn:wmo:md:ca-eccc-msc:2c2cadd7-5248-4764-bf88-5042b73465c3"
"urn:wmo:md:ca-eccc-msc:2e4fb3b1-1267-5ae1-8a13-91646226dfa0"
"urn:wmo:md:ca-eccc-msc:36129cbc-3997-4b8e-a8bf-5fb44492134d"
"urn:wmo:md:ca-eccc-msc:37aecae5-7783-4274-b595-df02aa003ac3"
"urn:wmo:md:ca-eccc-msc:37d76d67-0304-4e79-8a56-a839097ddd3d"
"urn:wmo:md:ca-eccc-msc:38414289-4beb-4854-af58-5ab3b66665eb"
"urn:wmo:md:ca-eccc-msc:390abee6-4ba0-4d6e-ae79-25753d1c43f3"
"urn:wmo:md:ca-eccc-msc:3959c86b-b555-4ad8-9fcc-8fecfb79918c"
"urn:wmo:md:ca-eccc-msc:3cab386e-8779-57fb-9df5-434bc3eb8ca5"
"urn:wmo:md:ca-eccc-msc:3f319639-de12-57e9-8ea1-d04272333fe5"
"urn:wmo:md:ca-eccc-msc:4058ed59-8d43-4e1d-b264-5dc12c97b99f"
"urn:wmo:md:ca-eccc-msc:42d58945-c08e-5a1b-ac69-53fc5c2f6555"
"urn:wmo:md:ca-eccc-msc:4564cbf5-9de5-4521-b007-a20d73ad6f89"
"urn:wmo:md:ca-eccc-msc:459be8fa-a571-546e-a0f2-1a4788dc95b5"
"urn:wmo:md:ca-eccc-msc:46630a2f-a761-5063-adf8-0838ff3a1ff0"
"urn:wmo:md:ca-eccc-msc:46763060-e859-4812-8da5-2361d99b4c34"
"urn:wmo:md:ca-eccc-msc:47d2d140-999b-564e-a651-21cb2ac9421e"
"urn:wmo:md:ca-eccc-msc:49126bff-0de8-4b9d-8c03-189be2e66261"
"urn:wmo:md:ca-eccc-msc:4b548442-eb95-5a3e-a27e-ed99b2dde6c6"
"urn:wmo:md:ca-eccc-msc:4fe11fe4-242c-4111-80ae-4adb12188533"
"urn:wmo:md:ca-eccc-msc:5257ee75-0787-4f44-9b95-338980124620"
"urn:wmo:md:ca-eccc-msc:52a2b56c-862d-574f-92c2-175603f614f3"
"urn:wmo:md:ca-eccc-msc:53597589-ef66-563b-a912-0a57000029e1"
"urn:wmo:md:ca-eccc-msc:574c32db-aba7-4919-9c9f-c58398754173"
"urn:wmo:md:ca-eccc-msc:5b36f7d6-a381-560a-be3f-99186a89ef51"
"urn:wmo:md:ca-eccc-msc:5b401fa0-6c29-57f0-b3d5-749f301d829d"
"urn:wmo:md:ca-eccc-msc:5dc1e43e-1776-5df8-919e-e229819c8954"
"urn:wmo:md:ca-eccc-msc:5e8247ca-891e-5686-a6a5-0631c50fe2db"
"urn:wmo:md:ca-eccc-msc:5f963c2d-d4ed-5a79-8a31-c9c582ca5098"
"urn:wmo:md:ca-eccc-msc:5fc7ab98-afa1-427b-87b6-658565cca575"
"urn:wmo:md:ca-eccc-msc:5fe5cc5a-f3bd-5363-a25c-eb5c5df71012"
"urn:wmo:md:ca-eccc-msc:6059da1d-e1da-4f2b-a420-b5c2a130eeaa"
"urn:wmo:md:ca-eccc-msc:6070e77c-290d-57fa-8956-2b5ef38fe932"
"urn:wmo:md:ca-eccc-msc:60af030a-9563-5b5b-b477-7407d55b8014"
"urn:wmo:md:ca-eccc-msc:61c4d7be-a8dd-5929-a80b-3d3641334f3f"
"urn:wmo:md:ca-eccc-msc:62c5f03f-8f03-466a-960a-88fbc5882c11"
"urn:wmo:md:ca-eccc-msc:631e570e-59c3-42d7-aa7b-5a4666ab62b5"
"urn:wmo:md:ca-eccc-msc:65d3a88b-eb09-4fd9-ac44-cf42dc1f7444"
"urn:wmo:md:ca-eccc-msc:66caa8cc-0e9c-4fdb-ae40-fab9c255b811"
"urn:wmo:md:ca-eccc-msc:66e22a26-10f6-5518-92fc-ae38ddf4c519"
"urn:wmo:md:ca-eccc-msc:6712547a-7b6e-4746-ac51-e369a1f1f1ee"
"urn:wmo:md:ca-eccc-msc:6a2bd6ff-a9fd-56dc-a6ef-2b954c0bb9a6"
"urn:wmo:md:ca-eccc-msc:6af279bc-62a6-4fe3-987c-ec1af1d3357f"
"urn:wmo:md:ca-eccc-msc:6b02c778-8eaa-46f5-8786-ae80b0ea0f72"
"urn:wmo:md:ca-eccc-msc:6c995b31-3a8e-52a5-a1d0-46cd62f0b416"
"urn:wmo:md:ca-eccc-msc:6d9dd2f8-202e-58cb-a110-e2168832aacb"
"urn:wmo:md:ca-eccc-msc:746f9469-ab78-5dcc-b165-4b51e8ab8652"
"urn:wmo:md:ca-eccc-msc:75dfb8cb-9efc-4c15-bcb5-7562f89517ce"
"urn:wmo:md:ca-eccc-msc:76747350-2faa-490e-90d1-0e5af6e66027"
"urn:wmo:md:ca-eccc-msc:7882f2fd-9856-5a89-aada-0f6f66e19bb5"
"urn:wmo:md:ca-eccc-msc:79550951-6b17-49a6-9028-8ae1c21274cf"
"urn:wmo:md:ca-eccc-msc:7c1070fd-af7d-40fe-9e78-49d2962f0bbc"
"urn:wmo:md:ca-eccc-msc:7cf4ea56-4fad-5c28-a878-709b48b314f0"
"urn:wmo:md:ca-eccc-msc:7e7337b7-d36c-4486-a8df-16609a6b99bd"
"urn:wmo:md:ca-eccc-msc:7f9da2bb-3f63-51c8-91ee-58dbd92fe68a"
"urn:wmo:md:ca-eccc-msc:803a6e2a-41ed-44c2-9eeb-1b5306b4048e"
"urn:wmo:md:ca-eccc-msc:815b18f8-be5e-56a1-b4dc-4e4fbd820b76"
"urn:wmo:md:ca-eccc-msc:840f58f6-29e1-5cbd-8d1e-0997aa210aaf"
"urn:wmo:md:ca-eccc-msc:88a5111c-136c-42a7-907f-523ad4365165"
"urn:wmo:md:ca-eccc-msc:8f86f415-7fef-5377-940c-c169c9854520"
"urn:wmo:md:ca-eccc-msc:8fc6ad48-1a98-45fa-9eba-9a27ddae5014"
"urn:wmo:md:ca-eccc-msc:903b7591-a5ec-4cdd-95e2-0a6cf602685b"
"urn:wmo:md:ca-eccc-msc:922781a9-bfef-56b9-a438-493ada629d47"
"urn:wmo:md:ca-eccc-msc:92c24ae8-3e3a-4264-b8b0-a2a53448f186"
"urn:wmo:md:ca-eccc-msc:9458466a-d137-4744-ae36-35b98fcd165f"
"urn:wmo:md:ca-eccc-msc:9764d6c6-3044-450c-ac5a-383cedbfef17"
"urn:wmo:md:ca-eccc-msc:99786cb6-516c-51d4-bd19-d45961c0a686"
"urn:wmo:md:ca-eccc-msc:9a1fa93e-f67c-50f2-baa7-d5cc35a3efaa"
"urn:wmo:md:ca-eccc-msc:9a6594f9-ad0e-4421-ba9d-16338e5a9cbe"
"urn:wmo:md:ca-eccc-msc:9aff2702-2a88-44be-bfb7-be26c8aca1df"
"urn:wmo:md:ca-eccc-msc:9d777866-bc02-518a-934a-6c0790454863"
"urn:wmo:md:ca-eccc-msc:9dd32bce-c9d2-42d8-ac35-61bf62f6b5c4"
"urn:wmo:md:ca-eccc-msc:9eaf8b65-a734-432e-925c-7fbe8fc65670"
"urn:wmo:md:ca-eccc-msc:9ec1edfc-a280-579c-8dd2-7dac06cf407b"
"urn:wmo:md:ca-eccc-msc:a0b9584a-e537-5a87-a1bc-3aa58f0569a0"
"urn:wmo:md:ca-eccc-msc:a0e5c7a1-03df-413b-9b04-8e9d41099c19"
"urn:wmo:md:ca-eccc-msc:a115568f-3edc-42cb-ad31-e99f3cf5e37e"
"urn:wmo:md:ca-eccc-msc:a563e47d-6eb9-4f7f-933c-222ae49fe57f"
"urn:wmo:md:ca-eccc-msc:a99e4f08-65ad-532a-997d-3c3687253202"
"urn:wmo:md:ca-eccc-msc:a9f2828c-0d78-5eb6-a4c7-1fc1219f1e3d"
"urn:wmo:md:ca-eccc-msc:aa675a68-3b65-5721-8c2e-e92d8b32f2b2"
"urn:wmo:md:ca-eccc-msc:aae10768-0c0c-4670-807e-8e893680887e"
"urn:wmo:md:ca-eccc-msc:ab8e1cdd-e41c-5f28-a26a-4d2201c82fbb"
"urn:wmo:md:ca-eccc-msc:ae83e85b-82dd-522d-8eeb-cd63daa73987"
"urn:wmo:md:ca-eccc-msc:b1bf07a9-0bb7-5961-aa62-dc33c53d8f80"
"urn:wmo:md:ca-eccc-msc:b234671d-1720-5965-a2ea-7cd6d6b2e68a"
"urn:wmo:md:ca-eccc-msc:b24efb37-11b6-5d03-ab19-5759f83db546"
"urn:wmo:md:ca-eccc-msc:b8f0adf3-3ed1-5782-9990-729e10105a78"
"urn:wmo:md:ca-eccc-msc:ba198d6d-1c80-52f6-9801-4a444ab337a7"
"urn:wmo:md:ca-eccc-msc:bb0d1eeb-0e11-49e0-a5e3-6d99d4decb31"
"urn:wmo:md:ca-eccc-msc:bc1dfbf9-039c-5538-a2a8-a55991405b6b"
"urn:wmo:md:ca-eccc-msc:bc52b7a8-46ef-4a7f-90e0-7780abac398c"
"urn:wmo:md:ca-eccc-msc:bdae26c4-a662-5fb7-8be9-5079bd69c7ff"
"urn:wmo:md:ca-eccc-msc:bde9b113-ab40-4d7f-a501-5cbb0b55805c"
"urn:wmo:md:ca-eccc-msc:bdf58342-edcf-51a7-959a-29ccd29110ba"
"urn:wmo:md:ca-eccc-msc:bf1884e2-cbbb-4a50-ab40-c5b417723d17"
"urn:wmo:md:ca-eccc-msc:bfe44cce-a9c4-467f-9172-c8800b32e4ec"
"urn:wmo:md:ca-eccc-msc:c041e79a-914a-5a4e-a485-9cbc506195df"
"urn:wmo:md:ca-eccc-msc:c1a52a5f-0f52-593c-a38f-6df4a186c6b4"
"urn:wmo:md:ca-eccc-msc:c4873551-7c7a-5ff7-97d5-ee75e8b44bf5"
"urn:wmo:md:ca-eccc-msc:c496cff9-069a-58ad-a6cc-20776c797b6c"
"urn:wmo:md:ca-eccc-msc:c7c9d726-c48a-49e3-98ab-78a1ab87cda8"
"urn:wmo:md:ca-eccc-msc:c884d749-a7db-4713-9f81-0d996d0d1201"
"urn:wmo:md:ca-eccc-msc:c944aca6-0d59-418c-9d91-23247c8ada17"
"urn:wmo:md:ca-eccc-msc:ca.gc.ec.msc-1.1.1.3"
"urn:wmo:md:ca-eccc-msc:cccb0064-5ab3-416a-a4f0-566b54f466f3"
"urn:wmo:md:ca-eccc-msc:cd41b178-b58d-4dd2-8b32-8e567ff6baed"
"urn:wmo:md:ca-eccc-msc:cda0ebec-8592-5881-976e-fd030787cb84"
"urn:wmo:md:ca-eccc-msc:ce9e475b-3e3b-4b15-9ac4-165549366b09"
"urn:wmo:md:ca-eccc-msc:cf6077e9-d104-54de-bef4-26cb17191eea"
"urn:wmo:md:ca-eccc-msc:d244c9fa-776f-446f-9ccf-1d575cc21a5c"
"urn:wmo:md:ca-eccc-msc:d56bdd04-fcb7-52fb-b04f-dc0ad3d82f64"
"urn:wmo:md:ca-eccc-msc:d7fedd91-2c7d-58f7-b780-9392744e550f"
"urn:wmo:md:ca-eccc-msc:db262eb3-98db-5ad4-8da2-0ea1cfec9b5e"
"urn:wmo:md:ca-eccc-msc:dc3a7022-95e8-45a7-bf63-3d45b6cda0dc"
"urn:wmo:md:ca-eccc-msc:df2e6e1a-6057-4c4d-a509-94aa57705a8c"
"urn:wmo:md:ca-eccc-msc:e2233308-040e-47da-8c9d-2551ff99f810"
"urn:wmo:md:ca-eccc-msc:e247a8b3-75b6-585e-8ab4-ac3600534799"
"urn:wmo:md:ca-eccc-msc:e50a9544-eee2-460c-a8b1-1a92a487d060"
"urn:wmo:md:ca-eccc-msc:eff69d42-ce81-4672-867f-cc3baaf4157a"
"urn:wmo:md:ca-eccc-msc:f3ec3d35-e9ca-521c-998a-a88db31c6205"
"urn:wmo:md:ca-eccc-msc:f702621d-65f2-5c40-bafb-fcf4d62a63cd"
"urn:wmo:md:ca-eccc-msc:f73d6939-912a-4add-a291-c233fc5d1946"
"urn:wmo:md:ca-eccc-msc:f8b1490e-873e-480a-9037-6eb075f63106"
"urn:wmo:md:ca-eccc-msc:fb20aaa2-d206-5ea2-a574-b3cccc0eb4dc"
"urn:wmo:md:ca-eccc-msc:fba0e890-a7cc-58f2-bde0-89b8d99acf59"
"urn:wmo:md:ca-eccc-msc:fdd3446a-dc20-5bad-9755-0855e3ec9b19"
"urn:wmo:md:ca-eccc-msc:ffa61839-e219-5c52-9fa8-8d26447fb2bf"
"urn:wmo:md:ca-eccc-msc:ffcf85e1-bb6e-5b55-82ca-47db7fe63042"
"urn:wmo:md:cg-met:core.surface-based-observations.synop"
"urn:wmo:md:cu-insmet:core.surface-based-observations.synop"
"urn:wmo:md:cy-dom:surface-based-observations.synop"
"urn:wmo:md:cy-dom:surface-based-observations.temp"
"urn:wmo:md:cy-dom:weather.prediction.deterministic.local"
"urn:wmo:md:cy-dom:weather.radar"
"urn:wmo:md:de-dwd:icon-eps.ALL"
"urn:wmo:md:de-dwd:weather.observations"
"urn:wmo:md:il-ims:weather.observations.international"
"urn:wmo:md:il-ims:weather.observations.swob-realtime"
"urn:wmo:md:in-imd:satellite"
"urn:wmo:md:in-imd:surface-based-observations.temp"
"urn:wmo:md:ir-irimo:core.surface-based-observations.synop"
"urn:wmo:md:ir-irimo:core.surface-based-observations.temp"
"urn:wmo:md:kr-kma:core.aviation.experimental.amda"
"urn:wmo:md:kr-kma:core.aviation.metar"
"urn:wmo:md:kr-kma:core.climate.surface.monthly"
"urn:wmo:md:kr-kma:core.surface-based-observations.monthly"
"urn:wmo:md:kr-kma:core.surface-based-observations.moored-buoys-and-moorings"
"urn:wmo:md:kr-kma:core.surface-based-observations.synop"
"urn:wmo:md:kr-kma:core.surface-based-observations.temp"
"urn:wmo:md:kr-kma:surface-based-observations.experimental.aws"
"urn:wmo:md:kz-kazhydromet:core.surface-based-observations.synop"
"urn:wmo:md:kz-kazhydromet:core.surface-based-observations.temp"
"urn:wmo:md:ma-marocmeteo:surface-based-observations.synop"
"urn:wmo:md:pl-imgw:surface-based-observations.synop"
"urn:wmo:md:ru-aviamettelecom:core.surface-based-observations.synop"
"urn:wmo:md:ru-aviamettelecom:core.surface-based-observations.temp"
"urn:wmo:md:sz-swazimet:surface-based-observations.synop"
"urn:wmo:md:td-anam:core.surface-based-observations.synop"
"urn:wmo:md:td-anam:core.weather.cap"

@josusky
Contributor

josusky commented Dec 16, 2024

No "special" characters as far as I can see. However, the dominance of "urn:wmo:md:ca-eccc-msc:UUID" is alarming.
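
The visual check can also be repeated mechanically. The sketch below uses a handful of identifiers copied from the listing above instead of querying the catalogue; the function names are hypothetical, and the `urn:wmo:md:<centre-id>:<local identifier>` layout is inferred from the listing.

```python
from collections import Counter

# A few identifiers copied from the GDC listing above.
SAMPLE_IDS = [
    "urn:wmo:md:br-inmet:cap",
    "urn:wmo:md:ca-eccc-msc:ca.gc.ec.msc-1.1.1.3",
    "urn:wmo:md:de-dwd:icon-eps.ALL",
    "urn:wmo:md:kr-kma:core.surface-based-observations.moored-buoys-and-moorings",
]


def local_identifier(record_id: str) -> str:
    # Layout inferred from the listing: urn:wmo:md:<centre-id>:<local identifier>
    return record_id.split(":", 4)[4]


def non_alphanumeric_counts(ids) -> Counter:
    """Count every character in the local identifiers that is not an ASCII letter or digit."""
    counts: Counter = Counter()
    for record_id in ids:
        counts.update(
            ch for ch in local_identifier(record_id)
            if not (ch.isascii() and ch.isalnum())
        )
    return counts


counts = non_alphanumeric_counts(SAMPLE_IDS)
print(sorted(counts))  # ['-', '.']
```

For this sample only the hyphen and the dot show up, so a rule that excludes the dot would affect identifiers that are already published.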

@tomkralidis
Collaborator

No "special" characters as far as I can see. However, the dominance of "urn:wmo:md:ca-eccc-msc:UUID" is alarming.

? Is there an issue we need to address?

@josusky
Contributor

josusky commented Dec 16, 2024

Well, either the centre ca-eccc-msc defined too many products or all the other centres too few :-)

@tomkralidis
Collaborator

Good problem to have for ECCC then :)

@tomkralidis
Collaborator

tomkralidis commented Feb 18, 2025

TT-WISMD 2025-02-18:

  • T.50
  • no colons (id delimiter) already a permission
  • for readability: no accents, no spaces, semi-colons

ACTION: @tomkralidis to PR for @wmo-im/tt-wismd review.
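
One possible reading of these notes, sketched as a regular expression. This is an interpretation, not the text of the eventual PR: it assumes the T.50 graphic characters correspond to ASCII 0x21-0x7E, and it assumes "semi-colons" in the last bullet means semicolons are disallowed. The names are hypothetical.

```python
import re

# Printable T.50/ASCII characters 0x21-0x7E, minus ":" (0x3A) and ";" (0x3B).
# Space (0x20) and all non-ASCII (e.g. accented) characters fall outside the range.
LOCAL_ID_RE = re.compile(r"[\x21-\x39\x3C-\x7E]+")


def is_valid_local_id(local_id: str) -> bool:
    return LOCAL_ID_RE.fullmatch(local_id) is not None


print(is_valid_local_id("core.surface-based-observations.synop"))  # True
print(is_valid_local_id("foo bar"))        # False (space)
print(is_valid_local_id("foo:bar"))        # False (colon)
print(is_valid_local_id("foo;bar"))        # False (semicolon)
print(is_valid_local_id("m\u00e9t\u00e9o"))  # False (accented character)
```

Under this reading a slash would still be accepted, so the URL encoding concern raised earlier in the thread would need a separate rule.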
