-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supporting a flexible dataset schema #37
Comments
@abkfenris and I have talked about this before, because a lot of the (not super thought out) abstractions for xpublish-wms overlap a ton with EDR functionality, and same with our newer subsetting prototype plugin. Both of these are built on top of cf-xarray, which as you mentioned is probably the most important piece here. |
Xpublish-EDR should be using CF-Xarray's coordinate criteria to figure out which coordinates correspond to Here's the attributes on one of my datasets that is playing nice with Xpublish-EDR/CF-Xarray (though my Xpublish server is currently being cranky, so I gotta go figure that out): ![]() |
Thanks both for your input here. I'm realizing now that I jumped to a conclusion about CF (and cf-xarray) that may not have been correct. I'll do a bit more digging and return with a more concrete request. |
I think there is probably a place for configuration on both the plugin and dataset sides. We could enable setting up the ERD plugin with some custom coordinate criteria so folks could enable matching on patterns that are common in their datasets. On the dataset side, it would probably make sense to have a standard attribute pattern that can prompt plugins but doesn't get sent to users by plugins that would otherwise share attributes.
|
After looking at this in more detail, I understand the problem -- our dataset is simply missing the The feature request I would offer here then is to have some sort of verification tool that runs when registering a dataset with a router. If an error or warning had been issued at startup, this would have been much easier to debug. |
Hmm, I don't think we'd want to make it the default, since with dataset providers Xpublish doesn't necessarily know of all datasets at startup. We also don't have a hook currently to call plugins on startup, but after datasets may be available. How about an endpoint to check if the dataset has the right attributes? There is an existing issue in CF-Xarray for some sort of checker xarray-contrib/cf-xarray#366 but I don't think we need to be blocked by that. I think we can do a simpler query to see if CF-Xarray has identified the axes. |
Xpublish-EDR currently support a very narrow dataset schema. For example, coordinates must be named
X
,Y
, andT
corresponding to longitude, latitude, and time. When datasets do not conform to this, there is a somewhat confusing error raised about CF conventions:xpublish-edr/xpublish_edr/plugin.py
Lines 79 to 85 in 019e53a
Of course, there are many ways a dataset can be CF complaint (with or without these coordinates present). My understanding of the EDR spec is that it is agnostic to the names of these coordinates in the underlying data source. So Xpublish-EDR should be able to find a way to bridge between coordinate names provided they are real CF.
Can we work on making Xpublish-EDR more flexible here? CF-Xarray could offer a shortcut here.
The text was updated successfully, but these errors were encountered: