-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Geometric lookups and "vector data cubes" #748
Comments
Dimensional data.jl already handles point lookups a bit, see |
|
For sure. Except for the nesting, and there should be optimisations for finding polygons on these grids? |
If you are interested in DGGS there is also https://github.com/danlooo/DGGS.jl |
Thanks Felix, that seems interesting! Are those actually DimArrays though, or just convenient wrapper structs? I went over the code briefly but am not able to see how this is working... |
I just wrote up a more interesting vector data cube example with polygons. See the following: using DimensionalData
using DimensionalData.Lookups
using DimensionalData.Dimensions
import DimensionalData as DD
# load data
using NaturalEarth
import GeoInterface as GI, GeometryOps as GO
fc = naturalearth("admin_0_countries", 10)
geometries_data = fc.geometry
id_data = fc.ADM0_A3
pop_data = fc.POP_EST .|> Float64
# now for the fun part
# let's define a dimension named Geometry
# TODO: how can we make this be interpreted as (X, Y)? can we somehow hook into the same things that MergeDims does?
@dim Geometry
# create a categorical lookup from the geometry
geometry_lookup = Categorical(geometries_data)
# create dimvectors which have geometry lookups - simple so far
id_dd = DimVector(id_data, Geometry(geometry_lookup))
pop_dd = DimVector(pop_data, Geometry(geometry_lookup))
# stack them so they share dims
fc_stack = DimStack((; id = id_dd, pop = pop_dd))
# Now for the _really_ fun part: mixing dimensions
# create a 2d dimarray, that hypothetically has height measurements over time for each geometry
height_dim_matrix = rand(Geometry(geometry_lookup), Ti(1:10))
# stack 'em!
full_stack = DimStack((; id = id_dd, pop = pop_dd, height = height_dim_matrix))
# select the USA
full_stack[Geometry(Where(x -> GO.contains(x, (-103, 44))))]
full_stack[Geometry(Where(x -> GO.contains(x, (-103, 44))))].id
full_stack[Geometry(Where(x -> GO.contains(x, (-103, 44))))].pop
full_stack[Geometry(Where(x -> GO.contains(x, (-103, 44))))].height
this is pretty sweet I think...now just have to figure out how to actually write this in a CF compliant way, and how to detect these lookups when plotting 😢 |
Just using I guess we can accelerate that for GO predicates with an Extent index, if we use a Fix2 |
yep! I'm building out a PolygonLookup now that would support that, but we could change that to be a more generic GeometricLookup at some point that works for any geometries |
btw the big issue here is probably going to be file I/O. Any suggestions on that? NetCDF has some support for this but I have no clue how we would get this out from a nc file: example.nc.txt |
Have you looked at the python/R vector data cubes ? If we can use something already being standardised... (The CF standard is likely pretty arcane) |
This is really exciting! And making this a
|
Yeah...Ideally we would want a sliced dimtable from the stack, with a given index column |
They also use CF :(( |
Ok guess we will have to implement it... |
For the IO part, there might be some inspiration in xarray-contrib/xvec#26 |
Lat-lon-Cap grid used in NASA reanalyses and MITgcm : https://juliaclimate.github.io/MeshArrays.jl/dev/tutorials/geography.html |
|
There should always be some selector for X and Y, i.e |
We are now working on non-CF encoding that supports geometry as variable, not only dimension coordinates. The CF encoding is indeed a bit inconvenient but at least it is an existing standard. Most of that lives on my branch so far but here's the notebook illustrating the idea https://github.com/martinfleis/xvec/blob/summary/doc/source/io.ipynb. |
Thanks @martinfleis that's a great write-up @asinghvi17 I just realised the nested discrete global grid can be represented with a DimTree and geometry lookups. I think it will already work if we merged the branches. |
Thanks for the write up, that clarifies a lot! @rafaqz we can indeed do a dimtree, but there is one issue - we may need to implement s2 indexing 😅 or some other accelerate, at least for either DGGS or known spherical/ellipspodal geometries. |
The generic geometry lookup we have should work, but slowly? But yes we need to make it abstract and specialise a few fast cases like hexagons as @meggart mentioned they are doing in DGGS |
Regarding dimtrees, note that a cell may have multiple parents in some DGGS, e.g. here |
I think that's fine, the tree isn't spatially nested explicitly like that (although it seems there could be some nice optimisations where that is possible) It just allows layers with different lookups for the same dimensions, that could at the same time share e.g. the time dimension from lower in the tree. So selecting The extent will be the union over all branches, so they don't have to match at the boundaries either. Branches could equally represent separate raster tiles of the same resolution. |
The only potential hiccup here is that DGGS are, at this point, more like matrix (actually 3d array) lookups than like a single vector flat geometry lookup. |
So it would need a custom ArrayLookup and Nearest neighbour searches or something for At/Near etc? We could just tweak |
@asinghvi17 Both are possible, even on the same grid, e.g. Uber H3 normal 1D bit index vs 3D IJK index. Hierarchical indices are usually 1D, having the parent cell as a prefix, optimized for aggregation queries. However, when it comes to distant neighbour queries, e.g. bouding boxes for viewing and convolutions, one wants to have multiple spatial DGGS dimensions more like a matrix. |
It's super easy to create dimension axes that have vector lookups:
but some questions arise here.
color
as the values of the dimvector. What about 2d arrays?Contains
,Touches
,..
,At
, etc work, potentially using GeometryOps? Do we want an extension for this? I think this will involve geometric predicates at least.GI.geometrycolumns
of a raster to point to geometric lookups? Do we want to stash this in the metadata?The text was updated successfully, but these errors were encountered: