-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XarrayReader produces upside down images for some datasets with xarray backend #34
Comments
After digging into the issue some more I think the problem is in the default CRS definition for non-georeferenced datasets in titiler-cmr: titiler-cmr/titiler/cmr/reader.py Lines 200 to 201 in afe8b6d
The dataset that works as expected (GAMSSA) contains a georeference while the dataset that is not working correctly (MUR SST) does not contain a georeference and has it manually set to diff --git a/titiler/cmr/reader.py b/titiler/cmr/reader.py
index 979629b..5e0af7d 100644
--- a/titiler/cmr/reader.py
+++ b/titiler/cmr/reader.py
@@ -197,7 +197,10 @@ def get_variable(
da = da.sortby(da.x)
# Make sure we have a valid CRS
- crs = da.rio.crs or "epsg:4326"
+ crs = (
+ da.rio.crs
+ or 'GEOGCS["undefined",DATUM["undefined",SPHEROID["undefined",6379137,298.257223563]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]'
+ )
da.rio.write_crs(crs, inplace=True) This does not explain why everything works when a reprojection operation is performed without any changes to the default CRS, but it suggests that defaulting to |
rioxarray will set the Affine transformation via the crs attribute and coordinates if it is not already defined in the dataset, which means the data gets "flipped" properly to match the destination transform in the reprojection step. rio_tiler's XarrayReader skips the reprojection entirely if the source and destination crs are the same - https://github.com/cogeotiff/rio-tiler/blob/abec15162c1d8cf9dd67cb5945d8532e2b6f562c/rio_tiler/io/xarray.py#L249-L263. If the I think the automatic assignment of a CRS is a broader issue. I don't agree with assuming that it's "epsg:4326" if it's not in the metadata. I guess there are a lot of datasets that are missing that metadata, so there should be a way for users to get around that still. Maybe a query parameter in the titiler-cmr endpoints? |
After more investigation with @maxrjones and others, the problem is actually pretty simple. Neither of these datasets follow the convention of y axis coordinates positive upwards in the arrays. The GAMSSA dataset has a georeference attached to it that The MUR SST dataset does not have a georeference attached to it and gets the
We should propose a fix in rio-tiler to ensure the array that is passed to
Yes I agree. We should add a query parameter for |
Here's the roughly equivalent import earthaccess
import rio_tiler
import xarray as xr
import matplotlib.pyplot as plt
from odc.geo.geom import Geometry
from shapely.geometry import box
# Download dataset
earthaccess.login()
variable = "analysed_sst"
results = earthaccess.search_data(
concept_id="C2036881735-POCLOUD", count=1, temporal=("2024-09-20", "2024-09-20")
)
fp = earthaccess.download(results, "earthaccess_data")[0]
# Set up GeoJSON for reader
bbox = box(-99.524, 19.043, -78.647, 31.126)
gbox = Geometry(bbox, "EPSG:4326")
shape = gbox.geojson()
# Use Xarray reader
ds = xr.open_dataset(fp)
ds = ds.rio.write_crs("EPSG:4326")
r = rio_tiler.io.xarray.XarrayReader(ds[variable])
im = r.feature(shape)
plt.imshow(im.data.squeeze())
plt.colorbar()
plt.show()
# Use rasterio reader
r = rio_tiler.io.rasterio.Reader(f"NETCDF:{str(fp)}:analysed_sst")
im = r.feature(shape)
plt.imshow(im.data.squeeze())
plt.colorbar()
plt.show() I'm not sure where mask_and_scale and masking gets handled with using the rasterio reader, which is the reason for the difference in values. I could open a PR tomorrow suggesting a change to the Xarray reader to return consistent array structure. |
@maxrjones can you try import earthaccess
import rio_tiler
import xarray as xr
import matplotlib.pyplot as plt
from geojson_pydantic import Polygon
# Download dataset
earthaccess.login()
variable = "analysed_sst"
results = earthaccess.search_data(
concept_id="C2036881735-POCLOUD", count=1, temporal=("2024-09-20", "2024-09-20")
)
fp = earthaccess.download(results, "earthaccess_data")[0]
# Set up GeoJSON for reader
bbox = (-99.524, 19.043, -78.647, 31.126)
shape = Polygon.from_bounds(*bbox).model_dump(exclude_none=True)
# Use Xarray reader
print("xarray")
ds = xr.open_dataset(fp)
ds = ds.rio.write_crs("EPSG:4326")
r = rio_tiler.io.xarray.XarrayReader(ds[variable])
im = r.feature(shape)
plt.imshow(im.data_as_image())
plt.colorbar()
plt.show()
print("rasterio")
# Use rasterio reader
r = rio_tiler.io.rasterio.Reader(f"NETCDF:{str(fp)}:analysed_sst")
im = r.feature(shape)
plt.imshow(im.data_as_image())
plt.colorbar()
plt.show() |
Thank you both for digging on this issue 🙏
I totally agree 🙏
I'm a bit worry about passing use WKT crs via query parameter, but I don't see better solution |
if we agree that not setting a CRS automatically in the XarrayReader, then might get issue when using titiler-cmr backend because we won't be able to easily forward the Error message up to the users 😬 |
ok I think I may have a solution import xarray
ds = xarray.open_dataset("20241012090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc", decode_coords="all")
da = ds["analysed_sst"]
print(da.rio.bounds())
print(da.rio.crs)
print(da.rio.transform())
print(da.rio._unordered_bounds())
(-179.99500549324037, -89.99499786365084, 180.0050000000763, 89.99499786365084)
None
| 0.01, 0.00,-180.00|
| 0.00, 0.01,-89.99|
| 0.00, 0.00, 1.00|
(-179.99500549324037, 89.99499786365084, 180.0050000000763, -89.99499786365084) if you look at the rioxarray reorder the bounds, so if you look at the difference between |
This is another case where the This import earthaccess
import matplotlib.pyplot as plt
import rio_tiler
# query MUR SST from CMR
earthaccess.login()
results = earthaccess.search_data(
concept_id="C1996881146-POCLOUD", count=1, temporal=("2024-10-22", "2024-10-22")
)
url = results[0].data_links()[0]
# Use rasterio reader to read a bbox
bbox = (-99.524, 19.043, -78.647, 31.126)
variable = "analysed_sst"
r = rio_tiler.io.rasterio.Reader(f"vrt:///vsicurl/{url}?sd_name={variable}")
im = r.part(bbox)
plt.imshow(im.data_as_image())
plt.show() Does the |
This example file has four dimensions: https://www.unidata.ucar.edu/software/netcdf/examples/tos_O1_2001-2002.nc You can pick the variable with gdalinfo "vrt:///vsicurl/https://www.unidata.ucar.edu/software/netcdf/examples/tos_O1_2001-2002.nc?sd_name=tos&bands=1" |
How would this work with rio-tiler? I tried:
|
You shouldn't need the /vsicurl/ prefix with rasterio |
Good to know, thanks! The
@hrodmn FYI while that dataset has 4 dimensions, the tos variable only has three. |
I'm not sure if this is a problem in titiler-cmr or somewhere upstream (rio-tiler) but when reading images from some NetCDF files without doing a reprojection operation, the images are returned upside down!
This request to one NetCDF dataset returns the expected result:
https://9ox7r6pi8c.execute-api.us-west-2.amazonaws.com/bbox/-99.524,19.043,-78.647,31.126.png?concept_id=C2036881735-POCLOUD&datetime=2024-09-20T00:00:00Z/2024-09-21T00:00:00Z&variable=analysed_sst&backend=xarray&rescale=273,325&colormap_name=thermal&return_mask=true
A request to a different NetCDF dataset (MUR SST) with all of the same parameters yields an upside down image!
https://9ox7r6pi8c.execute-api.us-west-2.amazonaws.com/bbox/-99.524,19.043,-78.647,31.126.png?concept_id=C1996881146-POCLOUD&datetime=2024-09-20T00:00:00Z/2024-09-21T00:00:00Z&variable=analysed_sst&backend=xarray&rescale=273,325&colormap_name=thermal&return_mask=true
/tile
requests to the second dataset return the expected result:https://9ox7r6pi8c.execute-api.us-west-2.amazonaws.com/tiles/WebMercatorQuad/5/8/13.png?concept_id=C1996881146-POCLOUD&datetime=2024-09-20T00:00:00Z/2024-09-21T00:00:00Z&variable=analysed_sst&backend=xarray&rescale=273,325&colormap_name=thermal&return_mask=true
When I specify a
dst_crs
that is not the dataset's native CRS,/bbox
produces a correct image:https://9ox7r6pi8c.execute-api.us-west-2.amazonaws.com/bbox/-99.524,19.043,-78.647,31.126.png?concept_id=C1996881146-POCLOUD&datetime=2024-09-20T00:00:00Z/2024-09-21T00:00:00Z&variable=analysed_sst&backend=xarray&rescale=273,325&colormap_name=thermal&return_mask=true&dst_crs=epsg:3857
The text was updated successfully, but these errors were encountered: