Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bbox and footprint in meters #26

Closed
smit1678 opened this issue Apr 26, 2017 · 15 comments
Closed

bbox and footprint in meters #26

smit1678 opened this issue Apr 26, 2017 · 15 comments
Assignees

Comments

@smit1678
Copy link
Member

An error popped up after processing new imagery on production. Metadata is generated fine (example), but when the catalog indexes the new image, mongodb throws an error:

message: 'exception: Can\'t extract geo keys: { _id: ObjectId(\'58fffebeb0eae7f3b143c1a8\'), uuid: "http://oin-hotosm.s3.amazonaws.com/58fff92190a38300103c37c8/0/0cb3651f-d796-4e6a-ab58-ecaf12569d0b.tif", geojson: { bbox: [ 318567.4056300001, 7842361.890440002, 318843.0565999987, 7842607.837750001 ], coordinates: [ [ [ 318567.40563, 7842607.83775 ], [ 318843.056599999, 7842607.83775 ], [ 318843.056599999, 7842361.89044 ], [ 318567.40563, 7842361.89044 ], [ 318567.40563, 7842607.83775 ] ] ], type: "Polygon" }, meta_uri: "http://oin-hotosm.s3.amazonaws.com/58fff92190a38300103c37c8/0/0cb3651f-d796-4e6a-ab58-ecaf12569d0b_meta.json", footprint: "POLYGON ((318567.40563 7842607.83775,318843.056599999 7842607.83775,318843.056599999 7842361.89044,318567.40563 7842361.89044,318567.40563 7842607.83775))", bbox: [ 318567.4056300001, 7842361.890440002, 318843.0565999987, 7842607.837750001 ], gsd: 0.02512999999987835, projection: "PROJCS["WGS 84 / UTM zone 59S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","63...", file_size: 24653540, uploaded_at: new Date(1493170469325), acquisition_end: new Date(1428188820000), acquisition_start: new Date(1428185220000), properties: { license: "CC-BY 4.0", sensor: "Unknown", wmts: "http://tiles.openaerialmap.org/58fff92190a38300103c37c8/0/0cb3651f-d796-4e6a-ab58-ecaf12569d0b/wmts", tms: "http://tiles.openaerialmap.org/58fff92190a38300103c37c8/0/0cb3651f-d796-4e6a-ab58-ecaf12569d0b/{z}/{x}/{y}.png", thumbnail: "http://oin-hotosm.s3.amazonaws.com/58fff92190a38300103c37c8/0/0cb3651f-d796-4e6a-ab58-ecaf12569d0b_thumb.png" }, contact: "Keiko Saito,[email protected]", provider: "Government of Vanuatu", platform: "uav", title: "Tanna Letobom Ortho 05042015_1007", __v: 0 } 
 longitude/latitude is out of bounds, lng: 318567 lat: 7.84261e+06',

It seems that for these images the bbox and footprint units are in meters, when mongo expects them to be in geographic coordinates, latitude and longitude.

cc @tombh

@smit1678
Copy link
Member Author

Example of old and new metadata of the same original image.

@tombh
Copy link
Member

tombh commented Apr 26, 2017

That's an interesting problem because the projection definition is metric, so all those coordinates are the unchanged native ones. Strictly, should we not update the projection definition as well if the bbox and footprint are converted? Or there needs to be something at least noted in the spec that bbox and footprint are always in a geographic projection that won't necessarily match the projection field, or at best indicate it in the actual JSON; eg; {EPSG4326: {footprint:{}, bbox:{}}, native:{footprint:{}, bbox:{}}}.

Or perhaps it should be the responsibility of the consumer (catalog API, OAM browser, etc), to convert into their preferred projection?

@smit1678
Copy link
Member Author

Well I'm not sure we need to define the projection of the bbox and footprint. While we're not making geojson, if we follow the geojson spec, then we should have these in geographic coordinates. But the OIN metadata spec says that the bbox should be in the CRS units.

I guess where is the problem coming from for mongo?

@tombh
Copy link
Member

tombh commented Apr 26, 2017

I haven't looked into the Catalog API enough yet to have much idea what's going on, but I'll see what I can find.

But I think this raises the possibility that up till now oin-metadata consumers have expected bbox and footprint to be geographic. For instance how do these new metric projections appear in the OAM browser? Are their footprints appearing in the correct place on the globe, etc?

@mojodna
Copy link
Contributor

mojodna commented Apr 26, 2017

I'm sorry, I didn't look closely enough at the metadata before. I was so focused on GSD being in meters that I lost track of the fact that the bounding box and footprint changed. It makes sense to have them consistent with the projection that's included (otherwise, why bother including it?), though it does make it more of a pain to work with vs. keeping coordinates in geographic coordinates (but GSD units in meters).

@smit1678
Copy link
Member Author

I'm sorry, I didn't look closely enough at the metadata before.

No worries, we tested this on specific images that had the GSD problem and so these are images that didn't have the GSD problem.

Do you think the right approach would be to do geographic coordinates for bbox and then image CRS for footprint?

Main reason might be for queries like ?bbox=[lon_min],[lat_min],[lon_max],[lat_max] in the catalog API. I don't think we'd be wanting to have this be in meters in a specific projection.

@mojodna
Copy link
Contributor

mojodna commented Apr 26, 2017

Do you think the right approach would be to do geographic coordinates for bbox and then image CRS for footprint?

Yes. We may need to clarify the OIN metadata docs though.

Footprint isn't actually a footprint (it's the bounding box in WKT--why WKT and not GeoJSON? yes I'm extremely late to the party on this). The new tiling workflow generates _footprint.json GeoJSON files that we could take advantage of at some point.

Main reason might be for queries like ?bbox=[lon_min],[lat_min],[lon_max],[lat_max] in the catalog API. I don't think we'd be wanting to have this be in meters in a specific projection.

That sounds exactly right.

@tombh
Copy link
Member

tombh commented Apr 26, 2017

I'm getting the impression this is a pretty big issue. As I mentioned earlier, it appears that many, if not all, consumers of the OIN meta data are already coded to assume bbox is in EPSG4326. Here is just one example in oam-browser: adding imagery as a new layer in map.js. There are certainly many more places where this assumption is made and I suspect others - as we can see from the mongodb error referenced here.

Ultimately I believe all coordinates in the meta data should be in the same CRS as specified by projection. It is the principle of least surprise and keeps OIN as implementation-independent as possible - OIN shouldn't put bbox in a different CRS just because one consumer prefers it (as much as I agree that it makes things easier for us right now).

Of course we are being faced with a rather intricate and involved problem here. The immediate solution is to revert all bbox values to EPSG4326, which I think we should do, but I think we need to be clear that it's a temporary hotfix. Ideally we need to move the responsibility for ensuring CRS compatibility to the consumers. Which may be a long and unforgiving task. However, I feel such strict separation of concerns is required for a formal specification, especially if we want to promote interoperability.

If I'm understanding right, the ?bbox... query Nate mentions (is this it?) is to find all imagery in the catalog that falls within an arbitrary region. I must strongly disagree that that is therefore a good reason for requiring the OIN spec to use a fixed CRS for bbox. Such reasoning binds OIN to OAM's specific implementation - it blurs the boundaries between them. The API should declare that it expects bbox coords in ESPG4326 and internally take the responsibility of matching those coords to whatever coords it has in its database. Of course OIN and OAM are very much related, but if we want OIN to exist as its own entity then we must treat it as such.

Another solution I think is reasonable is to convert all imagery to a standard CRS, the obvious candidate being EPSG4326 and then changing the projection field to original_projection.

It's easy to say in hindsight, but such bugs as these are a classic example of why integration tests are useful and why simplifying a system is useful such that said integration tests are as easy to write as possible.

A brief comment on the slightly unrelated footprint discussion. There seems to be some confusion. Firstly am I right in thinking that strictly speaking a footprint is different from a bounding box, in that it need not be square, or consist of only 4 points? In which case, then oin-meta-generator is not generating that. Indeed to truly do so requires tracing the edges of a raster. But maybe that's not what's meant by a 'footprint'. Perhaps the only difference is that a footprint can be rotated and a bounding box's edges are always parallel and orthogonal to its CRS? Currently, as Seth points out, the OIN's footprint is merely the bounding box but in a different format. As far as I know the only place the footprint is used is in OAM browser to decide whether a given image intersects with a grid square. However, I can't see why the bounding box wouldn't serve exactly the same purpose. So it would be good to clear all this up.

@smit1678
Copy link
Member Author

Footprint isn't actually a footprint (it's the bounding box in WKT--why WKT and not GeoJSON? yes I'm extremely late to the party on this).

I actually don't remember the exact reasoning there.

A brief comment on the slightly unrelated footprint discussion.

Let's silo the footprint convo and bring that into another ticket since I'm not sure the answer at the moment and it seems tangential to the units issue.

Ultimately I believe all coordinates in the meta data should be in the same CRS as specified by projection.

I agree with the idea and the fact that the original spec says the same. Unless there is a reason for the current implementation being different from the spec.

The immediate solution is to revert all bbox values to EPSG4326, which I think we should do, but I think we need to be clear that it's a temporary hotfix.

Is the quick and best solution to fix it in the metadata generator, or is the better solution to fix the ingestion of projected coordinates into the catalog? Ultimately the catalog needs to have bbox queries in a geographic coordinate format for mapping libraries.

If the best solution is to fix the ingestion of the metadata, then let's close this issue out and move the conversation to the oam-catalog.

@tombh tombh self-assigned this Apr 26, 2017
@tombh
Copy link
Member

tombh commented Apr 26, 2017

Personally I think the better solution is certainly to fix the ingestion of projected coordinates into the catalog. The trouble is, as I mentioned, that the catalog isn't the only place working on the assumption that all coordinates will be ESPG4326, so the task involves a lot of refactoring over more than one repo.

So I've pushed a PR reverting bbox and footprint to ESPG4326. That can buy us some time whilst we make a plan to support OIN's single-CRS spec.

So, let's merge, test and deploy the fix. And open 2 new issues;

  • Clarifying the use of footprint.
  • Roadmap for moving to a strict single-CRS OIN spec.

@mojodna
Copy link
Contributor

mojodna commented Apr 27, 2017

Ultimately I believe all coordinates in the meta data should be in the same CRS as specified by projection.

I agree with the idea and the fact that the original spec says the same. Unless there is a reason for the current implementation being different from the spec.

My take on the UX disagrees with the spec (but may agree with how things are used in practice), but here it is anyway:

  • files should be stored in whatever projection they were provided--reprojecting to 4326 will be lossy, especially for higher-resolution imagery
  • all publicly-exposed coordinates (as metadata, in OAM catalog URLs, etc) should be in 4326, since it's a geographic coordinate system and "normal" (and human-readable)
  • bbox should be a 4-element array in geographic coordinates (4326): minX, minY, maxX, maxY
  • footprint should be a GeoJSON geometry (this is JSON already), in geographic coordinates
  • (projection may be omitted in favor of discovering it when working with files directly)

@mojodna
Copy link
Contributor

mojodna commented Apr 27, 2017

I guess I look at the metadata as being equivalent to a bibliography written in English (4326) referring to books in a variety of languages (alternate projections). When working with the bibliography, it's convenient that everything is in the same language, even if the actual books aren't.

@tombh
Copy link
Member

tombh commented Apr 27, 2017

Seth, I really like that metaphor! In fact, I wonder if that can be put as a quote at the top of the README!? It really helps focus the fundamental purpose I think.

And I'm totally on board with your points on the spec from a UX perspective (consumer-centric is so often the best perspective). They give the spec coherency and consistency. Basically it prioritises 4326 as a lingua franca, we can make that clear in the spec with something like, "all coords are 4326 unless stated otherwise".

Not that it should effect our decision making, but this approach of course means less refactoring for OIN consuming code like the Catalog API.

If we take this approach, it feels like projection should be subsumed within properties to reflect its demotion of significance. properties is optional which appropriately reflects on projection being optional. I'd even like to see projection become something like file_projection, 'explicit is better than implicit' and all that.

And I know we said we should discuss the footprint stuff elsewhere, but just whilst we're here: what exactly is it!? How is it different from a bounding box? Can a footprint be non-rectangular? Does it literally express the precise extent of a raster minus the alpha band? If so, how accurate should it be? How should it be calculated?

@mojodna
Copy link
Contributor

mojodna commented Apr 27, 2017

Here's a sample footprint generated by the tile prep workflow: https://oin-hotosm.s3.amazonaws.com/58655b07f91c99bd00e9c7ab/0/58655decb0eae7f3b143a908_footprint.json

It can be non-rectangular and shows the image's footprint after it's been masked out, with nodata and otherwise. This samples at 1% (?) to keep the GeoJSON size down, since the end result is stair-stepped no matter what (because it's polygonizing pixels).

@tombh
Copy link
Member

tombh commented Apr 27, 2017

Exactly what I needed to know, thanks. So it lives up to its name and is certainly different from a bounding box. Ok we can continue to talk about this in a separate thread ... #28

And what should we do about the 4326 discussion? There are actually so many ongoing issues about spec updates. We really need to triage and perhaps consolidate into a single issue, that maybe has the free reign to trigger a major version change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants