Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Features/5905 optimze spatial cross land #141

Merged
merged 18 commits into from
Sep 23, 2024

Conversation

utas-raymondng
Copy link
Contributor

@utas-raymondng utas-raymondng commented Sep 22, 2024

Introduce a summaries field call centroid, which calculate the centrold of the spatial extents by remove the land from the spatial extents and then cut it into grid.

The centroid is calculate based on each grid, without the land the centroid is always within the sea.

After break down into grid the same spatial extents will have multiple centroid that belongs to same extents. It is up to the UI on how to use it.

…ss-land

# Conflicts:
#	indexer/src/test/resources/canned/abstract_resposibilty_null_stac.json
#	indexer/src/test/resources/canned/keywords_null_stac.json
#	indexer/src/test/resources/canned/sample10_stac.json
#	indexer/src/test/resources/canned/sample11_stac.json
#	indexer/src/test/resources/canned/sample12_stac.json
#	indexer/src/test/resources/canned/sample13_stac.json
#	indexer/src/test/resources/canned/sample14_stac.json
#	indexer/src/test/resources/canned/sample15_stac.json
#	indexer/src/test/resources/canned/sample4_stac.json
#	indexer/src/test/resources/canned/sample6_stac.json
#	indexer/src/test/resources/canned/sample7_stac_no_es.json
#	indexer/src/test/resources/canned/sample9_stac.json
#	indexer/src/test/resources/canned/sample_incorrect_projection_stac.json
#	indexer/src/test/resources/canned/sample_multiple_temporal1_stac.json
#	indexer/src/test/resources/canned/sample_multiple_temporal2_stac.json
#	indexer/src/test/resources/canned/sample_multiple_temporal_null_stac.json
Comment on lines 38 to 72
static {
String tempDir = System.getProperty("java.io.tmpdir");
ClassPathResource resource1 = new ClassPathResource("land/ne_10m_land.shp");
ClassPathResource resource2 = new ClassPathResource("land/ne_10m_land.shx");

try(InputStream input = resource2.getInputStream()) {
File tempFile = new File(tempDir, "shapefile.shx");
tempFile.deleteOnExit(); // Ensure the file is deleted when the JVM exits

// Write the InputStream to the temporary file
try (FileOutputStream outputStream = new FileOutputStream(tempFile)) {
byte[] buffer = new byte[1024];
int bytesRead;
while ((bytesRead = input.read(buffer)) != -1) {
outputStream.write(buffer, 0, bytesRead);
}
}
} catch (IOException e) {
throw new RuntimeException(e);
}

try(InputStream input = resource1.getInputStream()) {
// Create a temporary file to store the shapefile, because ShapefileDataStoreFactory do not support
// input stream.
File tempFile = new File(tempDir, "shapefile.shp");
tempFile.deleteOnExit(); // Ensure the file is deleted when the JVM exits

// Write the InputStream to the temporary file
try (FileOutputStream outputStream = new FileOutputStream(tempFile)) {
byte[] buffer = new byte[1024];
int bytesRead;
while ((bytesRead = input.read(buffer)) != -1) {
outputStream.write(buffer, 0, bytesRead);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this area looks like duplicate code, can you create a private method to create temp file from resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 92 to 97
// Union the land geometries (since land shapefile may contain multiple features)
if (land == null) {
land = simplifiedGeometry;
} else {
land = land.union(simplifiedGeometry);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this rolling style union operation maybe slow with large shapes, can you collect all the geometries first to List<Geometry> geometries and perform bulk union operation later using dedicated external library that optimises for this kind of operation: https://locationtech.github.io/jts/javadoc/org/locationtech/jts/operation/union/UnaryUnionOp.html

By using this special-purpose operation over a collection of geometries it is possible to take advantage of various optimizations to improve performance.

jts is a popular library, 1.9k stars.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks like you already installed jts.

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

public class GeometryUtilsTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think should have test cases for important methods separately, likely the ones with protected attribute. this test is quite general given there are quite a few in between steps, e.g breakLargeGeometryToGrid and removeLandAreaFromGeometry, and (not just this) test specific scenarios to guarantee this kind of problem (understood that the utils will use interior point if centroid is outside the polygons, but need tests).

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

* @param cellSize - How big each cell will be
* @return - List of polygon that store the grid
*/
protected static List<Polygon> createGridPolygons(Envelope envelope, double cellSize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you have TODO to plan removing hard-coded coordinate system, this method should have the 3rd param int srid? so when creating a geometry factory, can use GeometryFactory geometryFactory = new GeometryFactory(new PrecisionModel(), srid);

Copy link
Contributor Author

@utas-raymondng utas-raymondng Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the TODO , in fact it should always pass in CRS84, the reason is because the findPolygonFrom --> findPolygonsFromEXBoundingPolygonType will read the coordinate ref from XML and do the transform

String projection = polygonType.getSrsName();
MathTransform transform = null;
if(projection != null && !projection.isBlank()) {
    try {
        transform = CRS.findMathTransform(CRS.decode(projection), CRS84);
    }
    catch (FactoryException e) {
        // Default is WGS84 (CRS84), so do nothing need to transform
    }
}

the other function findPolygonsFromEXGeographicBoundingBoxType seems refer to bounding box which seems already construct correctly and no getSrs function

Copy link
Contributor

@vietnguyengit vietnguyengit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @utas-raymondng

@vietnguyengit vietnguyengit merged commit 9c0073f into main Sep 23, 2024
2 checks passed
@vietnguyengit vietnguyengit deleted the features/5905-optimze-spatial-cross-land branch September 23, 2024 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants