Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] Support WKB geometry column in CSV #2312

Merged
merged 2 commits into from
Sep 18, 2023
Merged

Conversation

lixun910
Copy link
Collaborator

In Kepler.gl, geometries (Polygons, Points, LindStrings etc) can be embedded into CSV as a GeoJSON or WKT formatted string (see doc here). This PR is to support WKB geometry column in CSV file.

For example, the following CSV row contains a WKT geometry:

id,geom
2,POINT(1 2)

the following CSV row contains a WKB geometry in hex format:

id, geom
1,0101000000000000000000F03F0000000000000040

(In PostGIS, the WKB in hex is the default format of the output of the geometry column, unless using e.g. ST_AsWKT)

  1. The WKB format is more efficient than the WKT for processing geometry objects (no parsing needed).
  2. The WKB format keeps the precisions of the raw geometry data
  3. The size of WKB in hex format (2 times the size of WKB) is usually smaller than WKT with complex geometries (linestrings, polygons etc. -- not for simple points)
  • guerry.geojson is 2 MB
  • guerry_wkt.csv is 1.7 MB
  • guerry_wkb.csv is 1.4 MB
  • guerry.arrow is 705 KB

For Arrow/GeoArrow, there is another wip PR to support Arrow/GeoArrow in Kepler.gl: GeoDaCenter#2

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for contribution!

if (!parsedGeo) {
try {
const buffer = Buffer.from(geoString, 'hex');
const binaryGeo = parseSync(buffer, WKBLoader);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. We should probably just use the WKTLoader from loaders.gl/wkt for the wkt parsing on line 138. It is based on wellknown so it should be equivalent but keeping consistent API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Will try to use WKTLoader instead of wktParser from wellknown package. Thanks!

// try parse as wkb using loaders.gl WKBLoader
if (!parsedGeo) {
try {
const buffer = Buffer.from(geoString, 'hex');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Wish we had a loader for this so we didn't need to mess with Node.js Buffer class. I will give it some thought.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Ib will address this in visgl/loaders.gl#2652

* @param str input string
* @returns true if string is a valid WKB in HEX format
*/
export function isHexWkb(str: string | null): boolean {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I wonder how to handle HexWKB in loaders.gl. Maybe just add HexWKBLoader / HexWKBWriter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HexWKBLoader/Writer sounds great 👍

@@ -21,7 +21,7 @@
"strict": true,
"resolveJsonModule": true,
"isolatedModules": true,
"baseUrl": "./src",
"baseUrl": ".",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do? Looks unrelated?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to WKB, but it seems not been set correctly and causing the alias "@kepler.gl/*" not mapping to correct source code. (see below @kepler.gl/*": ["src/*/src"])

Copy link
Collaborator Author

@lixun910 lixun910 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments, @ibgreen!

if (!parsedGeo) {
try {
const buffer = Buffer.from(geoString, 'hex');
const binaryGeo = parseSync(buffer, WKBLoader);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Will try to use WKTLoader instead of wktParser from wellknown package. Thanks!

* @param str input string
* @returns true if string is a valid WKB in HEX format
*/
export function isHexWkb(str: string | null): boolean {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HexWKBLoader/Writer sounds great 👍

@lixun910
Copy link
Collaborator Author

@ibgreen added new functions e.g. HexWKBLoader that can be used to directly parse the HexWKB column using loaders.gl. See PR here. Thanks, @ibgreen! He also plans to backport this PR to 3.4 and make a patch.

Let me know if merging this PR is OK, or if we should wait for the patch in loaders.gl v3. Thanks!

@ibgreen
Copy link
Collaborator

ibgreen commented Sep 18, 2023

Yes its fine to merge this now. It will take a while to issue a loaders 3.4 patch.

@lixun910 lixun910 merged commit d9c164b into keplergl:master Sep 18, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants