Bump to parquet-wasm 0.6.0 (#477)
Thinking about this a little more, because we serialize each arrow batch
as its _own_ Parquet file, we probably don't have a ton of memory
overhead because we can presumably reuse the previous batch's memory
space for the next batch.
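
A minimal sketch of that reasoning, not the actual Lonboard code: if every batch arrives as its own Parquet file and the files are parsed one at a time, the temporary copy needed during parsing is bounded by the largest single file rather than by the sum of all files. `parseBatchFiles`, `ParseOne`, and `ParseSummary` are hypothetical names, and `byteLength` is only a rough stand-in for the wasm-side copy; nothing here measures real memory.

```typescript
// Hypothetical parser signature: one Parquet file's bytes in, one parsed batch out.
// The parser is injected so the sketch runs without parquet-wasm.
type ParseOne<T> = (file: Uint8Array) => T;

interface ParseSummary<T> {
  batches: T[];
  // Largest single input seen: a rough stand-in for the transient copy size.
  peakTransientBytes: number;
  // Sum of all inputs: what one combined file would have needed at once.
  totalBytes: number;
}

function parseBatchFiles<T>(
  files: Uint8Array[],
  parseOne: ParseOne<T>,
): ParseSummary<T> {
  const batches: T[] = [];
  let peakTransientBytes = 0;
  let totalBytes = 0;
  for (const file of files) {
    // Only this file's bytes need a temporary copy while it is parsed.
    peakTransientBytes = Math.max(peakTransientBytes, file.byteLength);
    totalBytes += file.byteLength;
    batches.push(parseOne(file));
  }
  return { batches, peakTransientBytes, totalBytes };
}

// Usage with a dummy parser that just reports the input size.
const demo = parseBatchFiles(
  [new Uint8Array(10), new Uint8Array(40), new Uint8Array(20)],
  (file) => file.byteLength,
);
console.log(demo.peakTransientBytes, demo.totalBytes); // 40 70
```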

I really don't know how reliable arrow-js-ffi is right now, so I'll hold
off on adopting that in Lonboard. It's not worth the stability risk at
the moment.

~~Using arrow-js-ffi should significantly improve memory usage in the
browser, as we no longer need to make full copies on the wasm side.~~

~~We're never going to find out how stable arrow-js-ffi truly is until
we jump off into the deep end 😄~~
kylebarron authored Apr 22, 2024
1 parent 2d7e98f commit b2110b8
Showing 3 changed files with 12 additions and 13 deletions.
8 changes: 4 additions & 4 deletions package-lock.json

Generated file; diff not rendered.

2 changes: 1 addition & 1 deletion package.json
@@ -12,7 +12,7 @@
     "@geoarrow/deck.gl-layers": "^0.3.0-beta.16",
     "apache-arrow": "^15.0.2",
     "maplibre-gl": "^3.6.2",
-    "parquet-wasm": "0.5.0",
+    "parquet-wasm": "0.6.0",
     "react": "^18.2.0",
     "react-dom": "^18.2.0",
     "react-map-gl": "^7.1.7",
15 changes: 7 additions & 8 deletions src/parquet.ts
@@ -1,11 +1,10 @@
 import { useEffect, useState } from "react";
-import _initParquetWasm, { readParquet } from "parquet-wasm/esm/arrow2";
+import _initParquetWasm, { readParquet } from "parquet-wasm";
 import * as arrow from "apache-arrow";

 // NOTE: this version must be synced exactly with the parquet-wasm version in
 // use.
-const PARQUET_WASM_VERSION = "0.5.0";
-const PARQUET_WASM_CDN_URL = `https://cdn.jsdelivr.net/npm/parquet-wasm@${PARQUET_WASM_VERSION}/esm/arrow2_bg.wasm`;
+const PARQUET_WASM_VERSION = "0.6.0";
+const PARQUET_WASM_CDN_URL = `https://cdn.jsdelivr.net/npm/parquet-wasm@${PARQUET_WASM_VERSION}/esm/parquet_wasm_bg.wasm`;
 let WASM_READY: boolean = false;

 export async function initParquetWasm() {
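
The hunk above changes two things that have to move together: the version string and the name of the `.wasm` file inside the package. A small sketch of keeping them paired in one place, assuming only the two version/path combinations visible in this diff; `WASM_PATH_BY_VERSION` and `parquetWasmCdnUrl` are hypothetical names, not part of parquet-wasm or Lonboard.

```typescript
// Paths taken from the old and new lines of this diff; other versions are
// deliberately not guessed.
const WASM_PATH_BY_VERSION: Record<string, string> = {
  "0.5.0": "esm/arrow2_bg.wasm",
  "0.6.0": "esm/parquet_wasm_bg.wasm",
};

// Hypothetical helper: build the jsDelivr URL for a known parquet-wasm version.
function parquetWasmCdnUrl(version: string): string {
  const wasmPath = WASM_PATH_BY_VERSION[version];
  if (wasmPath === undefined) {
    // Fail loudly instead of fetching a binary that may not match the JS glue.
    throw new Error(`No known wasm path for parquet-wasm ${version}`);
  }
  return `https://cdn.jsdelivr.net/npm/parquet-wasm@${version}/${wasmPath}`;
}

console.log(parquetWasmCdnUrl("0.6.0"));
```

Throwing on an unknown version turns a forgotten update into an immediate error rather than a mismatched wasm binary at load time.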
@@ -27,10 +26,10 @@ export function parseParquet(dataView: DataView): arrow.Table {

   console.time("readParquet");

-  // TODO: use arrow-js-ffi for more memory-efficient wasm --> js transfer
-  const arrowIPCBuffer = readParquet(
-    new Uint8Array(dataView.buffer),
-  ).intoIPCStream();
+  // TODO: use arrow-js-ffi for more memory-efficient wasm --> js transfer?
+  const arrowIPCBuffer = readParquet(new Uint8Array(dataView.buffer), {
+    batchSize: Math.pow(2, 31),
+  }).intoIPCStream();
   const arrowTable = arrow.tableFromIPC(arrowIPCBuffer);

   console.timeEnd("readParquet");
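
A note on `batchSize: Math.pow(2, 31)` in the hunk above. The value is 2147483648, one more than the largest signed 32-bit integer. The intent is presumably to make the batch size large enough that each file comes back as a single record batch, though that is a reading of the diff and not something the commit message states. The sketch below only checks the arithmetic, including why the bit-shift spelling would not be a drop-in replacement.

```typescript
// The value passed in the diff.
const batchSize = Math.pow(2, 31);

// The exponent operator gives the same number.
const viaExponent = 2 ** 31;

// Bitwise operators in JavaScript work on signed 32-bit integers, so the
// shift wraps around to the most negative int32 instead.
const viaShift = 1 << 31;

const int32Max = 2 ** 31 - 1;

console.log(batchSize);            // 2147483648
console.log(viaExponent);          // 2147483648
console.log(viaShift);             // -2147483648
console.log(batchSize > int32Max); // true
```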