Commit
1 parent
79c49bf
commit 43302fe
Showing
5 changed files
with
120 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# Developer Documentation | ||
|
||
## Python | ||
|
||
This project uses [Poetry](https://python-poetry.org/) to manage Python dependencies. | ||
|
||
After installing Poetry, run | ||
|
||
``` | ||
poetry install | ||
``` | ||
|
||
to install all dependencies. | ||
|
||
To register the current Poetry-managed Python environment with JupyterLab, run | ||
|
||
``` | ||
poetry run python -m ipykernel install --user --name "lonboard" | ||
``` | ||
|
||
JupyterLab is an included dev dependency, so to start JupyterLab you can run | ||
|
||
``` | ||
poetry run jupyter lab | ||
``` | ||
|
||
Then you should see a tile on the home screen that lets you open a Jupyter Notebook in the `lonboard` environment. You should also be able to open up an example notebook from the `examples/` folder. | ||
|
||
## JavaScript | ||
|
||
The JavaScript dependencies are managed in `package.json` and tracked with Yarn or npm (I haven't been consistent about using one or the other :sweat_smile:). | ||
|
||
ESBuild is used for bundling into an ES Module that the Jupyter Widget loads at runtime. The ESBuild configuration is in `build.mjs`. You can run the script with | ||
|
||
``` | ||
yarn build | ||
``` | ||
|
||
I often run | ||
|
||
``` | ||
fswatch -o src | xargs -n1 -I{} yarn build | ||
``` | ||
|
||
to watch the `src` directory and run `yarn build` anytime it changes. | ||
|
||
Currently, each Python model (the `ScatterplotLayer`, `PathLayer`, and `SolidPolygonLayer` classes) uses _its own individual JS entry point_. You can inspect this with the `_esm` key on each class, which anywidget uses to load the widget. The ESBuild script converts `scatterplot-layer.tsx`, `path-layer.tsx`, and `solid-polygon-layer.tsx` into the bundles used by each class, respectively. | ||
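As a rough illustration of the one-entry-point-per-class layout, the sketch below builds stand-in classes that each carry their own `_esm` attribute. It does not import lonboard: the `static` output directory, the `.js` bundle names, and the helpers `bundle_path` and `make_stand_in` are assumptions made for illustration, and the real output paths are whatever `build.mjs` configures.

```python
from pathlib import PurePosixPath

# Entry points named in the text above, keyed by the Python class they serve.
ENTRY_POINTS = {
    "ScatterplotLayer": "scatterplot-layer.tsx",
    "PathLayer": "path-layer.tsx",
    "SolidPolygonLayer": "solid-polygon-layer.tsx",
}

# Hypothetical output directory; the real one is set in build.mjs.
BUNDLE_DIR = PurePosixPath("static")


def bundle_path(entry_point: str) -> PurePosixPath:
    """Map a TSX entry point to the (assumed) path of its bundled ES module."""
    return BUNDLE_DIR / PurePosixPath(entry_point).with_suffix(".js").name


def make_stand_in(class_name: str) -> type:
    """Build a stand-in class carrying an `_esm` attribute, like the real models."""
    return type(class_name, (), {"_esm": bundle_path(ENTRY_POINTS[class_name])})


# One stand-in class per layer, each pointing at its own bundle.
layers = [make_stand_in(name) for name in ENTRY_POINTS]
for layer in layers:
    print(f"{layer.__name__}._esm -> {layer._esm}")
```

With the real classes, reading `ScatterplotLayer._esm` in a Python session is the equivalent check.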
|
||
Anywidget and its dependency ipywidgets handle the serialization from Python to JS, automatically keeping the two sides in sync. | ||
|
||
## Documentation website | ||
|
||
The documentation website is generated with `mkdocs` and [`mkdocs-material`](https://squidfunk.github.io/mkdocs-material). After `poetry install`, you can serve the docs website locally with | ||
|
||
``` | ||
poetry run mkdocs serve | ||
``` | ||
|
||
and you can publish the docs to GitHub Pages with | ||
|
||
``` | ||
poetry run mkdocs gh-deploy | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# lonboard |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# Performance | ||
|
||
Performance is a critical goal of lonboard. Below is some background you should know to understand lonboard's performance characteristics, along with advice on how to get the best performance. | ||
|
||
## Performance Characteristics | ||
|
||
There are two distinct parts to the performance of **lonboard**: one is the performance of transferring data to the browser and the other is the performance of rendering the data once it's there. | ||
|
||
In general, these parts are completely distinct. Even if it takes a while to load the data in your browser, the map might be snappy once it loads, and vice versa. | ||
|
||
### Data Transfer | ||
|
||
Lonboard creates an interactive visualization of your data in your browser. In order to do this, your GeoDataFrame needs to be transferred from your Python environment to your browser. | ||
|
||
In the case where your Python session is running locally (on the same machine as your browser), this data transfer is extremely fast: less than a second in most cases. | ||
|
||
However, in the case where your Python session is running on a remote server (such as [Google Colab](https://colab.research.google.com/), [Binder](https://mybinder.readthedocs.io/en/latest/introduction.html), or a JupyterHub instance), this data transfer means **downloading the data to your local browser**. Therefore, when running lonboard from a remote server, your internet speed and the quantity of data you pass into a layer will have large impacts on the data transfer speed. | ||
|
||
Under the hood, lonboard uses efficient compression (in the form of [GeoParquet](https://geoparquet.org/)) to transfer data to the browser, but compression can only do so much; the data still needs to be downloaded. | ||
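To get a feel for how much connection speed matters on a remote server, here is a back-of-the-envelope estimate. The 200 MB payload and the three bandwidth figures are made-up illustrative numbers, not lonboard measurements, and `estimate_transfer_seconds` is a hypothetical helper that ignores latency and protocol overhead.

```python
def estimate_transfer_seconds(payload_megabytes: float, bandwidth_megabits_per_s: float) -> float:
    """Return the approximate download time in seconds for a payload."""
    if bandwidth_megabits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    payload_megabits = payload_megabytes * 8  # 1 byte = 8 bits
    return payload_megabits / bandwidth_megabits_per_s


# Illustrative numbers only: the same 200 MB payload on three connections.
connections = [
    ("slow (10 Mbit/s)", 10),
    ("typical (100 Mbit/s)", 100),
    ("fast (1000 Mbit/s)", 1000),
]
for label, megabits_per_s in connections:
    seconds = estimate_transfer_seconds(200, megabits_per_s)
    print(f"{label}: about {seconds:.1f} s")
```

The estimate scales linearly with payload size, so halving the amount of data you pass into a layer roughly halves the wait.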
|
||
### Rendering Performance | ||
|
||
Once the data has been transferred from your Python session to your browser, it needs to be rendered. | ||
|
||
The biggest thing to note is that — in contrast to projects like [datashader](https://datashader.org/) — lonboard **does not minimize the amount of data being rendered**. If you pass a GeoDataFrame with 10 million coordinates, lonboard will attempt to render all 10 million coordinates at once. | ||
|
||
The primary determinant of the maximum amount of data you can render with lonboard is your computer's hardware. Via the underlying [deck.gl](https://deck.gl/) library, lonboard ultimately renders geometries using your computer's Graphics Processing Unit (GPU). If you have a more powerful GPU, you'll be able to visualize more data. | ||
|
||
Lonboard is more efficient at rendering than previous libraries, but there will always be _some quantity of data_ beyond which your browser tab is likely to crash while attempting to render. Testing on a recent MacBook Pro M2 computer, lonboard has been able to render a few million points with minimal lag. | ||
|
||
## Performance Advice | ||
|
||
### Use a local Python session | ||
|
||
Moving from a remote Python environment to a local Python environment is often impractical, but this change will make it much faster to visualize data, especially over slow internet connections. | ||
|
||
### Remove columns before rendering | ||
|
||
All columns included in the `GeoDataFrame` will be transferred to the browser for visualization. (In the future, these other columns will be used to display a tooltip when hovering over/clicking on a geometry.) | ||
|
||
Especially in the case of a remote Python session, excluding unnecessary attribute columns will make data transfer to the browser faster. | ||
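A minimal sketch of trimming columns before visualization. It uses a plain pandas `DataFrame` with WKT strings as a stand-in for a `GeoDataFrame`, so that it runs without GeoPandas; the column names and the `keep_only` helper are hypothetical. The same column selection should work on a real `GeoDataFrame`.

```python
import pandas as pd


def keep_only(frame, columns):
    """Return a copy of `frame` restricted to `columns`, in the given order."""
    missing = [name for name in columns if name not in frame.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return frame[list(columns)].copy()


# Stand-in for a GeoDataFrame: a real one would hold shapely geometries
# in the "geometry" column rather than WKT strings.
frame = pd.DataFrame(
    {
        "geometry": ["POINT (0 0)", "POINT (1 1)"],
        "population": [1200, 3400],
        "notes": ["long free-text field", "another long field"],
        "source_url": ["https://example.com/a", "https://example.com/b"],
    }
)

# Keep the geometry plus the one attribute needed for the visualization.
slim = keep_only(frame, ["geometry", "population"])
print(list(slim.columns))
```

The original frame is left untouched, so the dropped columns remain available for analysis on the Python side.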
|
||
### Use Arrow-based data types in Pandas | ||
|
||
As of Pandas 2.0, Pandas supports two backends for data types: the original NumPy-based data types and new data types based on Arrow and the pyarrow library. | ||
|
||
The first thing that lonboard does when visualizing data is convert from Pandas to an Arrow representation. Any non-geometry attribute columns will be converted to Arrow, so if you're already using Arrow-based data types in Pandas, this step will be "free" as no conversion is needed. | ||
|
||
See the pandas [guide on data types](https://pandas.pydata.org/docs/user_guide/pyarrow.html) and the [`pandas.ArrowDtype` class](https://pandas.pydata.org/docs/reference/api/pandas.ArrowDtype.html). | ||
|
||
### Simplify geometries before rendering | ||
|
||
Simplifying geometries before rendering reduces the total number of coordinates and can make a visualization snappier. At this point, lonboard does not offer built-in geometry simplification. This is something you would need to do before passing data to lonboard. |
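To show what simplification does to a geometry, here is a small pure-Python sketch of the Douglas-Peucker algorithm applied to a single line. It is an illustration, not lonboard functionality: `simplify_line` and the sample coordinates are hypothetical. In practice you would more likely call GeoPandas' `GeoSeries.simplify(tolerance)` on your geometry column, with the tolerance expressed in the units of your coordinate reference system.

```python
import math


def _point_segment_distance(point, start, end):
    """Distance from `point` to the line segment between `start` and `end`."""
    (px, py), (sx, sy), (ex, ey) = point, start, end
    dx, dy = ex - sx, ey - sy
    if dx == 0 and dy == 0:
        return math.hypot(px - sx, py - sy)
    # Project the point onto the segment, clamping to its endpoints.
    t = ((px - sx) * dx + (py - sy) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (sx + t * dx), py - (sy + t * dy))


def simplify_line(coords, tolerance):
    """Douglas-Peucker: keep only vertices that deviate more than `tolerance`."""
    if len(coords) < 3:
        return list(coords)
    # Find the interior vertex farthest from the segment joining the endpoints.
    distances = [_point_segment_distance(p, coords[0], coords[-1]) for p in coords[1:-1]]
    farthest = max(range(len(distances)), key=distances.__getitem__)
    if distances[farthest] <= tolerance:
        return [coords[0], coords[-1]]
    split = farthest + 1  # index of that vertex in `coords`
    left = simplify_line(coords[: split + 1], tolerance)
    right = simplify_line(coords[split:], tolerance)
    return left[:-1] + right  # avoid duplicating the split vertex


line = [(0, 0), (1, 0.05), (2, -0.04), (3, 0.02), (4, 3), (5, 0)]
simplified = simplify_line(line, tolerance=0.1)
print(len(line), "->", len(simplified), simplified)
```

A larger tolerance removes more vertices at the cost of visual fidelity, so it is worth trying a few values and checking the result at the zoom levels you care about.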