Commit

Apply suggestions from code review
Co-authored-by: Anna Petrasova <[email protected]>
echoix and petrasovaa authored Oct 23, 2024
1 parent 84dc11d commit fafd964
Showing 2 changed files with 0 additions and 40 deletions.
1 change: 0 additions & 1 deletion .flake8
@@ -86,7 +86,6 @@ per-file-ignores =
# Files not managed by Black
python/grass/imaging/images2gif.py: E226
# Unused imports in init files
# F401 imported but unused
# F403 star import used; unable to detect undefined names
python/grass/*/__init__.py: F401, F403
python/grass/*/*/__init__.py: F403
39 changes: 0 additions & 39 deletions doc/notebooks/parallelization_tutorial.ipynb
@@ -2,7 +2,6 @@
"cells": [
{
"cell_type": "markdown",
"id": "7fb27b941602401d91542211134fc71a",
"metadata": {},
"source": [
"# Introduction to Parallelization in GRASS GIS\n",
@@ -11,7 +10,6 @@
},
{
"cell_type": "markdown",
"id": "acae54e37e7d407bbb7b55eff062a284",
"metadata": {},
"source": [
"Let's start GRASS to run examples:"
@@ -20,7 +18,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9a63283cbaf04dbcab1f6479b197f3a8",
"metadata": {},
"outputs": [],
"source": [
@@ -45,15 +42,13 @@
},
{
"cell_type": "markdown",
"id": "8dd0d8092fe74a7c96281538738b07e2",
"metadata": {},
"source": [
"Note: most examples assume we are already in an active GRASS session."
]
},
{
"cell_type": "markdown",
"id": "72eea5119410473aa328ad9291626812",
"metadata": {
"tags": []
},
@@ -80,7 +75,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "8edb47106e1a46a883d545849b8ab81b",
"metadata": {},
"outputs": [],
"source": [
@@ -91,15 +85,13 @@
},
{
"cell_type": "markdown",
"id": "10185d26023b46108eb7d9f57d49d2b3",
"metadata": {},
"source": [
"The speedup (processing time with 1 core / processing time with N cores) typically does not increase linearly with the number of cores and parallel efficiency (speedup / N cores) decreases when adding cores. See, e.g., [benchmarks for r.neighbors](https://grass.osgeo.org/grass-stable/manuals/r.neighbors.html#performance). This behavior is due to the serial parts of the code (see [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law)) and computation overhead. "
]
},
{
"cell_type": "markdown",
"id": "8763a12b2bbd4a93a75aff182afb95dc",
"metadata": {},
"source": [
"## Parallelization of workflows\n",
@@ -112,7 +104,6 @@
},
{
"cell_type": "markdown",
"id": "7623eae2785240b9bd12b16a66d81610",
"metadata": {},
"source": [
"### Data-based parallelization\n",
@@ -122,7 +113,6 @@
},
{
"cell_type": "markdown",
"id": "7cdc8c89c7104fffa095e18ddfef8986",
"metadata": {},
"source": [
"The following example shows IDW interpolation split into 4 tiles. In this case, specifying an overlap is needed to get correct results without edge artifacts. Here, the number and size of tiles are automatically derived from the number of cores, but they can also be specified explicitly."
@@ -131,7 +121,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "b118ea5561624da68c537baed56e602f",
"metadata": {},
"outputs": [],
"source": [
@@ -142,7 +131,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "938c804e27f84196a10c8828c723f798",
"metadata": {},
"outputs": [],
"source": [
@@ -168,7 +156,6 @@
},
{
"cell_type": "markdown",
"id": "504fb2a444614c0babb325280ed9130a",
"metadata": {},
"source": [
"The following is the same tool run in serial:"
@@ -177,7 +164,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "59bbdb311c014d738909a11f9e486628",
"metadata": {},
"outputs": [],
"source": [
@@ -187,7 +173,6 @@
},
{
"cell_type": "markdown",
"id": "b43b363d81ae4b689946ece5c682cd59",
"metadata": {},
"source": [
"There are tools that already integrate tiling. For example, the addon [r.mapcalc.tiled](https://grass.osgeo.org/grass-stable/manuals/addons/r.mapcalc.tiled.html) uses the tiling concept for raster algebra computation. More complex algebra expressions will increase the speedup of this method."
@@ -196,7 +181,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "8a65eabff63a45729fe45fb5ade58bdc",
"metadata": {},
"outputs": [],
"source": [
@@ -211,7 +195,6 @@
},
{
"cell_type": "markdown",
"id": "c3933fab20d04ec698c2621248eb3be0",
"metadata": {},
"source": [
"### Task-based parallelization\n",
@@ -221,7 +204,6 @@
},
{
"cell_type": "markdown",
"id": "4dd4641cc4064e0191573fe9c69df29b",
"metadata": {},
"source": [
"#### Examples in Python\n",
@@ -230,7 +212,6 @@
},
{
"cell_type": "markdown",
"id": "8309879909854d7188b41380fd92a7c3",
"metadata": {},
"source": [
"In the following example, viewsheds from different coordinates are computed in parallel using the `multiprocessing.Pool` class. To avoid issues when using multiprocessing from Jupyter Notebook (`multiprocessing.Pool` does not work with interactive interpreters), we will first write a Python script with a main function and then execute it."
@@ -239,7 +220,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "3ed186c9a28b402fb0bc4494df01f08d",
"metadata": {},
"outputs": [],
"source": [
@@ -266,7 +246,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "cb1e1581032b452c9409d6c6813c49d1",
"metadata": {},
"outputs": [],
"source": [
@@ -277,7 +256,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "379cbbc1e968416e875cc15c1202d7eb",
"metadata": {},
"outputs": [],
"source": [
@@ -286,7 +264,6 @@
},
{
"cell_type": "markdown",
"id": "277c27b1587741f2af2001be3712ef0d",
"metadata": {},
"source": [
"#### Examples in Bash\n",
@@ -296,7 +273,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "db7b79bc585a40fcaf58bf750017e135",
"metadata": {},
"outputs": [],
"source": [
@@ -309,7 +285,6 @@
},
{
"cell_type": "markdown",
"id": "916684f9a58a4a2aa5f864670399430d",
"metadata": {
"tags": []
},
@@ -322,7 +297,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "1671c31a24314836a5b85d7ef7fbf015",
"metadata": {},
"outputs": [],
"source": [
@@ -336,23 +310,20 @@
},
{
"cell_type": "markdown",
"id": "33b0902fd34d4ace834912fa1002cf8e",
"metadata": {},
"source": [
"See the manual pages of GNU Parallel or xargs for more advanced uses. GNU Parallel can be configured to distribute jobs across multiple machines. In that case, use the `--exec` interface described below."
]
},
{
"cell_type": "markdown",
"id": "f6fa52606d8c4a75a9b52967216f8f3f",
"metadata": {},
"source": [
"### Safe execution of parallel tasks"
]
},
{
"cell_type": "markdown",
"id": "f5a1fa73e5044315a093ec459c9be902",
"metadata": {},
"source": [
"While you can execute tasks in parallel within a single mapset, it is *not safe* when your tasks:\n",
@@ -368,7 +339,6 @@
},
{
"cell_type": "markdown",
"id": "cdf66aed5cc84ca1b48e60bad68798a8",
"metadata": {},
"source": [
"#### Executing processes in separate mapsets\n",
@@ -382,7 +352,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "28d3efd5258a48a79c179ea5c6759f01",
"metadata": {},
"outputs": [],
"source": [
@@ -392,7 +361,6 @@
},
{
"cell_type": "markdown",
"id": "3f9bc0b9dd2c44919cc8dcca39b469f8",
"metadata": {},
"source": [
"One of the previous examples that was running within a GRASS session in a single mapset can be rewritten so that each task runs in a newly created mapset. Note that by default, newly created mapsets use the default computational region for that GRASS location (you can use `g.region -s` to modify it). For raster computations, you need to change the computational region for each new mapset if the default one is not desired."
@@ -401,7 +369,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "0e382214b5f147d187d36a2058b9c724",
"metadata": {},
"outputs": [],
"source": [
@@ -418,7 +385,6 @@
},
{
"cell_type": "markdown",
"id": "5b09d5ef5b5e4bb6ab9b829b10b6a29f",
"metadata": {},
"source": [
"In some cases, only a temporary mapset or location is needed; see [examples](https://grass.osgeo.org/grass-stable/manuals/grass.html#batch-jobs-with-the-exec-interface).\n",
@@ -427,7 +393,6 @@
},
{
"cell_type": "markdown",
"id": "a50416e276a0479cbe66534ed1713a40",
"metadata": {},
"source": [
"#### Safely modifying computational region in a single mapset\n",
@@ -440,7 +405,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "46a27a456b804aa2a380d5edf15a5daf",
"metadata": {},
"outputs": [],
"source": [
@@ -470,7 +434,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "1944c39560714e6e80c856f20744a8e5",
"metadata": {},
"outputs": [],
"source": [
@@ -479,15 +442,13 @@
},
{
"cell_type": "markdown",
"id": "d6ca27006b894b04b6fc8b79396e2797",
"metadata": {},
"source": [
"#### Safely modifying vectors with attributes in a single mapset"
]
},
{
"cell_type": "markdown",
"id": "f61877af4e7f4313ad8234302950b331",
"metadata": {},
"source": [
"By default, vector maps share a single SQLite database file; however, SQLite does not support concurrent write access. That poses a problem when modifying vectors with attributes in parallel. While this can be solved by running the computations in separate mapsets, it is also possible to change the default behavior to write attributes of each vector to the vector's individual SQLite file. This behavior can be activated after a new mapset is created with:\n",
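
The notebook half of this change is mechanical: each cell shown in the diff loses its top-level `"id"` key and nothing else. The following is a minimal sketch of that transformation using only Python's standard `json` module. It is an illustration under stated assumptions, not a record of how the commit was actually produced; the helper name `strip_cell_ids` and the in-memory `example` notebook (including its `nbformat_minor` value) are hypothetical.

```python
import json


def strip_cell_ids(notebook: dict) -> int:
    """Remove the top-level "id" key from every cell.

    Returns the number of cells that had an id removed.
    `strip_cell_ids` is a hypothetical helper, not part of any library.
    """
    removed = 0
    for cell in notebook.get("cells", []):
        if cell.pop("id", None) is not None:
            removed += 1
    return removed


# A tiny in-memory notebook shaped like the cells in the diff.
# The "nbformat_minor" value here is an assumption for illustration only.
example = {
    "cells": [
        {
            "cell_type": "markdown",
            "id": "7fb27b941602401d91542211134fc71a",
            "metadata": {},
            "source": ["# Introduction to Parallelization in GRASS GIS\n"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "id": "9a63283cbaf04dbcab1f6479b197f3a8",
            "metadata": {},
            "outputs": [],
            "source": [],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

removed = strip_cell_ids(example)
# Notebooks are conventionally serialized with one-space indentation.
print(json.dumps(example, indent=1))
```

For a real file, the same function could be applied to the result of `json.load` and written back with `json.dump`. Whether dropping cell ids is valid depends on the notebook format version the file declares, so that should be checked before applying this to an actual notebook.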
