From b06e1a6d48a3348008e48a7cb3aef7462d8e3f2b Mon Sep 17 00:00:00 2001 From: Bingqing Liu Date: Sun, 3 Nov 2024 21:22:40 -0600 Subject: [PATCH] update --- _toc.yml | 1 + book/labs/06_functions_packages.ipynb | 927 ---------------------- book/labs/lab3b copy.ipynb | 1037 +++++++++++++++++++++++++ book/lectures/lecture12.md | 3 + 4 files changed, 1041 insertions(+), 927 deletions(-) delete mode 100644 book/labs/06_functions_packages.ipynb create mode 100644 book/labs/lab3b copy.ipynb create mode 100644 book/lectures/lecture12.md diff --git a/_toc.yml b/_toc.yml index d76ba97..0f60f56 100644 --- a/_toc.yml +++ b/_toc.yml @@ -21,6 +21,7 @@ parts: - file: book/lectures/lecture08 - file: book/lectures/lecture09 - file: book/lectures/lecture11 + - file: book/lectures/lecture12 - caption: Labs chapters: - file: book/labs/lab1a diff --git a/book/labs/06_functions_packages.ipynb b/book/labs/06_functions_packages.ipynb deleted file mode 100644 index 4efcbe2..0000000 --- a/book/labs/06_functions_packages.ipynb +++ /dev/null @@ -1,927 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "0", - "metadata": {}, - "source": [ - "# Functions and Packages\n", - "\n", - "## Overview\n", - "\n", - "1. This lecture first introduces the concepts of functions in Python, focusing on their application in geospatial programming. Functions allow you to encapsulate code into reusable blocks, making your scripts more modular and easier to maintain. \n", - "\n", - "2. This lecture introduces [NumPy](https://numpy.org) and [Pandas](https://pandas.pydata.org), two fundamental libraries for data manipulation and analysis in Python, with applications in geospatial programming. `NumPy` is essential for numerical operations and handling arrays, while `Pandas` provides powerful tools for data analysis, particularly when working with tabular data. Understanding these libraries will enable you to perform complex data operations efficiently and effectively in geospatial contexts.\n", - "\n", - "## Learning Objectives\n", - "\n", - "By the end of this lecture, you should be able to:\n", - "\n", - "- Define and use functions to perform specific tasks and promote code reuse in geospatial applications.\n", - "- Understand the basics of `NumPy` arrays and how to perform operations on them.\n", - "- Utilize `Pandas` DataFrames to organize, analyze, and manipulate tabular data.\n", - "- Apply `NumPy` and `Pandas` in geospatial programming to process and analyze geospatial datasets.\n", - "- Combine `NumPy` and `Pandas` to streamline data processing workflows.\n", - "- Develop the ability to perform complex data operations, such as filtering, aggregating, and transforming geospatial data." - ] - }, - { - "cell_type": "markdown", - "id": "1", - "metadata": {}, - "source": [ - "## Functions\n", - "\n", - "Functions are blocks of code that perform a specific task and can be reused multiple times. They allow you to structure your code more efficiently and reduce redundancy.\n", - "\n", - "### Defining a Simple Function\n", - "\n", - "Here's a simple function that adds two numbers:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2", - "metadata": {}, - "outputs": [], - "source": [ - "def add(a, b):\n", - " return a + b\n", - "\n", - "\n", - "# Example usage\n", - "result = add(5, 3)\n", - "print(f\"Result: {result}\")" - ] - }, - { - "cell_type": "markdown", - "id": "3", - "metadata": {}, - "source": [ - "This function takes two parameters `a` and `b`, and returns their sum. 
You can call it by passing two values as arguments." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6", - "metadata": {}, - "outputs": [], - "source": [ - "# Function to multiply two numbers\n", - "def multiply(a, b):\n", - " return a * b\n", - "\n", - "\n", - "# Calling the function\n", - "result = multiply(4, 5)\n", - "print(f\"Multiplication Result: {result}\")" - ] - }, - { - "cell_type": "markdown", - "id": "7", - "metadata": {}, - "source": [ - "You can call the multiply function with two numbers, and it will return their product.\n", - "\n", - "### Geospatial Example: Haversine Function\n", - "\n", - "Let's apply these concepts to a geospatial problem. The [Haversine formula](https://en.wikipedia.org/wiki/Haversine_formula) calculates the distance between two points on the Earth’s surface.\n", - "\n", - "![](https://upload.wikimedia.org/wikipedia/commons/c/cb/Illustration_of_great-circle_distance.svg)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8", - "metadata": {}, - "outputs": [], - "source": [ - "from math import radians, sin, cos, sqrt, atan2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9", - "metadata": {}, - "outputs": [], - "source": [ - "def haversine(lat1, lon1, lat2, lon2):\n", - " R = 6371.0 # Earth radius in kilometers\n", - " dlat = radians(lat2 - lat1)\n", - " dlon = radians(lon2 - lon1)\n", - " a = (\n", - " sin(dlat / 2) ** 2\n", - " + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2\n", - " )\n", - " c = 2 * atan2(sqrt(a), sqrt(1 - a))\n", - " distance = R * c\n", - " return distance\n", - "\n", - "\n", - "# Example usage\n", - "distance = haversine(35.6895, 139.6917, 34.0522, -118.2437)\n", - "print(f\"Distance: {distance:.2f} km\")" - ] - }, - { - "cell_type": "markdown", - "id": "12", - "metadata": {}, - "source": [ - "Now, let's create a function that takes a list of coordinate pairs and returns a list of distances between consecutive points." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13", - "metadata": {}, - "outputs": [], - "source": [ - "def batch_haversine(coord_list):\n", - " distances = []\n", - " for i in range(len(coord_list) - 1):\n", - " lat1, lon1 = coord_list[i]\n", - " lat2, lon2 = coord_list[i + 1]\n", - " distance = haversine(lat1, lon1, lat2, lon2)\n", - " distances.append(distance)\n", - " return distances\n", - "\n", - "\n", - "# Example usage\n", - "coordinates = [(35.6895, 139.6917), (34.0522, -118.2437), (40.7128, -74.0060)]\n", - "distances = batch_haversine(coordinates)\n", - "print(f\"Distances: {distances}\")" - ] - }, - { - "cell_type": "markdown", - "id": "80bb4fed", - "metadata": {}, - "source": [ - "## Introduction to NumPy\n", - "\n", - "`NumPy` (Numerical Python) is a library used for scientific computing. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.\n", - "\n", - "### Creating NumPy Arrays\n", - "\n", - "Let's start by creating some basic `NumPy` arrays." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "7c6d672a", - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "8c417d9f", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1D Array: [1 2 3 4 5]\n" - ] - } - ], - "source": [ - "# Creating a 1D array\n", - "arr_1d = np.array([1, 2, 3, 4, 5])\n", - "print(f\"1D Array: {arr_1d}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "4781749f", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2D Array:\n", - "[[1 2 3]\n", - " [4 5 6]]\n" - ] - } - ], - "source": [ - "# Creating a 2D array\n", - "arr_2d = np.array([[1, 2, 3], [4, 5, 6]])\n", - "print(f\"2D Array:\\n{arr_2d}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "f074fe25", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Array of zeros:\n", - "[[0. 0. 0.]\n", - " [0. 0. 0.]\n", - " [0. 0. 0.]]\n" - ] - } - ], - "source": [ - "# Creating an array of zeros\n", - "zeros = np.zeros((3, 3))\n", - "print(f\"Array of zeros:\\n{zeros}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "585b061b", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Array of ones:\n", - "[[1. 1. 1. 1.]\n", - " [1. 1. 1. 1.]]\n" - ] - } - ], - "source": [ - "# Creating an array of ones\n", - "ones = np.ones((2, 4))\n", - "print(f\"Array of ones:\\n{ones}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "ba820dd1", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Range Array: [0 2 4 6 8]\n" - ] - } - ], - "source": [ - "# Creating an array with a range of values\n", - "range_arr = np.arange(0, 10, 2)\n", - "print(f\"Range Array: {range_arr}\")" - ] - }, - { - "cell_type": "markdown", - "id": "57a59a46", - "metadata": {}, - "source": [ - "### Basic Array Operations\n", - "\n", - "`NumPy` allows you to perform element-wise operations on arrays." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2e6008a", - "metadata": {}, - "outputs": [], - "source": [ - "# Array addition\n", - "arr_sum = arr_1d + 10\n", - "print(f\"Array after addition: {arr_sum}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "90801734", - "metadata": {}, - "outputs": [], - "source": [ - "# Array multiplication\n", - "arr_product = arr_1d * 2\n", - "print(f\"Array after multiplication: {arr_product}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "58508790", - "metadata": {}, - "outputs": [], - "source": [ - "# Element-wise multiplication of two arrays\n", - "arr_2d_product = arr_2d * np.array([1, 2, 3])\n", - "print(f\"Element-wise multiplication of 2D array:\\n{arr_2d_product}\")" - ] - }, - { - "cell_type": "markdown", - "id": "6523bb2d", - "metadata": {}, - "source": [ - "### Working with Geospatial Coordinates\n", - "\n", - "You can use `NumPy` to perform calculations on arrays of geospatial coordinates, such as converting from degrees to radians." 
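As a bridge between the `batch_haversine` loop above and the array operations shown here, the sketch below (not part of the original notebook) converts an entire coordinate array to radians at once and computes every consecutive distance without an explicit Python loop:

```python
import numpy as np

def haversine_vectorized(coords, radius=6371.0):
    """Distances in km between consecutive (lat, lon) rows of a NumPy array."""
    lat = np.radians(coords[:, 0])
    lon = np.radians(coords[:, 1])
    dlat = np.diff(lat)
    dlon = np.diff(lon)
    a = np.sin(dlat / 2) ** 2 + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2
    return 2 * radius * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

coords = np.array([(35.6895, 139.6917), (34.0522, -118.2437), (40.7128, -74.0060)])
print(haversine_vectorized(coords))  # ~[8815.5, 3936.0] km, matching the loop version above
```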
- ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "1c2d7667", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Coordinates in radians:\n", - "[[ 6.22899283e-01 2.43808010e+00]\n", - " [ 5.94323008e-01 -2.06374188e+00]\n", - " [ 8.98973719e-01 -2.23053078e-03]]\n" - ] - } - ], - "source": [ - "# Array of latitudes and longitudes\n", - "coords = np.array([[35.6895, 139.6917], [34.0522, -118.2437], [51.5074, -0.1278]])\n", - "\n", - "# Convert degrees to radians\n", - "coords_radians = np.radians(coords)\n", - "print(f\"Coordinates in radians:\\n{coords_radians}\")" - ] - }, - { - "cell_type": "markdown", - "id": "b0d33eda", - "metadata": {}, - "source": [ - "## Introduction to Pandas\n", - "\n", - "`Pandas` is a powerful data manipulation library that provides data structures like Series and DataFrames to work with structured data. It is especially useful for handling tabular data.\n", - "\n", - "### Creating Pandas Series and DataFrames\n", - "\n", - "Let's create a `Pandas` Series and DataFrame." - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "f50215e9", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "id": "aa823925", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Pandas Series:\n", - "0 Tokyo\n", - "1 Los Angeles\n", - "2 London\n", - "Name: City, dtype: object\n", - "\n" - ] - } - ], - "source": [ - "# Creating a Series\n", - "city_series = pd.Series([\"Tokyo\", \"Los Angeles\", \"London\"], name=\"City\")\n", - "print(f\"Pandas Series:\\n{city_series}\\n\")" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "35a3842e", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Pandas DataFrame:\n", - " City Latitude Longitude\n", - "0 Tokyo 35.6895 139.6917\n", - "1 Los Angeles 34.0522 -118.2437\n", - "2 London 51.5074 -0.1278\n" - ] - } - ], - "source": [ - "# Creating a DataFrame\n", - "data = {\n", - " \"City\": [\"Tokyo\", \"Los Angeles\", \"London\"],\n", - " \"Latitude\": [35.6895, 34.0522, 51.5074],\n", - " \"Longitude\": [139.6917, -118.2437, -0.1278],\n", - "}\n", - "df = pd.DataFrame(data)\n", - "print(f\"Pandas DataFrame:\\n{df}\")" - ] - }, - { - "cell_type": "markdown", - "id": "c75ef708", - "metadata": {}, - "source": [ - "### Basic DataFrame Operations\n", - "\n", - "You can perform various operations on `Pandas` DataFrames, such as filtering, selecting specific columns, and applying functions." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "c8381c17", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Latitudes:\n", - "0 35.6895\n", - "1 34.0522\n", - "2 51.5074\n", - "Name: Latitude, dtype: float64\n", - "\n" - ] - } - ], - "source": [ - "# Selecting a specific column\n", - "latitudes = df[\"Latitude\"]\n", - "print(f\"Latitudes:\\n{latitudes}\\n\")" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "a0b05a9c", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " City Latitude Longitude\n", - "1 Los Angeles 34.0522 -118.2437\n", - "2 London 51.5074 -0.1278" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Filtering rows based on a condition\n", - "df_filtered = df[df[\"Longitude\"] < 0]\n", - "df_filtered" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "id": "74d8950c", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " City Latitude Longitude Lat_Radians\n", - "0 Tokyo 35.6895 139.6917 0.622899\n", - "1 Los Angeles 34.0522 -118.2437 0.594323\n", - "2 London 51.5074 -0.1278 0.898974" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Adding a new column with a calculation\n", - "df[\"Lat_Radians\"] = np.radians(df[\"Latitude\"])\n", - "df" - ] - }, - { - "cell_type": "markdown", - "id": "55ebc2a5", - "metadata": {}, - "source": [ - "## Combining NumPy and Pandas\n", - "\n", - "You can combine `NumPy` and `Pandas` to perform complex data manipulations. For instance, you might want to apply `NumPy` functions to a `Pandas` DataFrame or use `Pandas` to organize and visualize the results of `NumPy` operations.\n", - "\n", - "Let's say you have a dataset of cities, and you want to calculate the average distance from each city to all other cities." - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "id": "9ccacf8a", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " City1 City2 Lat1 Lon1 Lat2 Lon2 Distance_km\n", - "0 Tokyo Los Angeles 35.6895 139.6917 34.0522 -118.2437 8815.473356\n", - "1 Tokyo London 35.6895 139.6917 51.5074 -0.1278 9558.713695\n", - "2 Los Angeles London 34.0522 -118.2437 51.5074 -0.1278 8755.602341" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Define the Haversine formula using NumPy\n", - "def haversine_np(lat1, lon1, lat2, lon2):\n", - " R = 6371.0 # Earth radius in kilometers\n", - " dlat = np.radians(lat2 - lat1)\n", - " dlon = np.radians(lon2 - lon1)\n", - " a = (\n", - " np.sin(dlat / 2) ** 2\n", - " + np.cos(np.radians(lat1)) * np.cos(np.radians(lat2)) * np.sin(dlon / 2) ** 2\n", - " )\n", - " c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))\n", - " distance = R * c\n", - " return distance\n", - "\n", - "\n", - "# Create a new DataFrame with city pairs\n", - "city_pairs = pd.DataFrame(\n", - " {\n", - " \"City1\": [\"Tokyo\", \"Tokyo\", \"Los Angeles\"],\n", - " \"City2\": [\"Los Angeles\", \"London\", \"London\"],\n", - " \"Lat1\": [35.6895, 35.6895, 34.0522],\n", - " \"Lon1\": [139.6917, 139.6917, -118.2437],\n", - " \"Lat2\": [34.0522, 51.5074, 51.5074],\n", - " \"Lon2\": [-118.2437, -0.1278, -0.1278],\n", - " }\n", - ")\n", - "\n", - "# Calculate distances between city pairs\n", - "city_pairs[\"Distance_km\"] = haversine_np(\n", - " city_pairs[\"Lat1\"], city_pairs[\"Lon1\"], city_pairs[\"Lat2\"], city_pairs[\"Lon2\"]\n", - ")\n", - "city_pairs" - ] - }, - { - "cell_type": "markdown", - "id": "cb8b3969", - "metadata": {}, - "source": [ - "Pandas can read and write data in various formats, such as CSV, Excel, and SQL databases. This makes it easy to load and save data from different sources. For example, you can read a CSV file into a Pandas DataFrame and then perform operations on the data.\n", - "\n", - "Let's read a CSV file from an HTTP URL into a Pandas DataFrame and display the first few rows of the data." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "id": "1a857fd3", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " id name country latitude longitude population\n", - "0 1 Bombo UGA 0.5833 32.5333 75000\n", - "1 2 Fort Portal UGA 0.6710 30.2750 42670\n", - "2 3 Potenza ITA 40.6420 15.7990 69060\n", - "3 4 Campobasso ITA 41.5630 14.6560 50762\n", - "4 5 Aosta ITA 45.7370 7.3150 34062" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "url = \"https://github.com/opengeos/datasets/releases/download/world/world_cities.csv\"\n", - "df = pd.read_csv(url)\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "3f549610", - "metadata": {}, - "source": [ - "The DataFrame contains information about world cities, including their names, countries, populations, and geographical coordinates. We can calculate the total population of all cities in the dataset using NumPy and Pandas as follows." - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "id": "c495ef04", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "1475534501" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "np.sum(df[\"population\"])" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/book/labs/lab3b copy.ipynb b/book/labs/lab3b copy.ipynb new file mode 100644 index 0000000..0d36d83 --- /dev/null +++ b/book/labs/lab3b copy.ipynb @@ -0,0 +1,1037 @@ +{ + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# DOE Proposal\n", + "\n" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Import libraries\n", + "\n", + "Import the earthengine-api and geemap." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "c:\\Users\\C00553090\\AppData\\Local\\miniconda3\\envs\\hypercoast\\lib\\site-packages\\pandas\\core\\computation\\expressions.py:21: UserWarning: Pandas requires version '2.8.4' or newer of 'numexpr' (version '2.7.3' currently installed).\n", + " from pandas.core.computation.check import NUMEXPR_INSTALLED\n" + ] + } + ], + "source": [ + "import ee\n", + "import geemap\n", + "# import geemap.foliumap as geemap" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# This function is used to initialize the Google Earth Engine API, \n", + "# which is necessary before you start working with any Earth Engine data.\n", + "geemap.ee_initialize()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Download administrative boundaries\n", + "\n", + "Download the administrative boundaries of Rio Grande doSul, Brazil from [here](https://github.com/opengeos/datasets/releases/tag/places)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# roi\n", + "roi = ee.FeatureCollection(ee.Geometry.BBox(-82.6037, 24.4057, -79.3942, 27.4874))\n" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Longitude: -80.99894999999995, Latitude: 25.942776702401552\n" + ] + } + ], + "source": [ + "# geometry(): This method extracts the geometry from the roi object. \n", + "# In the context of Google Earth Engine, geometry represents the shapes (points, lines, polygons) that define the region.\n", + "# centroid(1): This method calculates the geometric center (centroid) of the region. \n", + "centroid = roi.geometry().centroid(1)\n", + "lon, lat = centroid.getInfo()[\"coordinates\"]\n", + "print(f\"Longitude: {lon}, Latitude: {lat}\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create an interactive map\n", + "\n", + "Specify the center point `[lat, lon]` and zoom level of the map." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "ae547ab9c95a4817af69ace0805bc291", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(center=[25.94277526766727, -80.99895], controls=(WidgetControl(options=['position', 'transparent_bg'], wid…" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# This line initializes a new interactive map using the geemap library. \n", + "# Map() is a class provided by geemap that creates an interactive map widget, which can be displayed within a Jupyter notebook or a Python environment supporting interactive widgets.\n", + "m = geemap.Map()\n", + "style = {\"fillColor\": \"00000000\", \"color\": \"FF0000\"}\n", + "# m.add_layer(...): This method adds the styled roi as a new layer to the map m. 
\n", + "# The empty dictionary {} could be used for additional layer-specific options if needed, and \"ROI\" is the name given to this layer, which will appear in the map's layer control.\n", + "m.add_layer(roi.style(**style), {}, \"ROI\")\n", + "# extract geometry of the roi object.\n", + "geom = roi.geometry()\n", + "m.center_object(geom, 6)\n", + "m" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "[-82.6037, 24.4057, -79.3942, 27.4874]" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#m.user_roi_coords()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the tutorial, we will focus on Rio Grande do Sul in Brazil, but the code can be easily modified to visualize and analyze floods in other countries. Modify the `place_name` variable to specify the place of interest and set the date range for the flood event. In order to extract the flood extent, we also need to specify the date range for the pre-flood period. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create Landsat composites\n", + "\n", + "Create a Landsat 8 composite for the pre-flood period (August 1 to September 30, 2021) using the [USGS Landsat 8 Collection 2 Tier 1 Raw Scenes](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The number of images in the pre-flood collection: 380\n" + ] + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "1d3d0a9fd8c24cae8b986f8efb2ca169", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(center=[0, 0], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=SearchDataGUI(childr…" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "place_name = \"Everglades\"\n", + "start_date = \"2018-01-01\"\n", + "end_date = \"2020-12-30\"\n", + "\n", + "m = geemap.Map()\n", + "everglades = (\n", + " ee.ImageCollection(\"NASA/HLS/HLSL30/v002\")\n", + " .filterBounds(roi)\n", + " .filterDate(start_date, end_date)\n", + " .filter(ee.Filter.lt(\"CLOUD_COVERAGE\", 15))\n", + ")\n", + "print(\n", + " f\"The number of images in the pre-flood collection: {everglades.size().getInfo()}\"\n", + ")\n", + "\n", + "vis_params = {'min': 0, 'max': 0.4, 'bands': ['B4', 'B3', 'B2']}\n", + "\n", + "m.add_layer(everglades, vis_params, \"Sentinel-2\")\n", + "m" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "b39eb518921c4263865f025b075e8284", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(bottom=14355.999064118085, center=[24.76679385905525, -79.48717924557344], controls=(WidgetControl(options…" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# This line loads the NLCD 2019 dataset from the USGS (United States Geological Survey) as an Earth Engine Image object. \n", + "# The ee.Image function creates an image object from the specified data source, in this case, the 2019 NLCD.\n", + "nlcd = ee.Image('USGS/NLCD_RELEASES/2019_REL/NLCD/2019')\n", + "# This selects the land cover classification band from the NLCD image.\n", + "landcover = nlcd.select('landcover').clip(everglades.geometry())\n", + "m.add_layer(landcover, {}, 'NLCD Landcover')\n", + "m" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Visualize the Landsat 8 composite for the pre-flood and flood periods." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compare Landsat composites side by side\n", + "\n", + "Compare the pre-flood and flood composites side by side." 
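The next code cell (and the NDWI cells further down) references `pre_flood_image` and `post_flood_image`, which are not defined anywhere above — they appear to carry over from the flood-mapping tutorial this lab is adapted from. A minimal sketch of how such composites could be built from the `everglades` HLS collection defined earlier; the date windows and the median reducer are assumptions, not part of the original lab:

```python
# Assumed "before" and "after" median composites so the comparison cells below have inputs.
# Adjust the date windows to whatever two periods you actually want to compare.
pre_flood_image = (
    everglades.filterDate("2018-01-01", "2018-12-31").median().clip(roi.geometry())
)
post_flood_image = (
    everglades.filterDate("2019-01-01", "2020-12-30").median().clip(roi.geometry())
)
```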
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "224793a8c86c41218307b60d62cf597e", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(center=[-29.77865021719509, -53.24534458841491], controls=(ZoomControl(options=['position', 'zoom_in_text'…" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "m = geemap.Map()\n", + "left_layer = geemap.ee_tile_layer(pre_flood_image, vis_params, \"Landsat Pre-flood\")\n", + "right_layer = geemap.ee_tile_layer(post_flood_image, vis_params, \"Landsat Post-flood\")\n", + "m.split_map(\n", + " left_layer,\n", + " right_layer,\n", + " left_label=\"Landsat Pre-flood\",\n", + " right_label=\"Landsat Post-flood\",\n", + ")\n", + "m.add_layer(roi.style(**style), {}, place_name)\n", + "m.center_object(roi, 6)\n", + "m" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compute Normalized Difference Water Index (NDWI)\n", + "\n", + "The [Normalized Difference Water Index](https://en.wikipedia.org/wiki/Normalized_difference_water_index) (NDWI) is a commonly used index for detecting water bodies. It is calculated as follows:\n", + "\n", + "$$NDWI = \\frac{Green - NIR}{Green + NIR}$$\n", + "\n", + "where Green is the green band and NIR is the near-infrared band. The NDWI values range from -1 to 1. The NDWI values are usually thresholded to a positive number (e.g., 0.1-0.3) to identify water bodies.\n", + "\n", + "Landsat 8 imagery has [11 spectral bands](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1#bands). The Landsat 8 NDWI is calculated using the green (`B3`) and NIR (`B5`) bands.\n", + "\n", + "![](https://i.imgur.com/yuZthc6.png)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "ndwi_pre = pre_flood_image.normalizedDifference([\"B3\", \"B5\"]).rename(\"NDWI\")\n", + "ndwi_post = post_flood_image.normalizedDifference([\"B3\", \"B5\"]).rename(\"NDWI\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Compute the NDWI layers for the pre-flood and flood periods side by side." 
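As a quick sanity check on the formula — the reflectance values below are made up for illustration, not read from the imagery above — open water reflects more green than NIR and therefore yields a positive NDWI, while vegetation is NIR-bright and comes out negative:

```python
def ndwi(green, nir):
    return (green - nir) / (green + nir)

print(ndwi(0.06, 0.02))  # water-like pixel      -> 0.5
print(ndwi(0.08, 0.30))  # vegetation-like pixel -> about -0.58
```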
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "c2992f3b630e4a8599536119fa5496f0", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(center=[-29.77865021719509, -53.24534458841491], controls=(ZoomControl(options=['position', 'zoom_in_text'…" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "m = geemap.Map()\n", + "ndwi_vis = {\"min\": -1, \"max\": 1, \"palette\": \"ndwi\"}\n", + "left_layer = geemap.ee_tile_layer(ndwi_pre, ndwi_vis, \"NDWI Pre-flood\")\n", + "right_layer = geemap.ee_tile_layer(ndwi_post, ndwi_vis, \"NDWI Post-flood\")\n", + "m.split_map(\n", + " left_layer, right_layer, left_label=\"NDWI Pre-flood\", right_label=\"NDWI Post-flood\"\n", + ")\n", + "m.add_layer(roi.style(**style), {}, place_name)\n", + "m.center_object(roi, 6)\n", + "m" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Extract Landsat water extent\n", + "\n", + "To extract the water extent, we need to convert the NDWI images to binary images using a threshold value. The threshold value is usually set to 0.1 to 0.3. The smaller the threshold value, the more water bodies will be detected, which may increase the false positive rate. The larger the threshold value, the fewer water bodies will be detected, which may increase the false negative rate." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "threshold = 0.1\n", + "# .gt(threshold): This is a method applied to the NDWI data. gt stands for \"greater than\". \n", + "# This method checks each value in ndwi_pre to see if it is greater than the threshold (0.1 in this case).\n", + "water_pre = ndwi_pre.gt(threshold)\n", + "water_post = ndwi_post.gt(threshold)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Combine the pre-flood and surface water extent side by side." 
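Because the threshold chosen above directly controls how much water gets mapped, it can be worth a quick sensitivity check before settling on a value. A rough sketch using standard Earth Engine calls on the `ndwi_post` and `roi` objects defined earlier (the 1 km scale and `maxPixels` limit are arbitrary choices):

```python
# Approximate open-water area (km^2) within the ROI at a few candidate thresholds.
for t in [0.1, 0.2, 0.3]:
    water = ndwi_post.gt(t).selfMask()
    stats = water.multiply(ee.Image.pixelArea()).divide(1e6).reduceRegion(
        reducer=ee.Reducer.sum(), geometry=roi.geometry(), scale=1000, maxPixels=1e10
    )
    print(f"threshold {t}:", stats.getInfo())  # {'NDWI': <area in km^2>}
```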
+ ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "7e7cd1bdd55b4386b205d028ebe4a659", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map(center=[-29.77865021719509, -53.24534458841491], controls=(ZoomControl(options=['position', 'zoom_in_text'…" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "m = geemap.Map()\n", + "\n", + "m.add_layer(pre_flood_image, vis_params, \"Landsat Pre-flood\", True)\n", + "m.add_layer(post_flood_image, vis_params, \"Landsat Post-flood\", True)\n", + "\n", + "left_layer = geemap.ee_tile_layer(\n", + " water_pre.selfMask(), {\"palette\": \"blue\"}, \"Water Pre-flood\"\n", + ")\n", + "right_layer = geemap.ee_tile_layer(\n", + " water_post.selfMask(), {\"palette\": \"yellow\"}, \"Water Post-flood\"\n", + ")\n", + "\n", + "m.split_map(\n", + " left_layer,\n", + " right_layer,\n", + " left_label=\"Water Pre-flood\",\n", + " right_label=\"Water Post-flood\",\n", + ")\n", + "m.add_layer(roi.style(**style), {}, place_name)\n", + "m.center_object(roi, 6)\n", + "m" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Extract Landsat flood extent\n", + "\n", + "To extract the flood extent, we need to subtract the pre-flood water extent from the flood water extent. The flood extent is the difference between the flood water extent and the pre-flood water extent. In other words, pixels identified as water in the flood period but not in the pre-flood period are considered as flooded pixels. The `selfMask()` method is used to mask out the no-data pixels." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "flood_extent = water_post.subtract(water_pre).gt(0).selfMask()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Add the flood extent layer to the map." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "m = geemap.Map()\n", + "\n", + "m.add_layer(pre_flood_image, vis_params, \"Landsat Pre-flood\", True)\n", + "m.add_layer(post_flood_image, vis_params, \"Landsat Post-flood\", True)\n", + "\n", + "left_layer = geemap.ee_tile_layer(\n", + " water_pre.selfMask(), {\"palette\": \"blue\"}, \"Water Pre-flood\"\n", + ")\n", + "right_layer = geemap.ee_tile_layer(\n", + " water_post.selfMask(), {\"palette\": \"yellow\"}, \"Water Post-flood\"\n", + ")\n", + "\n", + "m.split_map(\n", + " left_layer,\n", + " right_layer,\n", + " left_label=\"Water Pre-flood\",\n", + " right_label=\"Water Post-flood\",\n", + ")\n", + "\n", + "m.add_layer(flood_extent, {\"palette\": \"cyan\"}, \"Flood Extent\")\n", + "m.add_layer(roi.style(**style), {}, place_name)\n", + "m.center_object(roi, 6)\n", + "m" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Calculate Landsat flood area\n", + "\n", + "To calculate the flood area, we can use the [`geemap.zonal_stats()`](https://geemap.org/common/#geemap.common.zonal_stats) function. The required input parameters are the flood extent layer and the country boundary layer. 
The `scale` parameter can be set to `1000` to specify the spatial resolution of the imagery used for calculating the zonal statistics. The `stat_type` parameter can be set to `SUM` to calculate the total area of the flood extent in square kilometers. Set `return_fc=True` to return the zonal statistics as an `ee.FeatureCollection` object, which can be converted to a Pandas DataFrame."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "area_pre_flood = geemap.zonal_stats(\n",
+    "    water_pre.selfMask(), roi, scale=1000, stat_type=\"SUM\", return_fc=True\n",
+    ")\n",
+    "geemap.ee_to_df(area_pre_flood)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "area_2022 = geemap.zonal_stats(\n",
+    "    water_post.selfMask(), roi, scale=1000, stat_type=\"SUM\", return_fc=True\n",
+    ")\n",
+    "geemap.ee_to_df(area_2022)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "flood_area = geemap.zonal_stats(\n",
+    "    flood_extent.selfMask(), roi, scale=1000, stat_type=\"SUM\", return_fc=True\n",
+    ")\n",
+    "geemap.ee_to_df(flood_area)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "hypercoast",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.14"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/book/lectures/lecture12.md b/book/lectures/lecture12.md
new file mode 100644
index 0000000..359035b
--- /dev/null
+++ b/book/lectures/lecture12.md
@@ -0,0 +1,3 @@
+## Lecture12
+
+ 
\ No newline at end of file