Add inquiry functions to return nearest model data to a (lat,lon) location; Update benchmark model vs. obs scripts; Add models vs sondes output #277

Merged
merged 17 commits into from
Feb 9, 2024

Conversation

@yantosca yantosca commented Dec 5, 2023

Name and Institution (Required)

Name: Bob Yantosca
Institution: Harvard + GCST

Confirm you have reviewed the following documentation

Describe the update

This PR does the following:

  1. Adds functionality to return data (as a pandas.DataFrame object) at a given (lat,lon) location, for both lat-lon and cubed-sphere grids. These new functions have been added to `gcpy.grid`:

    • get_nearest_model_data_cs: Returns nearest model data (cubed-sphere grid) to a given (lat,lon) location
    • get_nearest_model_data_ll: Returns nearest model data (lat-lon grid) to a given (lat,lon) location
    • get_nearest_model_data: Calls either get_nearest_model_data_cs or get_nearest_model_data_ll depending on the type of grid
  2. Added fixes so that cubed-sphere inquiry functions in cstools.py work properly when passed an xarray.DataArray object instead of an xarray.Dataset object.

  3. Simplified the logic in cubed-sphere inquiry functions in cstools.py.

  4. Fixed a bug in gcpy.examples.plotting.plot_single_panel where the wrong variables were used. This was a cut-n-paste error.

  5. Renamed gcpy.benchmark.modules.benchmark_models_vs_obs.py to gcpy.benchmark.modules.benchmark_models_vs_ebas_o3.py

  6. Added gcpy.benchmark.modules.benchmark_models_vs_sondes.py

  7. Updated gcpy.benchmark.modules.benchmark_models_vs_ebas_o3.py and gcpy.benchmark.modules.benchmark_models_vs_sondes.py to use the new get_nearest_model_data function to obtain the GEOS-Chem model data closest to an observation site (lat, lon, alt) location.

  8. Updated routine get_vert_grid in gcpy/grid.py so that an optional surface pressure can be passed. This allows us to generate a GEOS-Chem vertical grid starting at a different surface pressure than the default 1013.25 hPa. This functionality is used by the models vs. sondes plot.
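To make item 1 concrete for the lat-lon case, here is a minimal sketch of a nearest-grid-box lookup that returns a pandas.DataFrame. It is not the gcpy.grid implementation: the helper name `nearest_column_ll`, its arguments, and the toy 4x5 grid are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def nearest_column_ll(field, lats, lons, lat, lon, name="O3"):
    """Hypothetical helper: return the model column nearest to (lat, lon).

    field has shape (lev, lat, lon); lats and lons are grid-box centers
    in degrees.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    i_lat = int(np.abs(lats - lat).argmin())
    # Wrap the longitude difference into [-180, 180) so that the
    # date line does not produce a spurious 360-degree gap
    dlon = (lons - lon + 180.0) % 360.0 - 180.0
    i_lon = int(np.abs(dlon).argmin())
    return pd.DataFrame({name: field[:, i_lat, i_lon]})


# Toy grid: 46 latitudes x 72 longitudes, 3 levels, filled with a counter
lats = np.arange(-90.0, 91.0, 4.0)
lons = np.arange(-180.0, 180.0, 5.0)
field = np.arange(3 * lats.size * lons.size).reshape(3, lats.size, lons.size)

df = nearest_column_ll(field, lats, lons, lat=42.4, lon=-71.1)
print(df)
```

The longitude wrap matters mainly for sites near the date line; the latitude lookup is a plain nearest-value search.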

Expected changes

This is a zero-diff update. Here is the first page of an example plot that was made with these modifications (GCC 14.2.0 vs GCHP 14.2.0):

[Image: models_vs_obs]

And here is the same plot from the 14.2.0 benchmark output:

[Image: 14 2 0]

Also, here are the model vs. sondes plots:

Reference(s)

N/A

Related Github Issue(s)

grid.py
- Add routines:
  - get_nearest_model_data_cs
  - get_nearest_model_data_ll
  - get_nearest_model_data
- Trimmed trailing whitespace
- Changed imports from e.g. "import .util" to "import gcpy.util"
- Now import find_index, is_cubed_sphere from gcpy.cstools

Signed-off-by: Bob Yantosca <[email protected]>
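For the cubed-sphere case, a one-dimensional search per axis is not enough because each face carries 2-D latitude and longitude arrays. One common approach, sketched below, is to minimize the great-circle (haversine) distance over all faces at once. This is an assumption about how such a lookup can be done, not a description of get_nearest_model_data_cs; `nearest_cs_index` and the toy coordinates are hypothetical.

```python
import numpy as np


def nearest_cs_index(cs_lats, cs_lons, lat, lon):
    """Return (face, j, i) of the box center closest to (lat, lon).

    cs_lats and cs_lons hold box-center coordinates in degrees and
    have shape (nf, Ydim, Xdim).
    """
    phi_grid = np.radians(np.asarray(cs_lats, dtype=float))
    phi_site = np.radians(lat)
    dlam = np.radians(np.asarray(cs_lons, dtype=float) - lon)
    # Haversine term: grows monotonically with great-circle distance,
    # so its argmin is the nearest point (no need to take arcsin)
    hav = (np.sin((phi_grid - phi_site) / 2.0) ** 2
           + np.cos(phi_grid) * np.cos(phi_site) * np.sin(dlam / 2.0) ** 2)
    face, j, i = np.unravel_index(np.argmin(hav), hav.shape)
    return int(face), int(j), int(i)


# Toy 6-face "grid" with 2x2 boxes per face; not a real cubed-sphere geometry
cs_lats = np.full((6, 2, 2), -60.0)
cs_lons = np.arange(24, dtype=float).reshape(6, 2, 2) * 15.0
cs_lats[3, 1, 0] = 42.0
cs_lons[3, 1, 0] = 289.0  # same meridian as -71 degrees

idx = nearest_cs_index(cs_lats, cs_lons, lat=42.4, lon=-71.1)
print(idx)
```

Because the haversine term uses the sine of half the longitude difference, longitudes given as 0..360 and as -180..180 are handled the same way.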
This merge brings the feature/update-model-vs-obs branch up-to-date
with the latest updates in dev.

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/cstools.py
- In routine "is_cubed_sphere_rst_grid":
  - Check if len(data.coords["lat"]) == len(data.coords["lon"]) * 6,
    instead of checking data.dims.  In Dataset objects, data.dims
    is a dict w/ dimension names and sizes, but in DataArray objects,
    dims is just a tuple of names.
  - Search for "SPC_" in data.name for DataArray objects, which do
    not have the data_vars dictionary
- In routine "get_cubed_sphere_res":
  - Return len(data.coords["lon"]) instead of data.dims["lon"] and
    len(data.coords["Xdim"]) instead of data.dims["Xdim"].  This will
    work if data is either a Dataset or DataArray object.

Signed-off-by: Bob Yantosca <[email protected]>
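The Dataset vs. DataArray difference described in this commit can be reproduced with a small toy object. The sketch below assumes xarray is installed; `stacked_faces` and `has_spc_name` are hypothetical stand-ins for the checks described above, not the cstools.py routines themselves.

```python
import numpy as np
import xarray as xr

# Toy restart-like layout: 6 faces stacked along "lat" (6 x 24 = 144 rows)
n_lon = 24
ds = xr.Dataset(
    {"SPC_O3": (("lat", "lon"), np.zeros((6 * n_lon, n_lon)))},
    coords={"lat": np.arange(6 * n_lon), "lon": np.arange(n_lon)},
)
da = ds["SPC_O3"]

# For a DataArray, dims is a tuple of names, so da.dims["lat"] would fail
print(type(da.dims).__name__, da.dims)


def stacked_faces(data):
    """Coordinate lengths are available on both Dataset and DataArray."""
    return len(data.coords["lat"]) == 6 * len(data.coords["lon"])


def has_spc_name(data):
    """Look for 'SPC_' in data_vars (Dataset) or in .name (DataArray)."""
    if isinstance(data, xr.Dataset):
        return any("SPC_" in str(name) for name in data.data_vars)
    return "SPC_" in str(data.name)


print(stacked_faces(ds), stacked_faces(da))
print(has_spc_name(ds), has_spc_name(da))
```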
gcpy/cstools.py
- Reduce the number of return statements by returning a boolean
  expression.  This is more in line with what Pylint would suggest.

Signed-off-by: Bob Yantosca <[email protected]>
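As a generic illustration of this refactoring pattern (the function names are hypothetical and the condition is only an example), the two versions below give the same answer, but the second has a single return statement:

```python
def is_cs_grid_verbose(n_lat, n_lon):
    """Two return statements for one yes/no answer."""
    if n_lat == 6 * n_lon:
        return True
    return False


def is_cs_grid_compact(n_lat, n_lon):
    """Same logic: return the boolean expression directly."""
    return n_lat == 6 * n_lon


for shape in [(144, 24), (46, 72)]:
    print(shape, is_cs_grid_verbose(*shape), is_cs_grid_compact(*shape))
```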
gcpy/examples/plotting/plot_single_panel.py
- Fixed a "cut-and-paste" error.  Removed calls to routine
  rename_and_flip_gchp_rst_vars for ref_ds and dev_ds, and added
  call for dset (which is correct).

Signed-off-by: Bob Yantosca <[email protected]>
This merge brings all of the updates from the GCPy 1.4.0 release
into the feature/update-model-vs-obs branch.  In this branch, we have
been refactoring the benchmark_models_vs_obs.py script to use utility
functions that can select the nearest grid box (either cubed-sphere
or lat-lon) to a given lat-lon location.

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/modules/benchmark_models_vs_obs.py
- Remove imports for find_index, is_cubed_sphere functions
- Add import for get_nearest_model_data from gcpy.grid
- Updated Pydoc headers, make them more compact
- Replace functions get_nearest_model_data_to_obs_cs and
  get_nearest_model_data_to_obs_ll with get_nearest_model_data_to_obs.
  This uses the common function get_nearest_model_data from grid.py
- Rename gc_level_alts_m to gc_levels; also add "Altitude (m)" column
- Remove function "which_finder_function", it's not needed
- Remove **kwargs argument from calls to "prepare_data_for_plot"
  and "call_single_station"
- Update label of observations to "Surface O3 (EBAS, 2019)" for clarity
- Add code updates suggested by Pylint

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca added the category: Feature Request, topic: Benchmark Plots and Tables, topic: Structural Modifications, and topic: Utilities labels Dec 5, 2023
@yantosca yantosca added this to the 1.5.0 milestone Dec 5, 2023
@yantosca yantosca self-assigned this Dec 5, 2023
github-actions bot commented Feb 3, 2024

Stale pull request message

This merge brings the "dev" branch (as of the merge with PR #295)
into the feature/update-model-vs-obs branch (which was far behind).

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/modules/benchmark_utils.py
- Moved get_geoschem_level_metadata here
- Also need to import pandas as pd

gcpy/benchmark/modules/benchmark_models_vs_obs.py
- Now import get_geoschem_level_metadata from benchmark_utils

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/modules/benchmark_models_vs_ozonesondes.py
- New script to plot model vs ozonesonde output for 1-yr benchmarks.
  Further work is still needed and will be done in subsequent commits.

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/modules/benchmark_models_vs_ozonesondes.py
- Now import additional variables & functions from gcpy.grid,
  so that we can eventually compute the p_mid coordinate from
  the surface pressure at the observation site.
- Add optional "Collection" kwarg to get_ref_and_dev_model_data
- Trimmed trailing whitespace
- Updated comments

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/grid.py
- In routine get_vert_grid:
  - Add kwarg p_sfc with default value 1013.25 hPa so that we can
    generate the vertical grid edges and centers with reference to
    a given surface pressure
  - Updated PyDoc header
  - Never-nested the if-block logic to remove elif, else statements

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
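The effect of a p_sfc argument can be sketched with the usual hybrid-grid relation, where each edge pressure is Ap + Bp * p_sfc and each level center is the mean of its two edges. The function name and the 3-level coefficients below are made up for illustration; they are not the GEOS-Chem values and this is not the get_vert_grid code.

```python
import numpy as np


def hybrid_pressure_grid(ap, bp, p_sfc=1013.25):
    """Return (p_edge, p_mid) in hPa for hybrid coefficients ap [hPa], bp [1]."""
    ap = np.asarray(ap, dtype=float)
    bp = np.asarray(bp, dtype=float)
    p_edge = ap + bp * p_sfc                   # level edges
    p_mid = 0.5 * (p_edge[:-1] + p_edge[1:])   # level centers
    return p_edge, p_mid


# Made-up coefficients for a 3-level column (illustration only)
AP = [0.0, 10.0, 50.0, 100.0]
BP = [1.0, 0.8, 0.3, 0.0]

edge_default, _ = hybrid_pressure_grid(AP, BP)
edge_site, mid_site = hybrid_pressure_grid(AP, BP, p_sfc=900.0)
print(edge_default)
print(edge_site, mid_site)
```

With a site surface pressure of 900 hPa the lowest edge starts at 900 hPa instead of 1013.25 hPa, which is what lets the model column and a sonde profile share the same starting pressure.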
…des.py

gcpy/benchmark/modules/benchmark_models_vs_ozonesondes.py
- Moved to benchmark_models_vs_sondes.py

gcpy/benchmark/modules/benchmark_models_vs_sondes.py
- Moved from benchmark_models_vs_ozonesondes.py
- Added make_benchmark_models_vs_sondes_plots routines
- Now use the updated get_vert_grid from gcpy/grid.py
- Now read a metadata file with the surface pressure at each of the
  observation stations, so that the model & data grid can start at the
  same surface pressure
- Renamed variables for consistency

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/benchmark_slurm.sh
- Now save output to a log file with the same base name as the YAML
  configuration file

gcpy/benchmark/modules/benchmark_models_vs_obs.py
- Renamed to benchmark_models_vs_ebas_o3.py

gcpy/benchmark/modules/benchmark_models_vs_ebas_o3.py
- Renamed from benchmark_models_vs_obs.py
- Also renamed driver routine to make_benchmark_vs_ebas_obs_plots

gcpy/benchmark/modules/benchmark_models_vs_sondes.py
- Removed the main function (this is for debugging only)

gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
- Now call make_benchmark_1yr_models_vs_ebas_o3 routine
- Now call make_benchmark_models_vs_sondes_plots routine

gcpy/benchmark/config/1yr_fullchem_benchmark.yml
- Expanded obs_data section to include data directories & file names
  for EBAS and sonde observations
- Removed extraneous comments

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
gcpy/benchmark/config/1yr_fullchem_benchmark.yml
- Fix error in ebas_o3:data_dir YAML tag

gcpy/benchmark/modules/benchmark_models_vs_ebas_o3.py
- Remove **kwargs from plot_models_vs_obs function call
- Restore the matplotlib plot style to "default" after plotting finishes,
  otherwise the plot style will be applied to other plots that follow

gcpy/benchmark/models_vs_sondes.py
- Add function sort_sites_by_lat to return a unique list of site names
  sorted by latitude from N to S

gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
- Fix error in sonde data file paths (needed to add ["obs_data"])
- In GCHP vs GCC model vs sondes plots, make sure to
  use config["data"]["dev"]["gchp"]["version"] as the dev label
- In GCHP vs GCHP model vs EBAS O3 plots, make sure to use
  config["data"]["ref"]["gchp"]["version"] as the ref label
- In GCHP vs GCHP model vs sondes plots, make sure to use
  config["data"]["ref"]["gchp"]["version"] and
  config["data"]["dev"]["gchp"]["version"] as ref & dev labels

Signed-off-by: Bob Yantosca <[email protected]>
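A function like sort_sites_by_lat can be written in a few lines with pandas. The sketch below is an assumption about the approach, not the code added in this commit; the column names "Site" and "Latitude" and the sample rows are hypothetical.

```python
import pandas as pd


def sort_sites_by_lat_sketch(obs, site_col="Site", lat_col="Latitude"):
    """Return unique site names ordered from north to south."""
    # Keep one row per site, then sort by descending latitude
    sites = obs[[site_col, lat_col]].drop_duplicates(subset=site_col)
    sites = sites.sort_values(lat_col, ascending=False)
    return sites[site_col].tolist()


# Sample observations with repeated sites (values are illustrative)
obs = pd.DataFrame({
    "Site": ["Boulder", "Alert", "Boulder", "Samoa", "Alert"],
    "Latitude": [40.0, 82.5, 40.0, -14.2, 82.5],
    "O3": [45.0, 30.0, 47.0, 20.0, 31.0],
})

site_names = sort_sites_by_lat_sketch(obs)
print(site_names)
```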
@yantosca yantosca changed the title from "Add inquiry functions to return nearest model data to a (lat,lon) location; Update benchmark model vs. obs script accordingly" to "Add inquiry functions to return nearest model data to a (lat,lon) location; Update benchmark model vs. obs scripts; Add models vs sondes output" Feb 8, 2024
@yantosca yantosca requested review from lizziel and removed request for msulprizio February 8, 2024 19:25
CHANGELOG.md
- Removed ">>>>>>> dev", this was left over from a git merge

Signed-off-by: Bob Yantosca <[email protected]>
@@ -727,7 +562,7 @@ def plot_single_station(
 marker='^',
 markersize=4,
 lw=1,
-label='Observations'
+label='Surface O3 (EBAS, 2019)'
@lizziel lizziel Feb 8, 2024

This is a noteworthy change since the plots look different (different legend). How frequently will this source change? Do you think it should be set in the config file rather than in this file?

Contributor Author

@lizziel, we could set it in the config file. The only catch is that the code we received is fairly specific to this type of data, although all we would have to do to make it more general is add another reading routine for a different type of data.

Contributor Author

Now resolved in 567342f

@lizziel lizziel left a comment

Looks good. My only comments are about the switch from generic observation to specifying the surface ozone source. I wonder if this limits its usability for comparing to other surface ozone sources?

# Plot models vs. observations (O3 for now)
mvo.make_benchmark_models_vs_obs_plots(
config["paths"]["obs_data_dir"],
# Plot models vs. EBAS O3 observations
Contributor

If we had surface ozone from a different source, would it be hard to change this?

Contributor Author

@lizziel: It depends on how different the data is from the data that we have. I can change the name to benchmark_model_vs_sfc_obs.py to make it more general, etc.

Contributor Author

@lizziel, As of commit 567342f, we now specify the model vs. obs label in the config file.

paths:
  main_dir: /n/holyscratch01/jacob_lab/ryantosca/BM/1yr
  results_dir: /n/holyscratch01/jacob_lab/ryantosca/BM/1yr/BenchmarkResults
  weights_dir: /n/holyscratch01/external_repos/GEOS-CHEM/gcgrid/data/ExtData/GCHP/RegriddingWeights
  spcdb_dir: default
  #
  # Observational data dirs are on Harvard Cannon, edit if necessary
  #
  obs_data:
    ebas_o3:
      data_dir: /n/jacob_lab/Lab/obs_data_for_bmk/ebas_sfc_o3_2019
      data_label: "O3 (EBAS, 2019)"
    sondes:
      data_dir: /n/jacob_lab/Lab/obs_data_for_bmk/sondes_2010-2019
      data_file: allozonesondes_2010-2019.csv
      site_file: allozonesondes_site_elev.csv
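
As a rough sketch of how a benchmark script might read these settings, the snippet below parses the obs_data fragment with PyYAML and builds the sonde file path. It assumes the nesting shown in the fragment above (obs_data under paths); the fallback label "Observations" is a hypothetical default, not something the benchmark code is known to use.

```python
import os

import yaml

# Fragment copied from the config above (directories trimmed to obs_data)
CONFIG_TEXT = """
paths:
  obs_data:
    ebas_o3:
      data_dir: /n/jacob_lab/Lab/obs_data_for_bmk/ebas_sfc_o3_2019
      data_label: "O3 (EBAS, 2019)"
    sondes:
      data_dir: /n/jacob_lab/Lab/obs_data_for_bmk/sondes_2010-2019
      data_file: allozonesondes_2010-2019.csv
      site_file: allozonesondes_site_elev.csv
"""

config = yaml.safe_load(CONFIG_TEXT)
obs = config["paths"]["obs_data"]

# Hypothetical fallback if data_label is missing from the YAML file
label = obs["ebas_o3"].get("data_label", "Observations")

# Full path to the sonde observations file
sonde_file = os.path.join(obs["sondes"]["data_dir"], obs["sondes"]["data_file"])

print(label)
print(sonde_file)
```

Keeping the label in the YAML file means a different surface ozone dataset only needs a config change for the plot title, although a new reading routine would still be required for a different file format.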

@yantosca yantosca requested a review from lizziel February 8, 2024 21:05
gcpy/benchmark/config/1yr_fullchem_benchmark.yml
- Add obs_data.ebas_o3.data_label YAML tag.  This specifies the top-of-page
  plot title in the model vs. obs plots.

gcpy/benchmark/modules/benchmark_models_vs_ebas_o3.py
- Renamed to benchmark_models_vs_obs.py

gcpy/benchmark/modules/benchmark_models_vs_obs.py
- Renamed from benchmark_models_vs_ebas_o3.py, in order to make this
  script more generally applicable.
- The main driver routine is now make_benchmark_models_vs_obs_plots
- Now pass obs_data_label to relevant routines for the top-of-page
  plot titles
- Update Pydoc header style to be less verbose

gcpy/benchmark/modules/benchmark_models_vs_sondes.py
- Updated Pydoc comments for consistency

gcpy/benchmark/modules/run_1yr_fullchem_benchmark.py
- Now call make_benchmark_models_vs_obs_plots
- Pass the config["obs_data"]["ebas_o3"]["data_label"] to
  make_benchmark_models_vs_obs_plots
- Remove references to EBAS O3 in comments

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
@lizziel lizziel left a comment

Good to merge.

yantosca commented Feb 9, 2024

Thanks @lizziel!

@yantosca yantosca merged commit 811760d into dev Feb 9, 2024
14 of 15 checks passed
@yantosca yantosca deleted the feature/update-model-vs-obs branch February 12, 2024 15:46
Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Add tool for quickly obtaining I,J indices when given lat,lon values