Use centroids for GCM query #225

lindsayplatt
Contributor

Change the GCM query to work with centroids and use the nice geoknife helper result() instead of download(), which avoids a bunch of parsing work. This currently only works with one variable, but I have an issue here to help resolve that.

@lindsayplatt
Contributor Author

@hcorson-dosch here is the start of using centroids to query. This should help you do the next part, which is smartly splitting/grouping and using a constructed grid to query grid centroids, not lakes.
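
To make the "query grid centroids, not lakes" idea concrete, here is a minimal base R sketch of grouping lake centroids by the grid cell they fall in and then deriving one centroid per occupied cell. Everything in it is an assumption for illustration: the site_id values, the coordinates, cell_size, and origin are made up, and the real GCM grid is likely defined in a projected coordinate system with its own resolution. It is not the pipeline's implementation.

```r
# Hypothetical lake centroids; coordinates are made up for illustration
lakes <- data.frame(
  site_id = c("lake_a", "lake_b", "lake_c"),
  x = c(0.2, 0.7, 1.4),
  y = c(0.3, 0.6, 0.1),
  stringsAsFactors = FALSE
)

cell_size <- 1              # assumed grid resolution, same units as x and y
origin <- c(x = 0, y = 0)   # assumed lower-left corner of the grid

# Assign each lake to the column/row of the grid cell that contains it
lakes$cell_col <- floor((lakes$x - origin[["x"]]) / cell_size)
lakes$cell_row <- floor((lakes$y - origin[["y"]]) / cell_size)
lakes$cell_id <- paste(lakes$cell_col, lakes$cell_row, sep = "_")

# Keep one record per occupied cell and compute that cell's centroid
cells <- unique(lakes[, c("cell_id", "cell_col", "cell_row")])
cells$x <- origin[["x"]] + (cells$cell_col + 0.5) * cell_size
cells$y <- origin[["y"]] + (cells$cell_row + 0.5) * cell_size
```

With this layout, only the rows of cells would be sent to the query, and lakes$cell_id is the key for mapping the returned values back to individual lakes.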

@jordansread

Quick follow-up on the geoknife parsing issue. It seems the issue resides with just one of the variables you are using, which is soil moisture (mrso). That variable is funky because it has multiple "levels", which cause some of the assumptions in the parser to fail, as captured by Dave here.

gcm_job <- geoknife(
    stencil = simplegeom(query_pts),
    fabric = webdata(
        url = "https://cida.usgs.gov/thredds/dodsC/notaro_GFDL_1980_1999",
        variables = c("evspsbl", "hfss"),
        times = c('1999-01-01', '1999-01-15')
    ),
    wait = TRUE
)
result(gcm_job)
               DateTime         1         2             3         4         5 variable statistic
1   1999-01-01 00:00:00  5.40e-06  6.20e-06  8.000000e-06  1.20e-05  7.00e-06  evspsbl      MEAN
2   1999-01-01 01:00:00  3.90e-06  5.60e-06  7.300000e-06  1.25e-05  6.40e-06  evspsbl      MEAN
3   1999-01-01 02:00:00  3.20e-06  7.40e-06  5.700000e-06  6.80e-06  5.70e-06  evspsbl      MEAN
4   1999-01-01 03:00:00  8.30e-06  6.00e-06  4.200000e-06  1.80e-06  4.10e-06  evspsbl      MEAN
5   1999-01-01 04:00:00  5.10e-06 -1.00e-07  3.100000e-06  3.70e-06  3.00e-06  evspsbl      MEAN
6   1999-01-01 05:00:00  0.00e+00  1.90e-06  3.000000e-06  3.00e-06 -5.00e-07  evspsbl      MEAN
...

works, even though you are using more than one variable in this case. Since mrso isn't a variable used by our models, you can skip it. (We should also touch base on the variables necessary for the driver data, since not all variables in the Notaro dataset are used. They are listed here, but we renamed them for readability back then...).
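
For downstream munging, the wide table printed above (one column per query point, plus DateTime, variable, and statistic) can be stacked into a long format. The sketch below is a hedged illustration in base R only: it rebuilds the first two rows and first two point columns of the printed output by hand rather than calling geoknife, and the names res, res_long, and point_id are hypothetical.

```r
# Hand-built copy of the first two rows and two point columns shown above
res <- data.frame(
  DateTime = as.POSIXct(c("1999-01-01 00:00:00", "1999-01-01 01:00:00"), tz = "UTC"),
  `1` = c(5.4e-06, 3.9e-06),
  `2` = c(6.2e-06, 5.6e-06),
  variable = "evspsbl",
  statistic = "MEAN",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column that is not metadata is treated as a per-point value column
id_cols <- c("DateTime", "variable", "statistic")
pt_cols <- setdiff(names(res), id_cols)

# Stack the point columns: one row per time step, point, and variable
res_long <- do.call(rbind, lapply(pt_cols, function(pt) {
  data.frame(
    DateTime = res$DateTime,
    point_id = pt,
    variable = res$variable,
    value = res[[pt]],
    stringsAsFactors = FALSE
  )
}))
```

The long layout keeps multiple variables in a single variable column, which matches how the thread later checks the output with unique(x$variable).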

_targets.R Outdated
query_polys_sf,
centroids_to_poly(query_centroids_sf)
),

# Convert the sf polygons into a geoknife-ready format

Just looks like you missed updating this comment here to refer to the centroids, rather than the lake polygons

But then again we'll be tweaking this to instead use the grid cell centroids anyhow, so will change shortly

@@ -24,29 +25,24 @@ targets_list <- list(
split_lake_centroids(centroids_sf_rds)
),

# Convert the lake centroids to some kind of polygon for querying GDP
tar_target(
query_polys_sf,

Looks like this target is still used to build our test query_map_png below, FYI

Contributor

@hcorson-dosch-usgs hcorson-dosch-usgs left a comment

This all looks good to me, Lindsay. Wasn't sure if you wanted to test w/ mrso dropped to see if the result() parser works w/ multiple variables if that one isn't included, before we merge this in?

@lindsayplatt
Contributor Author

@hcorson-dosch made those changes and added hfss back in - it does work!

library(targets)
tar_make(gcm_data_raw_feather)
tar_load(gcm_data_raw_feather)
x <- arrow::read_feather(gcm_data_raw_feather)
unique(x$variable)

[1] "evspsbl" "hfss"  

@lindsayplatt lindsayplatt merged commit db2770e into DOI-USGS:gcm_driver_data_munge_pipeline Nov 17, 2021