Add the down_scale function to pecan. #3211

JoshuaPloshay · 2023-08-10T23:19:31Z

Description

Motivation and Context

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.

mdietze · 2023-08-11T12:19:43Z

modules/assim.sequential/R/downscale_function.R

+##' @author Joshua Ploshay
+##'
+##' @param data  In quotes, file path for .rds containing ensemble data.
+##' @param focus_year In quotes, if SDA site run, format is yyyy/mm/dd, if NEON, yyyy-mm-dd. Restricted to years within file supplied to 'data'.


Why not have this passed as a Date rather than a string? Also, if you have to provide month and day (which makes sense given potential future applications), then why is the variable called focus_year instead of focus_date or just date? Also, if you are working with annual data, do you need to know the exact date in the product or just the year (e.g. 2020-01-01 vs 2020-07-31)?
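
One way to act on this suggestion is sketched below in base R. The helper names `parse_focus_date` and `match_year` and the toy vector `available` are hypothetical and not part of the PR; the two accepted string formats are the ones listed in the `focus_year` documentation above.

```r
# Hypothetical helper (not in the PR): accept a Date, or a string in either
# yyyy-mm-dd or yyyy/mm/dd form, and always return a Date.
parse_focus_date <- function(date) {
  if (inherits(date, "Date")) {
    return(date)
  }
  as.Date(date, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"))
}

# Hypothetical helper: match on year only, which is enough for annual products.
match_year <- function(available, date) {
  available_years <- as.integer(format(parse_focus_date(available), "%Y"))
  target_year <- as.integer(format(parse_focus_date(date), "%Y"))
  which(available_years == target_year)
}

# Toy stand-in for names(input_data); real files may differ.
available <- c("2019/07/15", "2020/07/15", "2021/07/15")
idx <- match_year(available, as.Date("2020-01-01"))
```

With this approach the caller could pass `as.Date("2020-01-01")` or `"2020/07/31"` and get the same 2020 layer, so the exact day in the product would not matter for annual data.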

mdietze · 2023-08-11T12:21:30Z

modules/assim.sequential/R/downscale_function.R

+##' @param data  In quotes, file path for .rds containing ensemble data.
+##' @param focus_year In quotes, if SDA site run, format is yyyy/mm/dd, if NEON, yyyy-mm-dd. Restricted to years within file supplied to 'data'.
+##' @param C_pool In quotes, carbon pool of interest. Name must match carbon pool name found within file supplied to 'data'.
+##' @param covariates: In quotes, file path of SpatRaster stack, used as predictors in randomForest. Layers within stack should be named.


Where are the scripts for downloading these layers and processing them into a stack? Those should be part of the repo too (e.g. in the inst folder), and thus part of the PR, and your function documentation should point users to them.

Also, if you organized all the covariates into a single stack, why not have the user pass in the stack rather than a file path?

mdietze · 2023-08-11T12:26:25Z

modules/assim.sequential/R/downscale_function.R

+##' @return It returns the `downscale_output` list containing lists for the training and testing data sets, models, and predicted maps for each ensemble member.
+
+
+NA_downscale <- function(data, cords, covariates, focus_year, C_pool){


It is generally best to document variables in the order they are used in the function.

mdietze · 2023-08-11T15:08:04Z

modules/assim.sequential/R/downscale_function.R

+
+  # Extract the carbon data for the specified focus year
+  index <- which(names(input_data) == focus_year)
+  data <- input_data[[index]]


Poor choice of variable names. You earlier defined data as the file path to the SDA object. Now you're redefining it as a temporal subset of that SDA object. One of these two uses needs to change.

mdietze · 2023-08-11T15:23:42Z

modules/assim.sequential/R/downscale_function.R

+  # Rename the training and testing data frames for each ensemble member
+  for (i in 1:length(ensembles)) {
+    # names(ensembles) <- paste0("ensemble",seq(1:length(ensembles)))
+    names(ensembles[[i]]) <- c("training", "testing")


Seems like you should have been able to do this during the split rather than after.
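
A minimal sketch of naming the pieces at split time, so the renaming loop is not needed. `split_train_test`, the 75/25 split fraction, and the toy data frame are assumptions for illustration; the PR's actual splitting code is not shown in this thread.

```r
# Hypothetical split helper: the returned list is named when it is created.
split_train_test <- function(df, train_frac = 0.75) {
  n_train <- floor(train_frac * nrow(df))
  train_rows <- sample(nrow(df), n_train)
  list(
    training = df[train_rows, , drop = FALSE],
    testing  = df[-train_rows, , drop = FALSE]
  )
}

set.seed(1)
toy <- data.frame(carbon_data = rnorm(20), tavg = rnorm(20))

# One split per ensemble member; three members purely for illustration.
ensembles <- lapply(1:3, function(i) split_train_test(toy))
```

Each element of `ensembles` then already carries the names `training` and `testing`.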

mdietze · 2023-08-11T15:27:32Z

modules/assim.sequential/R/downscale_function.R

+  output <- list()
+  for (i in 1:length(ensembles)) {
+    output[[i]] <- randomForest::randomForest(ensembles[[i]][[1]][["carbon_data"]] ~ land_cover+tavg+prec+srad+vapr+nitrogen+phh2o+soc+sand,
+                                data = ensembles[[i]][[1]],


In this line and above, what is the second dimension of ensembles, and why is it hard-coded to [[1]]? Could you reference this by name instead (e.g. ensembles[[i]][["train"]])? Also, if you are specifying the data argument, why is the y in the randomForest model ensembles[[i]][[1]][["carbon_data"]] instead of just carbon_data?
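
The point about the `data` argument can be sketched as follows. To keep the example free of extra packages, base `lm()` stands in for `randomForest::randomForest()`; it is a different model, used only because it resolves formula variables in `data` in the same way. The toy data and the two-predictor list are assumptions.

```r
set.seed(42)
n <- 40
train <- data.frame(tavg = rnorm(n), prec = rnorm(n))
train$carbon_data <- 2 * train$tavg - train$prec + rnorm(n, sd = 0.1)

# Toy ensemble list with named elements instead of positional [[1]] / [[2]].
ensembles <- list(list(training = train, testing = train[1:5, ]))

# Build the formula from column names; the response is looked up in `data`.
predictors <- c("tavg", "prec")
rf_formula <- reformulate(predictors, response = "carbon_data")

# lm() is a stand-in for the random forest call.
fit <- lm(rf_formula, data = ensembles[[1]][["training"]])
```

Because the response is named in the formula and found in `data`, swapping the training set only requires changing the `data` argument.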

mdietze · 2023-08-11T15:29:19Z

modules/assim.sequential/R/downscale_function.R

+                                importance = T)
+  }
+
+  # Generate predictions (maps) for each ensemble member using the trained models


Seems like using the 'test' data to validate the models should come before making spatial predictions.
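
A sketch of what a validation step on the held-out data might look like before any maps are produced. Again `lm()` is only a stand-in for the random forest, and the data, the split sizes, and the choice of RMSE and R-squared as metrics are assumptions rather than anything specified in the PR.

```r
set.seed(7)
df <- data.frame(tavg = rnorm(60))
df$carbon_data <- 3 * df$tavg + rnorm(60, sd = 0.2)
member <- list(training = df[1:45, ], testing = df[46:60, ])

# Fit on the training rows only (lm() stands in for the random forest).
fit <- lm(carbon_data ~ tavg, data = member$training)

# Score the model on rows it has not seen.
obs <- member$testing$carbon_data
pred <- predict(fit, newdata = member$testing)
rmse <- sqrt(mean((obs - pred)^2))
r2 <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
```

Metrics like these could be stored alongside each model so that poorly performing ensemble members are visible before spatial prediction.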

mdietze · 2023-08-11T15:30:09Z

modules/assim.sequential/R/downscale_function.R

+
+  # Train a random forest model for each ensemble member using the training data
+  output <- list()
+  for (i in 1:length(ensembles)) {


Depending on how long this step takes, you might consider running it in parallel.

mdietze · 2023-08-11T15:30:27Z

modules/assim.sequential/R/downscale_function.R

+
+  # Generate predictions (maps) for each ensemble member using the trained models
+  maps <- list(ncol(output))
+  for (i in 1:length(output)) {


Depending on how long this step takes, you might consider running it in parallel.
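
For the per-member loops, one possible approach uses the `parallel` package that ships with R. This is a sketch under assumptions: `fit_member` and `make_member` are hypothetical, `lm()` stands in for the random forest, and the core count is capped at two for illustration. `mclapply()` relies on forking, so the sketch falls back to one core on Windows.

```r
library(parallel)

# Hypothetical per-member fitting function (lm() stands in for randomForest).
fit_member <- function(member) {
  lm(carbon_data ~ tavg, data = member$training)
}

# Toy ensemble members with named training/testing elements.
set.seed(3)
make_member <- function() {
  df <- data.frame(tavg = rnorm(30))
  df$carbon_data <- df$tavg + rnorm(30, sd = 0.1)
  list(training = df[1:20, ], testing = df[21:30, ])
}
ensembles <- replicate(4, make_member(), simplify = FALSE)

# Forking is unavailable on Windows, and detectCores() can return NA.
cores <- detectCores()
n_cores <- if (.Platform$OS.type == "windows" || is.na(cores)) 1L else min(2L, cores)

# Same result as a for loop filling a list, one model per ensemble member.
models <- mclapply(ensembles, fit_member, mc.cores = n_cores)
```

The map-prediction loop could follow the same pattern, though raster objects may need extra care when passed between processes.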

mdietze · 2023-08-11T16:54:18Z

Build is failing because of the following:

R check of modules/assim.sequential reports the following new problems. Please fix these and resubmit:
  checking dependencies in R code ... WARNING
  '::' or ':::' imports not declared from:
    ‘randomForest’ ‘readr’ ‘terra’

I think you should modify the code to eliminate the readr dependency and then add randomForest and terra under "Suggests" in the assim.sequential DESCRIPTION file. You will probably also need to run scripts/generate_dependencies.R and commit the updated docker/depends/pecan.depends.R.
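
On the code side, the two changes might look like the sketch below. It assumes readr was used only to read the .rds input, which this thread does not confirm. `check_suggested` is a hypothetical helper, and the toy list and `toy_pool` column are made up; packages listed under Suggests are typically checked with `requireNamespace()` before use.

```r
# Hypothetical guard for packages that are only listed under Suggests.
check_suggested <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent <- pkgs[!found]
  if (length(absent) > 0) {
    stop("Please install: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# Base readRDS() reads .rds files without needing readr.
tmp <- tempfile(fileext = ".rds")
saveRDS(list(`2020/07/15` = data.frame(toy_pool = 1:3)), tmp)
input_data <- readRDS(tmp)
```

Inside the function, a call such as `check_suggested(c("randomForest", "terra"))` at the top would give users a clear message when an optional package is missing.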

mdietze · 2023-09-03T01:14:24Z

@JoshuaPloshay just wanted to re-ping you on this before your semester gets busy

sambhavnoobcoder · 2024-05-21T11:50:04Z

Hi @JoshuaPloshay, I was looking into this PR and was wondering if you could share where and how the data and cords variables are generated or fetched from? I wanted to run downscale_function.R but can't work out the source of the variables and parameters used in the function.

mdietze · 2024-05-21T12:40:10Z

@sambhavnoobcoder as we discussed, the covariates come from the covariates.R file, which is in the open PR #3272. What I'd asked you to do wasn't to post a vague comment here, but to go to the open PR and comment on the specific lines in covariates.R that are unclear. A number of the covariates are already well documented there, and it's not fair to ask open-ended questions here without first reading that code. We've also discussed from the very start of the project that the initial test data should be the output from the Dokoohaki et al. 2022 paper, which is posted on OSF: https://osf.io/efcv5/

Add the down_scale function to pecan.

0838e95

mdietze requested changes Aug 11, 2023

mdietze mentioned this pull request Nov 1, 2023

removed rgdal from data.land and data.remote #3229

Merged

13 tasks

mdietze and others added 7 commits February 22, 2024 08:23

Merge branch 'develop' into develop

aa5a417

Merge branch 'PecanProject:develop' into develop

93afde7

updated documentation to include suggested packages

867b88a

Merge branch 'PecanProject:develop' into develop

abfd7c1

merge from develop branch

55a8f66

Merge branch 'develop' of https://github.com/PecanProject/pecan into develop # Conflicts: # docker/depends/pecan.depends.R

resolve conflicts and update dependencies

5ae3036

Merge branch 'develop' of https://github.com/JoshuaPloshay/pecan into…

391fd05

… develop

mdietze reviewed Mar 5, 2024

modules/assim.sequential/R/downscale_function.R (outdated, resolved)

mdietze added 3 commits March 5, 2024 09:35

Update modules/assim.sequential/R/downscale_function.R

fe0aefe

Update modules/assim.sequential/R/downscale_function.R

2e3d832

Merge branch 'develop' into develop

b68a95c

mdietze reviewed Mar 7, 2024

modules/assim.sequential/R/downscale_function.R (outdated, resolved)

modules/assim.sequential/man/NA_downscale.Rd (outdated, resolved)

modules/assim.sequential/man/NA_downscale.Rd (outdated, resolved)

mdietze added 4 commits March 7, 2024 11:07

Update modules/assim.sequential/R/downscale_function.R

1daf41b

Update modules/assim.sequential/man/NA_downscale.Rd

4d032e8

Update modules/assim.sequential/man/NA_downscale.Rd

e98c461

Merge branch 'develop' into develop

beb21fd

mdietze approved these changes Mar 7, 2024

mdietze added this pull request to the merge queue Mar 7, 2024

Merged via the queue into PecanProject:develop with commit 4564337 Mar 7, 2024
12 checks passed
