Introduce path and deprecate hub_con and config_tasks as args to submission_tmpl #196

Open
wants to merge 23 commits into
base: main
d950730
Deprecate hub_con and config_tasks arg in submission_tmpl in favour o…
annakrystalli Jan 29, 2025
8155a2d
Update tests to reflect deprecation of submission_tmpl args
annakrystalli Jan 29, 2025
cff0bfc
Reorganise examples
annakrystalli Jan 29, 2025
41ce2da
Update NEWS
annakrystalli Jan 29, 2025
f0f90a8
Appease lintr
annakrystalli Jan 29, 2025
bc4c048
Change hub_path arg to path and allow for path to config file to be p…
annakrystalli Jan 31, 2025
ef53dc9
Update submission_tmpl examples
annakrystalli Jan 31, 2025
0727235
Remove mention of hub connection "config_tasks" attribute from modell…
annakrystalli Jan 31, 2025
39828ba
Document
annakrystalli Jan 31, 2025
3c7d9e8
Reorganise tests, add more specific tests.
annakrystalli Jan 31, 2025
68d4e41
remove duplicate check on whether path file/directory exists
annakrystalli Jan 31, 2025
8f3e750
Appease the lintr!
annakrystalli Jan 31, 2025
4467ee6
Update NEWS
annakrystalli Feb 4, 2025
338b92b
Add caller environment to switch_get_config error msg
annakrystalli Feb 4, 2025
c35227d
Add comment
annakrystalli Feb 4, 2025
f23c09f
Add support for URLs and S3 SubTreeFleSystem objects as inputs to the…
annakrystalli Feb 6, 2025
68bae8c
Fix example error
annakrystalli Feb 6, 2025
2a8c3a8
Update NEWS
annakrystalli Feb 10, 2025
0aa7e99
Simplify hub vs file switch while catching invalid GitHub URLs
annakrystalli Feb 10, 2025
461cc5d
Update check to point to an existing file that is interpretted as folder
annakrystalli Feb 10, 2025
05b61e6
Add TODO note in test
annakrystalli Feb 10, 2025
8805841
typos
annakrystalli Feb 10, 2025
1295c9c
test commit to test verification
annakrystalli Feb 10, 2025
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -41,7 +41,7 @@ Imports:
gh,
hubAdmin (>= 1.4.0),
hubData (>= 1.3.0),
hubUtils (>= 0.3.0),
hubUtils (>= 0.4.0.9000),
jsonlite,
jsonvalidate,
lifecycle,
1 change: 1 addition & 0 deletions NAMESPACE
@@ -85,5 +85,6 @@ importFrom(hubUtils,get_hub_model_output_dir)
importFrom(hubUtils,get_hub_timezone)
importFrom(hubUtils,read_config)
importFrom(hubUtils,read_config_file)
importFrom(lifecycle,deprecated)
importFrom(lubridate,"%within%")
importFrom(rlang,"!!!")
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,5 +1,10 @@
# hubValidations (development version)

* Introduced `path` as main argument to `submission_tmpl()` and deprecated arguments `hub_con` and `config_tasks` (#165 & #137). This way, all the user needs to create a submission template is the path to a hub directory or `tasks.json` config file. We also added functionality to enable sourcing config files from a
cloud hub or directly from GitHub, which means users do not require a local copy of the hub to create a submission template.
* `submission_tmpl()` gains `path` as the main argument, which can take a path to a
local hub or config file, an S3 connection, or a URL to a hub or tasks configuration file. This allows users to create submission templates without downloading a local copy of the entire hub. The arguments `hub_con` and `config_tasks` are deprecated and will be removed in a future version (#165 & #137).
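
For readers unfamiliar with soft deprecation in R, the behaviour this entry describes (old arguments keep working but warn, and exactly one input is accepted) can be sketched in base R. This is a minimal illustration only: `resolve_input()` is a hypothetical stand-in, not a function in hubValidations, and the package itself relies on `lifecycle` and `rlang` rather than the base-R calls used here.

```r
# Hypothetical stand-in for the argument handling in submission_tmpl():
# `path` is the supported input, while `hub_con` and `config_tasks`
# are still accepted but trigger a deprecation warning.
resolve_input <- function(path, hub_con, config_tasks) {
  # Record which of the three mutually exclusive arguments were supplied
  supplied <- c(
    path = !missing(path),
    hub_con = !missing(hub_con),
    config_tasks = !missing(config_tasks)
  )
  if (sum(supplied) != 1L) {
    stop("Exactly one of `path`, `hub_con` or `config_tasks` must be supplied.")
  }
  arg <- names(supplied)[supplied]
  if (arg != "path") {
    warning("`", arg, "` is deprecated; use `path` instead.", call. = FALSE)
  }
  arg
}

# Supported usage: no warning is raised
resolve_input(path = "path/to/hub")

# Deprecated usage: capture the warning message instead of printing it
tryCatch(
  resolve_input(hub_con = list()),
  warning = function(w) conditionMessage(w)
)
```

Keeping the old arguments in the signature lets existing scripts continue to run during the deprecation period while pointing users at `path`.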

# hubValidations 0.10.1

* `check_tbl_value_col_ascending()` will now use the order of the
196 changes: 143 additions & 53 deletions R/submission_tmpl.R
@@ -1,6 +1,21 @@
#' Create a model output submission file template
#'
#' @param hub_con A `⁠<hub_connection`>⁠ class object.
#' @param path Character string or `<SubTreeFileSystem>` class object. Can be one of:
#' - a path to a local fully configured hub directory.
#' - a path to a local `tasks.json` file.
#' - a URL to the repository of a fully configured hub on GitHub.
#' - a URL to the **raw contents** of a `tasks.json` file on GitHub.
#' - a `<SubTreeFileSystem>` class object pointing to the root of an S3 cloud hub.
#' - a `<SubTreeFileSystem>` class object pointing to a `tasks.json` config file in
#' an S3 cloud hub, relative to the hub's root directory.
#'
#' See examples for more details.
#' @param hub_con `r lifecycle::badge("deprecated")` Use `path` instead. A
#' `<hub_connection>` class object.
#' @param config_tasks `r lifecycle::badge("deprecated")` Use `path` instead.
#' A list version of the contents of a hub's `tasks.json` config file,
#' accessed through the `"config_tasks"` attribute of a `<hub_connection>`
#' object or function [read_config()].
#' @inheritParams expand_model_out_grid
#' @param derived_task_ids Character vector of derived task ID names (task IDs whose
#' values depend on other task IDs) to ignore. Columns for such task ids will
@@ -49,107 +64,128 @@
#' specified in `round_id` property of `config_tasks`) is set to the value of the
#' `round_id` argument in the returned output.
#' @export
#' @importFrom lifecycle deprecated
#'
#' @examples
#' hub_con <- hubData::connect_hub(
#' system.file("testhubs/flusight", package = "hubUtils")
#' )
#' submission_tmpl(hub_con, round_id = "2023-01-02")
#' hub_path <- system.file("testhubs/flusight", package = "hubUtils")
#' submission_tmpl(hub_path, round_id = "2023-01-02")
#' # Return required values only
#' submission_tmpl(
#' hub_con,
#' hub_path,
#' round_id = "2023-01-02",
#' required_vals_only = TRUE
#' )
#' submission_tmpl(
#' hub_con,
#' hub_path,
#' round_id = "2023-01-02",
#' required_vals_only = TRUE,
#' complete_cases_only = FALSE
#' )
#' # Specifying a round in a hub with multiple rounds
#' hub_con <- hubData::connect_hub(
#' system.file("testhubs/simple", package = "hubUtils")
#' )
#' submission_tmpl(hub_con, round_id = "2022-10-01")
#' submission_tmpl(hub_con, round_id = "2022-10-29")
#' submission_tmpl(hub_con,
#' round_id = "2022-10-29",
#' required_vals_only = TRUE
#' )
#' submission_tmpl(hub_con,
#' round_id = "2022-10-29",
#' required_vals_only = TRUE,
#' complete_cases_only = FALSE
#' # Specify a round in a hub with multiple rounds
#' hub_path <- system.file("testhubs/simple", package = "hubUtils")
#' submission_tmpl(hub_path, round_id = "2022-10-01")
#' submission_tmpl(hub_path, round_id = "2022-10-29")
#' # Subset for a specific output type
#' hub_path <- system.file("testhubs", "samples", package = "hubValidations")
#' submission_tmpl(
#' hub_path,
#' round_id = "2022-12-17",
#' output_types = "sample"
#' )
#' # Hub with sample output type
#' config_tasks <- read_config_file(system.file("config", "tasks.json",
#' # Create a template from the path to a tasks config file
#' config_path <- system.file("config", "tasks.json",
#' package = "hubValidations"
#' ))
#' )
#' submission_tmpl(
#' config_tasks = config_tasks,
#' config_path,
#' round_id = "2022-12-26"
#' )
#' # Hub with sample output type and compound task ID structure
#' config_tasks <- read_config_file(system.file("config", "tasks-comp-tid.json",
#' config_path <- system.file("config", "tasks-comp-tid.json",
#' package = "hubValidations"
#' ))
#' )
#' submission_tmpl(
#' config_tasks = config_tasks,
#' round_id = "2022-12-26"
#' config_path,
#' round_id = "2022-12-26",
#' output_types = "sample"
#' )
#' # Override config compound task ID set
#' # Create coarser compound task ID set for the first modeling task which contains
#' # samples
#' submission_tmpl(
#' config_tasks = config_tasks,
#' config_path,
#' round_id = "2022-12-26",
#' output_types = "sample",
#' compound_taskid_set = list(
#' c("forecast_date", "target"),
#' NULL
#' )
#' )
#' # Subsetting for a single output type
#' submission_tmpl(
#' config_tasks = config_tasks,
#' round_id = "2022-12-26",
#' output_types = "sample"
#' )
#' # Derive a template with ignored derived task ID. Useful to avoid creating
#' # a template with invalid derived task ID value combinations.
#' config_tasks <- read_config(
#' system.file("testhubs", "flusight", package = "hubValidations")
#' )
#' hub_path <- system.file("testhubs", "flusight", package = "hubValidations")
#' submission_tmpl(
#' config_tasks = config_tasks,
#' hub_path,
#' round_id = "2022-12-12",
#' output_types = "pmf",
#' derived_task_ids = "target_end_date",
#' complete_cases_only = FALSE
#' )
#' # Force optional output type, in this case "mean".
#' submission_tmpl(
#' config_tasks = config_tasks,
#' hub_path,
#' round_id = "2022-12-12",
#' required_vals_only = TRUE,
#' output_types = c("pmf", "quantile", "mean"),
#' force_output_types = TRUE,
#' derived_task_ids = "target_end_date",
#' complete_cases_only = FALSE
#' )
submission_tmpl <- function(hub_con, config_tasks, round_id,
#' # Create a template from a URL to a fully configured hub repository on GitHub
#' submission_tmpl(
#' path = "https://github.com/hubverse-org/example-simple-forecast-hub",
#' round_id = "2022-11-28",
#' output_types = "quantile"
#' )
#' # Create a template from a URL to the raw contents of a tasks.json file on
#' # GitHub
#' config_raw_url <- paste0(
#' "https://raw.githubusercontent.com/hubverse-org/",
#' "example-simple-forecast-hub/refs/heads/main/hub-config/tasks.json"
#' )
#' submission_tmpl(
#' path = config_raw_url,
#' round_id = "2022-11-28",
#' output_types = "quantile"
#' )
#' @examplesIf asNamespace("hubUtils")$not_rcmd_check() && requireNamespace("arrow", quietly = TRUE)
#' # Create submission file using config file from AWS S3 bucket hub
#' # Use `s3_bucket()` to create a path to the hub's root directory
#' s3_hub_path <- arrow::s3_bucket("hubverse/hubutils/testhubs/simple/")
#' submission_tmpl(
#' path = s3_hub_path,
#' round_id = "2022-10-01",
#' output_types = "quantile"
#' )
#' # Use the `path()` method to create a path to the tasks.json file relative to
#' # the S3 cloud hub's root directory
#' s3_config_path <- s3_hub_path$path("hub-config/tasks.json")
#' submission_tmpl(
#' path = s3_config_path,
#' round_id = "2022-10-01",
#' output_types = "quantile"
#' )
submission_tmpl <- function(path, round_id,
required_vals_only = FALSE,
force_output_types = FALSE,
complete_cases_only = TRUE,
compound_taskid_set = NULL,
output_types = NULL,
derived_task_ids = NULL) {
switch(rlang::check_exclusive(hub_con, config_tasks),
hub_con = {
checkmate::assert_class(hub_con, classes = "hub_connection")
config_tasks <- attr(hub_con, "config_tasks")
},
config_tasks = checkmate::assert_list(config_tasks)
)
derived_task_ids = NULL,
hub_con = deprecated(),
config_tasks = deprecated()) {
config_tasks <- switch_get_config(hub_con, config_tasks, path)

if (is.null(derived_task_ids)) {
derived_task_ids <- get_config_derived_task_ids(
config_tasks, round_id
@@ -231,13 +267,12 @@ message_opt_tasks <- function(na_cols, n_mt) {
if (n_mt > 1L) {
msg <- c(
msg,
"!" = "Round contains more than one modeling task ({.val {n_mt}})"
"!" = "Round contains more than one modeling task (n = {.val {n_mt}})"
)
}
msg <- c(
msg,
"i" = "See Hub's {.path tasks.json} file or {.cls hub_connection} attribute
{.val config_tasks} for details of optional
"i" = "See Hub's {.path tasks.json} file for details of optional
task ID/output_type/output_type ID value combinations."
)
cli::cli_bullets(msg)
@@ -265,3 +300,58 @@ subset_complete_cases <- function(tmpl_df) {
)
tmpl_df[compl_cases, ]
}

# This function handles issuing deprecation warnings for older arguments
# and returns a config_tasks list according to the input argument.
switch_get_config <- function(hub_con, config_tasks, path) {
input_arg <- rlang::check_exclusive(hub_con, config_tasks, path,
.frame = parent.frame()
)
switch(input_arg,
hub_con = {
# Signal the deprecation to the user
lifecycle::deprecate_warn(
"0.11.0",
"hubValidations::submission_tmpl(hub_con = )",
"hubValidations::submission_tmpl(path = )"
)
checkmate::assert_class(hub_con, classes = "hub_connection")
attr(hub_con, "config_tasks")
},
config_tasks = {
lifecycle::deprecate_warn(
"0.11.0",
"hubValidations::submission_tmpl(config_tasks = )",
"hubValidations::submission_tmpl(path = )"
)
checkmate::assert_list(config_tasks)
},
path = {
is_s3_dir <- inherits(path, "SubTreeFileSystem") &&
hubUtils::is_s3_base_fs(path)

invalid_github_url <- !inherits(path, "SubTreeFileSystem") &&
hubUtils::is_github_url(path) &&
!hubUtils::is_github_repo_url(path)

if (invalid_github_url) {
cli::cli_abort(
c(
"x" = "GitHub URL {.url {path}} is invalid.",
"i" = "Please supply either a {.url github.com} URL to the repository
root directory or a {.url raw.githubusercontent.com} URL to the raw
contents of a {.path tasks.json} file. See examples for details."
),
call = rlang::caller_env(1)
)
}

is_dir <- is.character(path) && fs::path_ext(path) == ""
if (is_s3_dir || is_dir) {
read_config(path)
} else {
read_config_file(path)
}
}
)
}
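
The `path` branch above decides between reading a whole hub and reading a single config file based on whether the input has a file extension, after rejecting GitHub URLs that are not repository roots. A rough base-R sketch of that decision for character inputs follows. Everything in it is an assumption made for illustration: `classify_path()` is a hypothetical helper, `tools::file_ext()` stands in for `fs::path_ext()`, and the regular expressions are simplified guesses rather than the checks performed by `hubUtils::is_github_url()` and `hubUtils::is_github_repo_url()`.

```r
# Hypothetical helper (not part of hubValidations or hubUtils): mimic how a
# character `path` is routed to either a hub-level or a file-level reader.
classify_path <- function(path) {
  stopifnot(is.character(path), length(path) == 1L)

  # Assumed patterns: any github.com URL, and a URL of the form
  # github.com/<owner>/<repo> with nothing after the repository name
  is_github <- grepl("^https?://github\\.com/", path)
  is_repo_root <- grepl("^https?://github\\.com/[^/]+/[^/]+/?$", path)

  if (is_github && !is_repo_root) {
    stop("GitHub URL is neither a repository root nor a raw file URL: ", path)
  }

  # No file extension: treat as a hub directory or repository root;
  # otherwise treat as a path or URL to a single config file
  if (tools::file_ext(path) == "") "hub" else "config_file"
}

classify_path("testhubs/simple")                        # "hub"
classify_path("testhubs/simple/hub-config/tasks.json")  # "config_file"
classify_path("https://github.com/hubverse-org/example-simple-forecast-hub")  # "hub"
```

One consequence of an extension-based check is that a hub directory whose name contains a dot followed by letters or digits would be classified as a file, so the rule is a heuristic rather than a guarantee.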
21 changes: 21 additions & 0 deletions man/figures/lifecycle-deprecated.svg
21 changes: 21 additions & 0 deletions man/figures/lifecycle-experimental.svg
29 changes: 29 additions & 0 deletions man/figures/lifecycle-stable.svg
21 changes: 21 additions & 0 deletions man/figures/lifecycle-superseded.svg