# Static branching {#static}

```{r, message = FALSE, warning = FALSE, echo = FALSE}
knitr::opts_knit$set(root.dir = fs::dir_create(tempfile()))
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = TRUE)
```

```{r, message = FALSE, warning = FALSE, echo = FALSE, eval = TRUE}
library(targets)
library(tarchetypes)
library(tidyverse)
```

## Branching

Sometimes, a pipeline contains more targets than a user can comfortably type by hand. For projects with hundreds of targets, branching can make the `_targets.R` file more concise and easier to read and maintain. 

`targets` supports two types of branching: dynamic branching and [static branching](#static). Some projects are better suited to dynamic branching, while others benefit more from [static branching](#static) or a combination of both. Some users understand dynamic branching more easily because it avoids metaprogramming, while others prefer [static branching](#static) because `tar_manifest()` and `tar_visnetwork()` provide immediate feedback. Except for the [section on dynamic-within-static branching](static.html#dynamic-within-static-branching), you can read the two chapters on branching in any order (or skip them) depending on your needs.

## When to use static branching

Static branching is the act of defining a group of targets in bulk before the pipeline starts. Whereas dynamic branching uses last-minute dependency data to define the branches, static branching uses metaprogramming to modify the code of the pipeline up front. Dynamic branching excels at creating a large number of very similar targets, while static branching is most useful for a smaller number of heterogeneous targets. Some users find static branching more convenient because they can use `tar_manifest()` and `tar_visnetwork()` to check its correctness before launching the pipeline.
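
To make the idea of modifying the code of the pipeline up front more concrete, here is a minimal base R sketch: it substitutes one set of settings at a time into a template command. It is an illustration of the general technique only, not a description of how `tarchetypes` is implemented, and the names `template`, `settings`, `commands`, and `command_text` are hypothetical.

```r
# Template command with placeholder symbols `method_function` and `data_source`.
template <- quote(method_function(data_source, reps = 10))

# One set of settings per branch (hypothetical values for illustration).
settings <- list(
  list(method_function = as.symbol("method1"), data_source = "NIH"),
  list(method_function = as.symbol("method2"), data_source = "NIAID")
)

# Substitute each set of settings into the template
# to get one concrete command per branch.
commands <- lapply(settings, function(row) {
  do.call(substitute, list(template, row))
})

# Deparse the resulting calls to inspect them as text.
command_text <- vapply(commands, deparse, character(1))
print(command_text)
```

Each element of `commands` is an ordinary R call, such as `method1("NIH", reps = 10)`, and it exists before any target runs. That is why the branches can be inspected before the pipeline launches.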

## Map

[`tar_map()`](https://docs.ropensci.org/tarchetypes/reference/tar_map.html) from the [`tarchetypes`](https://github.com/ropensci/tarchetypes) package creates copies of existing target objects, where each new command is a variation on the original. In the example below, we have a data analysis workflow that iterates over datasets and analysis methods. The `values` data frame has the operational parameters of each data analysis, and `tar_map()` creates one new target per row.

```{r, echo = TRUE, eval = FALSE}
# _targets.R file:
library(targets)
library(tarchetypes)
library(tibble)
values <- tibble(
  method_function = rlang::syms(c("method1", "method2")),
  data_source = c("NIH", "NIAID")
)
targets <- tar_map(
  values = values,
  tar_target(analysis, method_function(data_source, reps = 10)),
  tar_target(summary, summarize_analysis(analysis, data_source))
)
list(targets)
```

```{r, echo = FALSE, eval = TRUE}
tar_script({
  library(targets)
  library(tarchetypes)
  library(tibble)
  values <- tibble(
    method_function = rlang::syms(c("method1", "method2")),
    data_source = c("NIH", "NIAID")
  )
  targets <- tar_map(
    values = values,
    tar_target(analysis, method_function(data_source, reps = 10)),
    tar_target(summary, summarize_analysis(analysis, data_source))
  )
  list(targets)
})
```

```{r, paged.print = FALSE, eval = TRUE}
tar_manifest()
```

```{r, eval = TRUE}
tar_visnetwork(targets_only = TRUE)
```
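
For comparison, the pipeline above should be roughly equivalent to writing the four targets by hand, as in the sketch below. The target names assume the default naming scheme of `tar_map()`, which appends the values from each row of `values` as suffixes. Check the `tar_manifest()` output to confirm the exact names and commands. The variable name `hand_written` is hypothetical.

```r
# _targets.R file (hand-written sketch of what tar_map() generates):
library(targets)
hand_written <- list(
  # Row 1 of `values`: method1 on "NIH".
  tar_target(analysis_method1_NIH, method1("NIH", reps = 10)),
  tar_target(
    summary_method1_NIH,
    summarize_analysis(analysis_method1_NIH, "NIH")
  ),
  # Row 2 of `values`: method2 on "NIAID".
  tar_target(analysis_method2_NIAID, method2("NIAID", reps = 10)),
  tar_target(
    summary_method2_NIAID,
    summarize_analysis(analysis_method2_NIAID, "NIAID")
  )
)
hand_written
```

Note that `tar_map()` rewrites the dependency as well as the command: each summary target refers to the analysis target from the same row of `values`.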

For shorter target names, use the `names` argument of `tar_map()`. For more combinations of settings, build `values` with `tidyr::expand_grid()`.

```{r, eval = FALSE, echo = TRUE}
# _targets.R file:
library(targets)
library(tarchetypes)
library(tidyr)
values <- expand_grid( # Use all possible combinations of input settings.
  method_function = rlang::syms(c("method1", "method2")),
  data_source = c("NIH", "NIAID")
)
targets <- tar_map(
  values = values,
  names = "data_source", # Select columns from `values` for target names.
  tar_target(analysis, method_function(data_source, reps = 10)),
  tar_target(summary, summarize_analysis(analysis, data_source))
)
list(targets)
```

```{r, eval = TRUE, echo = FALSE}
tar_script({
  library(targets)
  library(tarchetypes)
  library(tidyr)
  values <- expand_grid(
    method_function = rlang::syms(c("method1", "method2")),
    data_source = c("NIH", "NIAID")
  )
  targets <- tar_map(
    values = values,
    names = "data_source", 
    tar_target(analysis, method_function(data_source, reps = 10)),
    tar_target(summary, summarize_analysis(analysis, data_source))
  )
  list(targets)
})
```

```{r, paged.print = FALSE, eval = TRUE}
tar_manifest()
```

```{r, eval = TRUE}
# You may need to zoom out on this interactive graph to see all 8 targets.
tar_visnetwork(targets_only = TRUE)
```
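
To see where the 8 targets come from, it can help to inspect the grid on its own. The sketch below uses plain character strings instead of symbols for `method_function`, purely so the result is easy to print. That substitution is an assumption made for display and is not what the pipeline above uses. The names `grid` and `n_targets` are hypothetical.

```r
library(tidyr)

# Every combination of the two methods and the two data sources.
grid <- expand_grid(
  method_function = c("method1", "method2"),
  data_source = c("NIH", "NIAID")
)
print(grid)

# tar_map() creates one copy of each template target per row of the grid,
# so two template targets (analysis and summary) times the number of rows
# gives the total number of targets.
n_targets <- 2 * nrow(grid)
print(n_targets)
```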

## Dynamic-within-static branching

You can even combine static and dynamic branching. The static `tar_map()` is an excellent outer layer on top of targets with patterns. The following is a sketch of a pipeline that runs each of two data analysis methods 10 times, once per random seed. Static branching iterates over the method functions, while dynamic branching iterates over the seeds. `tar_map()` creates new patterns as well as new commands, so below, the summary targets map over the analysis targets both statically and dynamically.

```{r, eval = FALSE, echo = TRUE}
# _targets.R file:
library(targets)
library(tarchetypes)
library(tibble)
random_seed_target <- tar_target(random_seed, seq_len(10))
targets <- tar_map(
  values = tibble(method_function = rlang::syms(c("method1", "method2"))),
  tar_target(
    analysis,
    method_function("NIH", seed = random_seed),
    pattern = map(random_seed)
  ),
  tar_target(
    summary,
    summarize_analysis(analysis),
    pattern = map(analysis)
  )
)
list(random_seed_target, targets)
```

```{r, echo = FALSE, eval = TRUE}
tar_script({
  library(targets)
  library(tarchetypes)
  library(tibble)
  random_seed_target <- tar_target(random_seed, seq_len(10))
  targets <- tar_map(
    values = tibble(method_function = rlang::syms(c("method1", "method2"))),
    tar_target(
      analysis,
      method_function("NIH", seed = random_seed),
      pattern = map(random_seed)
    ),
    tar_target(
      summary,
      summarize_analysis(analysis),
      pattern = map(analysis)
    )
  )
  list(random_seed_target, targets)
})
```

```{r, eval = TRUE, paged.print = FALSE}
tar_manifest()
```

```{r, eval = TRUE, paged.print = FALSE}
tar_visnetwork(targets_only = TRUE)
```
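
The pipeline sketch above leaves `method1()`, `method2()`, and `summarize_analysis()` undefined. If you want to experiment with it, the hypothetical placeholder functions below are one way to fill the gap. Their bodies are assumptions made for illustration: each method simulates data under the given seed and returns a one-row data frame, and the summary works on one analysis result at a time.

```r
# Hypothetical placeholder functions, for illustration only.
method1 <- function(data_source, seed) {
  set.seed(seed)
  data.frame(
    data_source = data_source,
    method = "method1",
    seed = seed,
    estimate = mean(rnorm(100))
  )
}

method2 <- function(data_source, seed) {
  set.seed(seed)
  data.frame(
    data_source = data_source,
    method = "method2",
    seed = seed,
    estimate = median(rnorm(100))
  )
}

# Summarize a single analysis result (one dynamic branch).
summarize_analysis <- function(analysis) {
  data.frame(
    method = analysis$method,
    seed = analysis$seed,
    abs_estimate = abs(analysis$estimate)
  )
}

# Quick check outside the pipeline.
result <- summarize_analysis(method1("NIH", seed = 1))
print(result)
```

Functions like these could be defined near the top of `_targets.R` or in a script that `_targets.R` sources, so that they are available when the targets run.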

## Combine

[`tar_combine()`](https://docs.ropensci.org/tarchetypes/reference/tar_combine.html) from the [`tarchetypes`](https://github.com/ropensci/tarchetypes) package creates a new target to aggregate the results of upstream targets. In the simple example below, our combined target aggregates the rows returned from two other targets.

```{r, eval = FALSE, echo = TRUE}
# _targets.R file:
library(targets)
library(tarchetypes)
library(tibble)
options(crayon.enabled = FALSE)
target1 <- tar_target(head_mtcars, head(mtcars, 1))
target2 <- tar_target(tail_mtcars, tail(mtcars, 1))
target3 <- tar_combine(combined_target, target1, target2)
list(target1, target2, target3)
```

```{r, echo = FALSE, eval = TRUE}
tar_script({
  library(targets)
  library(tarchetypes)
  library(tibble)
  options(crayon.enabled = FALSE)
  target1 <- tar_target(head_mtcars, head(mtcars, 1))
  target2 <- tar_target(tail_mtcars, tail(mtcars, 1))
  target3 <- tar_combine(combined_target, target1, target2)
  list(target1, target2, target3)
})
```

```{r, eval = TRUE}
tar_manifest()
```

```{r, eval = TRUE}
tar_visnetwork(targets_only = TRUE)
```

```{r, eval = TRUE}
tar_make()
```

```{r, eval = TRUE}
tar_read(combined_target)
```

To use `tar_combine()` and `tar_map()` together in more complicated situations, you may need to supply `unlist = FALSE` to `tar_map()`. That way, `tar_map()` will return a nested list of target objects, and you can combine the ones you want. The pipeline below extends our previous `tar_map()` example by combining just the summaries, omitting the analyses from `tar_combine()`. Also note the use of `bind_rows(!!!.x)` below. This is how you supply custom code to combine the return values of other targets. `.x` is a placeholder for the return values, and `!!!` is the "unquote-splice" operator from the `rlang` package.

```{r, eval = FALSE, echo = TRUE}
# _targets.R file:
library(targets)
library(tarchetypes)
library(tibble)
random_seed <- tar_target(random_seed, seq_len(10))
mapped <- tar_map(
  unlist = FALSE, # Return a nested list from tar_map()
  values = tibble(method_function = rlang::syms(c("method1", "method2"))),
  tar_target(
    analysis,
    method_function("NIH", seed = random_seed),
    pattern = map(random_seed)
  ),
  tar_target(
    summary,
    summarize_analysis(analysis),
    pattern = map(analysis)
  )
)
combined <- tar_combine(
  combined_summaries,
  mapped[[2]], # The second sub-list holds the summary targets.
  command = dplyr::bind_rows(!!!.x, .id = "method")
)
list(random_seed, mapped, combined)
```

```{r, echo = FALSE, eval = TRUE}
tar_script({
  library(targets)
  library(tarchetypes)
  library(tibble)
  random_seed <- tar_target(random_seed, seq_len(10))
  mapped <- tar_map(
    unlist = FALSE, # Return a nested list from tar_map()
    values = tibble(method_function = rlang::syms(c("method1", "method2"))),
    tar_target(
      analysis,
      method_function("NIH", seed = random_seed),
      pattern = map(random_seed)
    ),
    tar_target(
      summary,
      summarize_analysis(analysis),
      pattern = map(analysis)
    )
  )
  combined <- tar_combine(
    combined_summaries,
    mapped[[2]],
    command = dplyr::bind_rows(!!!.x, .id = "method")
  )
  list(random_seed, mapped, combined)
})
```

```{r, paged.print = FALSE, eval = TRUE}
tar_manifest()
```

```{r, eval = TRUE}
tar_visnetwork(targets_only = TRUE)
```
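
To see what the unquote-splice operator does in isolation, the sketch below builds a similar command with `rlang::expr()`. It assumes the upstream targets are named `summary_method1` and `summary_method2`, which follows the default naming of `tar_map()` in the example above, and it stands in for the substitution that `tar_combine()` performs rather than reproducing its internals. The names `upstream` and `command` are hypothetical.

```r
library(rlang)

# Named list of symbols standing in for the upstream summary targets.
# (The names assume tar_map()'s default suffixes.)
upstream <- syms(c("summary_method1", "summary_method2"))
names(upstream) <- c("summary_method1", "summary_method2")

# `!!!` splices each list element in as a separate, named argument.
# expr() only builds the call; nothing is evaluated here.
command <- expr(dplyr::bind_rows(!!!upstream, .id = "method"))
print(command)
```

Because each spliced argument keeps its name, `.id = "method"` can record which upstream target each row came from.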

## Metaprogramming

Custom metaprogramming is a more flexible alternative to [`tar_map()`](https://docs.ropensci.org/tarchetypes/reference/tar_map.html) and [`tar_combine()`](https://docs.ropensci.org/tarchetypes/reference/tar_combine.html). [`tar_eval()`](https://docs.ropensci.org/tarchetypes/reference/tar_eval.html) from [`tarchetypes`](https://github.com/ropensci/tarchetypes) accepts an arbitrary expression and iteratively plugs in symbols. Below, we use it to branch over datasets. 

```{r, eval = FALSE, echo = TRUE}
# _targets.R file:
library(rlang)
library(targets)
library(tarchetypes)
string <- c("gapminder", "who", "imf")
symbol <- syms(string)
tar_eval(
  tar_target(symbol, get_data(string)),
  values = list(string = string, symbol = symbol)
)
```

```{r, echo = FALSE, eval = TRUE}
tar_script({
  library(rlang)
  library(tarchetypes)
  string <- c("gapminder", "who", "imf")
  symbol <- syms(string)
  tar_eval(
    tar_target(symbol, get_data(string)),
    values = list(string = string, symbol = symbol)
  )
})
```

[`tar_eval()`](https://docs.ropensci.org/tarchetypes/reference/tar_eval.html) has fewer guardrails than [`tar_map()`](https://docs.ropensci.org/tarchetypes/reference/tar_map.html) or [`tar_combine()`](https://docs.ropensci.org/tarchetypes/reference/tar_combine.html), so [`tar_manifest()`](https://docs.ropensci.org/targets/reference/tar_manifest.html) is especially important for checking the correctness of your metaprogramming.

```{r, eval = TRUE}
tar_manifest(fields = command)
```
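
Another way to check the substitution is to look at the generated code before it is evaluated. The sketch below assumes `tar_sub()` from `tarchetypes`, which takes the same template and `values` as `tar_eval()` but returns the substituted expressions instead of evaluating them. The variable name `expressions` is hypothetical.

```r
library(rlang)
library(tarchetypes)

string <- c("gapminder", "who", "imf")
symbol <- syms(string)

# Same template and values as the tar_eval() call above,
# but the result is a list of unevaluated expressions.
expressions <- tar_sub(
  tar_target(symbol, get_data(string)),
  values = list(string = string, symbol = symbol)
)
print(expressions)
```

Each element should correspond to one `tar_target()` call, with `symbol` and `string` replaced by the matching pair of values.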