Remove trailing whitespace from all vignettes
DavisVaughan committed Aug 5, 2024
1 parent 14bae31 commit 8d13947
Showing 6 changed files with 244 additions and 245 deletions.
68 changes: 34 additions & 34 deletions vignettes/in-packages.Rmd
@@ -20,10 +20,10 @@ knitr::opts_chunk$set(

This vignette serves two distinct, but related, purposes:

* It documents general best practices for using tidyr in a package,
  inspired by [using ggplot2 in packages][ggplot2-packages].

* It describes migration patterns for the transition from tidyr v0.8.3 to
  v1.0.0. This release includes breaking changes to `nest()` and `unnest()`
  in order to increase consistency within tidyr and with the rest of the
  tidyverse.
@@ -54,7 +54,7 @@ If you know the column names, this code works in the same way regardless of whet

```{r}
mini_iris %>% nest(
  petal = c(Petal.Length, Petal.Width),
  sepal = c(Sepal.Length, Sepal.Width)
)
```
@@ -65,7 +65,7 @@ The easiest way to silence this note is to use `all_of()`. `all_of()` is a tidys

```{r}
mini_iris %>% nest(
  petal = all_of(c("Petal.Length", "Petal.Width")),
  sepal = all_of(c("Sepal.Length", "Sepal.Width"))
)
```
@@ -80,10 +80,10 @@ Hopefully you've already adopted continuous integration for your package, in whi

We recommend adding a workflow that targets the devel version of tidyr. When should you do this?

* Always? If your package is tightly coupled to tidyr, consider leaving this
  in place all the time, so you know if changes in tidyr affect your package.

* Right before a tidyr release? For everyone else, you could add (or
  re-activate an existing) tidyr-devel workflow during the period preceding a
  major tidyr release that has the potential for breaking changes, especially if you've been contacted during our reverse dependency checks.

@@ -127,7 +127,7 @@ Ideally, you'll tweak your package so that it works with both tidyr 0.8.3 and ti

If you use continuous integration already, we **strongly** recommend adding a build that tests with the development version of tidyr; see above for details.

This section briefly describes how to run different code for different versions of tidyr, then goes through the major changes that might require workarounds:

* `nest()` and `unnest()` get new interfaces.
* `nest()` preserves groups.
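
One way to run different code for different versions of tidyr is to branch on the installed version. The following is a minimal sketch under stated assumptions, not necessarily the approach this vignette itself takes: `tidyr_new_interface()` and `nest_measurements()` are hypothetical helper names, the `"0.8.99"` cutoff assumes every later version (including development versions) has the v1.0.0 interface, and `tidyselect::all_of()` assumes a recent tidyselect is installed.

```r
# Hypothetical helper: TRUE when the installed tidyr has the v1.0.0 interface.
tidyr_new_interface <- function() {
  utils::packageVersion("tidyr") > "0.8.99"
}

# Hypothetical wrapper that nests the four measurement columns of an
# iris-like data frame, whichever tidyr version is installed.
nest_measurements <- function(df) {
  cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
  if (tidyr_new_interface()) {
    # v1.0.0 and later: new_col = <selection of existing cols>
    tidyr::nest(df, my_data = tidyselect::all_of(cols))
  } else {
    # v0.8.3: standard-evaluation variant with character column names
    tidyr::nest_(df, key_col = "my_data", nest_cols = cols)
  }
}

out <- nest_measurements(iris)
```

Keeping the version check in one small function means the branch can be deleted in a single place once the package requires tidyr >= 1.0.0.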
@@ -176,37 +176,37 @@ What changed:
* The to-be-nested columns are no longer accepted as "loose parts".
* The new list-column's name is no longer provided via the `.key` argument.
* Now we use a construct like this: `new_col = <something about existing cols>`.

Why it changed:

* The use of `...` for metadata is a problematic pattern we're moving away from.
  <https://design.tidyverse.org/dots-data.html>

* The `new_col = <something about existing cols>` construct lets us create
  multiple nested list-columns at once ("multi-nest").

```{r}
mini_iris %>%
  nest(petal = matches("Petal"), sepal = matches("Sepal"))
```

Before and after examples:

```{r eval = FALSE}
# v0.8.3
mini_iris %>%
  nest(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, .key = "my_data")

# v1.0.0
mini_iris %>%
  nest(my_data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))

# v1.0.0 avoiding R CMD check NOTE
mini_iris %>%
  nest(my_data = any_of(c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")))

# or equivalently:
mini_iris %>%
  nest(my_data = !any_of("Species"))
```

@@ -230,7 +230,7 @@ What changed:
* `.sep` has been deprecated and replaced with `names_sep`.

* `unnest()` uses the [emerging tidyverse standard][name-repair]
  to disambiguate duplicated names. Use `names_repair = tidyr_legacy` to
  request the previous approach.

* `.id` has been deprecated because it can be easily replaced by creating the column
@@ -239,7 +239,7 @@ What changed:
```{r, eval = FALSE}
# v0.8.3
df %>% unnest(x, .id = "id")

# v1.0.0
df %>% mutate(id = names(x)) %>% unnest(x)
```
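
To make the `.id` replacement concrete, here is a small self-contained sketch. The data frame `df` and its column names are hypothetical, chosen only so that the list-column `x` has named elements; it assumes tidyr >= 1.0.0 and dplyr are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: a list-column `x` whose elements are named "a" and "b".
df <- tibble(
  x = list(
    a = tibble(value = 1:2),
    b = tibble(value = 3L)
  )
)

# Capture the element names in an ordinary column *before* unnesting,
# which is the job `.id` used to do.
out <- df %>%
  mutate(id = names(x)) %>%
  unnest(x)
```

Each row of `out` carries the name of the list element it came from, so no information is lost by dropping `.id`.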
@@ -248,7 +248,7 @@ Why it changed:

* The use of `...` for metadata is a problematic pattern we're moving away from.
  <https://design.tidyverse.org/dots-data.html>

* The changes to details arguments relate to features rolling out
  across multiple packages in the tidyverse. For example, `ptype` exposes
  prototype support from the new [vctrs package](https://vctrs.r-lib.org).
@@ -258,7 +258,7 @@ Why it changed:
Before and after:

```{r, eval = FALSE}
nested <- mini_iris %>%
  nest(my_data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))
# v0.8.3 automatically unnests list-cols
@@ -292,42 +292,42 @@ Why it changed:

If the fact that `nest()` now preserves groups is problematic downstream, you have a few choices:

* Apply `ungroup()` to the result. This level of pragmatism suggests,
  however, you should at least consider the next two options.

* You should never have grouped in the first place. Eliminate the
  `group_by()` call and specify which columns should be nested versus not
  nested directly in `nest()`.

* Adjust the downstream code to accommodate grouping.
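
The second option can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed recipe: the definition of `mini_iris` is assumed here (two rows per species from `iris`), and `df_ungrouped` and `n_per_species` are hypothetical names.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed definition of the example data: two rows per species from iris.
mini_iris <- as_tibble(iris)[c(1, 2, 51, 52, 101, 102), ]

# No group_by(): say directly which columns go into the list-column.
df_ungrouped <- mini_iris %>%
  nest(data = !any_of("Species"))

# Compute on the list-column outside the data frame.
n_per_species <- map_int(df_ungrouped$data, nrow)

# The nested result is not grouped, so a plain mutate() accepts the vector.
out <- df_ungrouped %>%
  mutate(n_rows = n_per_species)
```

Because the result was never grouped, there is nothing to `ungroup()` and downstream code needs no adjustment.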

Imagine we used `group_by()` then `nest()` on `mini_iris`, then we computed on the list-column *outside the data frame*.

```{r}
(df <- mini_iris %>%
  group_by(Species) %>%
  nest())
(external_variable <- map_int(df$data, nrow))
```

And now we try to add that back to the data *post hoc*:

```{r error = TRUE}
df %>%
  mutate(n_rows = external_variable)
```

This fails because `df` is grouped and `mutate()` is group-aware, so it's hard to add a completely external variable. Other than pragmatically `ungroup()`ing, what can we do? One option is to work inside the data frame, i.e. bring the `map()` inside the `mutate()`, and design the problem away:

```{r}
df %>%
  mutate(n_rows = map_int(data, nrow))
```

If, somehow, the grouping seems appropriate AND working inside the data frame is not an option, `tibble::add_column()` is group-unaware. It lets you add external data to a grouped data frame.

```{r}
df %>%
  tibble::add_column(n_rows = external_variable)
```

@@ -344,12 +344,12 @@ Why it changed:
- Specialized standard evaluation versions of functions, e.g., `foo_()` as a
complement to `foo()`.
- The older lazyeval framework.

Before and after:

```{r eval = FALSE}
# v0.8.3
mini_iris %>%
  nest_(
    key_col = "my_data",
    nest_cols = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
@@ -358,7 +358,7 @@ mini_iris %>%
nested %>% unnest_(~ my_data)

# v1.0.0
mini_iris %>%
  nest(my_data = any_of(c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")))

nested %>% unnest(any_of("my_data"))
12 changes: 6 additions & 6 deletions vignettes/nest.Rmd
@@ -72,11 +72,11 @@ df1 %>% unnest(data)

## Nested data and models

Nested data is a great fit for problems where you have one of _something_ for each group. A common place this arises is when you're fitting multiple models.

```{r}
mtcars_nested <- mtcars %>%
  group_by(cyl) %>%
  nest()
mtcars_nested
@@ -85,17 +85,17 @@ mtcars_nested
Once you have a list of data frames, it's very natural to produce a list of models:

```{r}
mtcars_nested <- mtcars_nested %>%
  mutate(model = map(data, function(df) lm(mpg ~ wt, data = df)))
mtcars_nested
```

And then you could even produce a list of predictions:

```{r}
mtcars_nested <- mtcars_nested %>%
  mutate(model = map(model, predict))
mtcars_nested
```

This workflow works particularly well in conjunction with [broom](https://broom.tidymodels.org/), which makes it easy to turn models into tidy data frames which can then be `unnest()`ed to get back to flat data frames. You can see a bigger example in the [broom and dplyr vignette](https://broom.tidymodels.org/articles/broom_and_dplyr.html).
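
As a rough sketch of that workflow, assuming the broom package is installed: the column name `tidied` and the object name `coefs` are hypothetical, and `broom::tidy()` is used here to turn each fitted `lm` into a data frame of coefficients.

```r
library(dplyr)
library(tidyr)
library(purrr)

coefs <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(
    # One linear model per group of cylinders
    model = map(data, function(df) lm(mpg ~ wt, data = df)),
    # One data frame of coefficients per model
    tidied = map(model, broom::tidy)
  ) %>%
  select(cyl, tidied) %>%
  # Back to a flat data frame: one row per group and model term
  unnest(tidied) %>%
  ungroup()
```

With three values of `cyl` and two terms per model (intercept and `wt`), `coefs` has one row for each group and term.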