use fun() instead of fun across docs, fixes #383 #521

Merged: 9 commits, Sep 4, 2024
2 changes: 2 additions & 0 deletions NEWS.md
@@ -12,6 +12,8 @@

* Formatting improvement: package names are now not in backticks anymore (@agmurray, #525).

+* Improved documentation and formatting: function names are now more easily identifiable through either `()` at the end or being links to the function documentation (@brshallo , #521).
+
## Bug fixes

* `vfold_cv()` now utilizes the `breaks` argument correctly for repeated cross-validation (@ZWael, #471).
2 changes: 1 addition & 1 deletion R/boot.R
@@ -17,7 +17,7 @@
#' @param times The number of bootstrap samples.
#' @param apparent A logical. Should an extra resample be added where the
#' analysis and holdout subset are the entire data set. This is required for
-#' some estimators used by the `summary` function that require the apparent
+#' some estimators used by the [summary()] function that require the apparent
#' error rate.
#' @export
#' @return A tibble with classes `bootstraps`, `rset`, `tbl_df`, `tbl`, and
8 changes: 4 additions & 4 deletions R/caret.R
@@ -4,10 +4,10 @@
#' \pkg{rsample} and \pkg{caret}.
#'
#' @param object An `rset` object. Currently,
-#' `nested_cv` is not supported.
-#' @return `rsample2caret` returns a list that mimics the
+#' [nested_cv()] is not supported.
+#' @return `rsample2caret()` returns a list that mimics the
#' `index` and `indexOut` elements of a
-#' `trainControl` object. `caret2rsample` returns an
+#' `trainControl` object. `caret2rsample()` returns an
#' `rset` object of the appropriate class.
#' @export
rsample2caret <- function(object, data = c("analysis", "assessment")) {
@@ -23,7 +23,7 @@ rsample2caret <- function(object, data = c("analysis", "assessment")) {
}

#' @rdname rsample2caret
-#' @param ctrl An object produced by `trainControl` that has
+#' @param ctrl An object produced by `caret::trainControl()` that has
#' had the `index` and `indexOut` elements populated by
#' integers. One method of getting this is to extract the
#' `control` objects from an object produced by `train`.
2 changes: 1 addition & 1 deletion R/form_pred.R
@@ -1,6 +1,6 @@
#' Extract Predictor Names from Formula or Terms
#'
-#' `all.vars` returns all variables used in a formula. This
+#' While [all.vars()] returns all variables used in a formula, this
#' function only returns the variables explicitly used on the
#' right-hand side (i.e., it will not resolve dots unless the
#' object is terms with a data set specified).
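
The distinction this doc comment draws can be illustrated with base R alone. The sketch below does not call the rsample function being documented; `rhs_vars()` is a hypothetical helper written only for this illustration, and the variable names in the formula are made up.

```r
# Hypothetical formula; the variable names are illustrative only.
f <- y ~ a + log(b)

# all.vars() returns every variable in the formula, outcome included.
all_names <- all.vars(f)

# Hypothetical helper: variables used on the right-hand side only.
# For a two-sided formula, element 3 of the formula object is the
# right-hand side expression.
rhs_vars <- function(formula) {
  stopifnot(length(formula) == 3)
  all.vars(formula[[3]])
}

rhs_names <- rhs_vars(f)

print(all_names)
print(rhs_names)
```

The outcome `y` appears only in the first result. A dot on the right-hand side (as in `y ~ .`) would be returned as-is by this helper rather than expanded, which mirrors the "will not resolve dots" caveat in the comment.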
7 changes: 3 additions & 4 deletions R/labels.R
@@ -1,10 +1,9 @@
#' Find Labels from rset Object
#'
#' Produce a vector of resampling labels (e.g. "Fold1") from
-#' an `rset` object. Currently, `nested_cv`
-#' is not supported.
+#' an `rset` object. Currently, [nested_cv()] is not supported.
#'
-#' @param object An `rset` object
+#' @param object An `rset` object.
#' @param make_factor A logical for whether the results should be
#' a character or a factor.
#' @param ... Not currently used.
@@ -68,7 +67,7 @@ labels.rsplit <- function(object, ...) {
#' For a data set, `add_resample_id()` will add at least one new column that
#' identifies which resample that the data came from. In most cases, a single
#' column is added but for some resampling methods, two or more are added.
-#' @param .data A data frame
+#' @param .data A data frame.
#' @param split A single `rset` object.
#' @param dots A single logical: should the id columns be prefixed with a "."
#' to avoid name conflicts with `.data`?
2 changes: 1 addition & 1 deletion R/make_groups.R
@@ -25,7 +25,7 @@
#' only one) assessment set, but rather allow each observation to be in an
#' assessment set zero-or-more times. As a result, those functions don't have
#' a `balance` argument, and under the hood always specify `balance = "prop"`
-#' when they call [make_groups()].
+#' when they call `make_groups()`.
#'
#' @keywords internal
make_groups <- function(data,
2 changes: 1 addition & 1 deletion R/nest.R
@@ -1,6 +1,6 @@
#' Nested or Double Resampling
#'
-#' `nested_cv` can be used to take the results of one resampling procedure
+#' `nested_cv()` can be used to take the results of one resampling procedure
#' and conduct further resamples within each split. Any type of resampling
#' used in rsample can be used.
#'
2 changes: 1 addition & 1 deletion R/permutations.R
@@ -5,7 +5,7 @@
#' by permuting/shuffling one or more columns. This results in analysis
#' samples where some columns are in their original order and some columns
#' are permuted to a random order. Unlike other sampling functions in
-#' rsample, there is no assessment set and calling `assessment()` on a
+#' rsample, there is no assessment set and calling [assessment()] on a
#' permutation split will throw an error.
#'
#' @param data A data frame.
2 changes: 1 addition & 1 deletion R/printing.R
@@ -1,4 +1,4 @@
-## The `pretty` methods below are good for when you need to
+## The `pretty()` methods below are good for when you need to
## textually describe the resampling procedure. Note that they
## can have more than one element (in the case of nesting)

12 changes: 6 additions & 6 deletions R/reg_intervals.R
@@ -2,18 +2,18 @@
#'
#' @param formula An R model formula with one outcome and at least one predictor.
#' @param data A data frame.
-#' @param model_fn The model to fit. Allowable values are "lm", "glm",
-#' "survreg", and "coxph". The latter two require that the `survival` package
+#' @param model_fn The model to fit. Allowable values are `"lm"`, `"glm"`,
+#' `"survreg"`, and `"coxph"`. The latter two require that the survival package
#' be installed.
-#' @param type The type of bootstrap confidence interval. Values of "student-t" and
-#' "percentile" are allowed.
+#' @param type The type of bootstrap confidence interval. Values of `"student-t"` and
+#' `"percentile"` are allowed.
#' @param times A single integer for the number of bootstrap samples. If left
-#' NULL, 1,001 are used for t-intervals and 2,001 for percentile intervals.
+#' `NULL`, 1,001 are used for t-intervals and 2,001 for percentile intervals.
#' @param alpha Level of significance.
#' @param filter A logical expression used to remove rows from the final result, or `NULL` to keep all rows.
#' @param keep_reps Should the individual parameter estimates for each bootstrap
#' sample be retained?
-#' @param ... Options to pass to the model function (such as `family` for `glm()`).
+#' @param ... Options to pass to the model function (such as `family` for [stats::glm()]).
#' @return A tibble with columns "term", ".lower", ".estimate", ".upper",
#' ".alpha", and ".method". If `keep_reps = TRUE`, an additional list column
#' called ".replicates" is also returned.
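
To make the `type`, `times`, and `alpha` parameters above concrete, here is a minimal base R sketch of a percentile bootstrap interval for linear model coefficients. It is not the package's implementation: `boot_coef()` is a hypothetical helper, the built-in `mtcars` data and the formula `mpg ~ wt` are chosen only for illustration, and using the mean of the replicates as `.estimate` is an assumption.

```r
set.seed(1)

# Hypothetical helper: refit the model on one bootstrap resample
# (rows drawn with replacement) and return its coefficients.
boot_coef <- function(data, formula) {
  rows <- sample(nrow(data), replace = TRUE)
  coef(lm(formula, data = data[rows, , drop = FALSE]))
}

times <- 2001  # the documented default for percentile intervals
alpha <- 0.05  # level of significance

# One row per bootstrap replicate, one column per model term.
reps <- t(replicate(times, boot_coef(mtcars, mpg ~ wt)))

# Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the
# replicates for each term. The mean as .estimate is an assumption.
intervals <- data.frame(
  term = colnames(reps),
  .lower = apply(reps, 2, quantile, probs = alpha / 2),
  .estimate = colMeans(reps),
  .upper = apply(reps, 2, quantile, probs = 1 - alpha / 2),
  row.names = NULL
)

print(intervals)
```

The column names mirror those listed in the `@return` section, but the values come from this sketch rather than from the documented function.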
6 changes: 3 additions & 3 deletions R/rsplit.R
@@ -66,12 +66,12 @@ as.integer.rsplit <-
#'
#' The analysis or assessment code can be returned as a data
#' frame (as dictated by the `data` argument) using
-#' `as.data.frame.rsplit`. `analysis` and
-#' `assessment` are shortcuts.
+#' `as.data.frame.rsplit()`. `analysis()` and
+#' `assessment()` are shortcuts.
#' @param x An `rsplit` object.
#' @param row.names `NULL` or a character vector giving the row names for the data frame. Missing values are not allowed.
#' @param optional A logical: should the column names of the data be checked for legality?
-#' @param data Either "analysis" or "assessment" to specify which data are returned.
+#' @param data Either `"analysis"` or `"assessment"` to specify which data are returned.
#' @param ... Not currently used.
#' @examples
#' library(dplyr)
14 changes: 7 additions & 7 deletions R/tidy.R
@@ -1,19 +1,19 @@
#' Tidy Resampling Object
#'
-#' The `tidy` function from the \pkg{broom} package can be used on `rset` and
+#' The `tidy()` function from the \pkg{broom} package can be used on `rset` and
#' `rsplit` objects to generate tibbles with which rows are in the analysis and
#' assessment sets.
-#' @param x A `rset` or `rsplit` object
+#' @param x A `rset` or `rsplit` object
#' @param unique_ind Should unique row identifiers be returned? For example,
#' if `FALSE` then bootstrapping results will include multiple rows in the
#' sample for the same row in the original data.
#' @inheritParams rlang::args_dots_empty
#' @return A tibble with columns `Row` and `Data`. The latter has possible
-#' values "Analysis" or "Assessment". For `rset` inputs, identification columns
-#' are also returned but their names and values depend on the type of
-#' resampling. `vfold_cv` contains a column "Fold" and, if repeats are used,
-#' another called "Repeats". `bootstraps` and `mc_cv` use the column
-#' "Resample".
+#' values "Analysis" or "Assessment". For `rset` inputs, identification
+#' columns are also returned but their names and values depend on the type of
+#' resampling. For [vfold_cv()], contains a column "Fold" and, if repeats are
+#' used, another called "Repeats". [bootstraps()] and [mc_cv()] use the column
+#' "Resample".
#' @details Note that for nested resampling, the rows of the inner resample,
#' named `inner_Row`, are *relative* row indices and do not correspond to the
#' rows in the original data set.
Expand Down
Some generated files are not rendered by default:
2 changes: 1 addition & 1 deletion man/add_resample_id.Rd
6 changes: 3 additions & 3 deletions man/as.data.frame.rsplit.Rd
2 changes: 1 addition & 1 deletion man/bootstraps.Rd
2 changes: 1 addition & 1 deletion man/form_pred.Rd
2 changes: 1 addition & 1 deletion man/group_bootstraps.Rd
5 changes: 2 additions & 3 deletions man/labels.rset.Rd
2 changes: 1 addition & 1 deletion man/make_groups.Rd
1 change: 0 additions & 1 deletion man/make_strata.Rd
2 changes: 1 addition & 1 deletion man/nested_cv.Rd
2 changes: 1 addition & 1 deletion man/permutations.Rd
12 changes: 6 additions & 6 deletions man/reg_intervals.Rd
8 changes: 4 additions & 4 deletions man/rsample2caret.Rd
12 changes: 6 additions & 6 deletions man/tidy.rsplit.Rd

5 changes: 2 additions & 3 deletions vignettes/Working_with_rsets.Rmd
@@ -109,7 +109,7 @@ example[1:10, setdiff(names(example), names(attrition))]

For this model, the `.fitted` value is the linear predictor in log-odds units.

-To compute this data set for each of the 100 resamples, we'll use the `map` function from the purrr package:
+To compute this data set for each of the 100 resamples, we'll use the `map()` function from the purrr package:

```{r model_purrr, warning=FALSE}
library(purrr)
@@ -182,8 +182,7 @@ The calculated 95% confidence interval contains zero, so we don't have evidence

## Bootstrap Estimates of Model Coefficients

-Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [broom package](https://cran.r-project.org/package=broom) package has a `tidy` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map()` can be used to estimate and save these values for each split.
-
+Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [broom package](https://cran.r-project.org/package=broom) package has a `tidy()` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map()` can be used to estimate and save these values for each split.

```{r coefs}
glm_coefs <- function(splits, ...) {