From 36cd24bdca00af1a8e5e45934fd9f001ea46bda8 Mon Sep 17 00:00:00 2001
From: LiNk-NY
Date: Wed, 15 Jan 2025 11:36:50 -0500
Subject: [PATCH] add data-raw section and refer to r-pkgs textbook

---
 documentation.Rmd | 23 +++++++++++++++++------
 package-data.Rmd  |  5 +++--
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/documentation.Rmd b/documentation.Rmd
index 073b3f2..b4c0057 100644
--- a/documentation.Rmd
+++ b/documentation.Rmd
@@ -178,15 +178,26 @@ If this option is used it will also be preferable to use `donttest` instead of
 The scripts in this directory can vary.
 
 Most importantly if data was included in the `inst/extdata/` directory, a
-related script must be present in this directory documenting very clearly how
-the data was generated and source information.
+related script must be present in this directory clearly documenting how the
+data was obtained and prepared.
 
-It should include source URLs and any key information regarding filtering or processing.
+It should include any source URLs and information regarding filtering or
+pre-processing.
 
-It can be executable code, sudo code, or a text description.
+It can be executable code, sudo code, or even a text description.
 
-Users should be able to download and be able to roughly reproduce the file or
-object that is present as data.
+Users should be able to download and roughly reproduce the data file or object
+in `inst/extdata/`.
+
+## The `data-raw` directory {#doc-data-raw}
+
+The `data-raw` directory can also be used to store scripts that were used to
+generate data files or objects. Note that one potential disadvantage of using
+the `data-raw` directory is that the scripts are not included in the package
+installation (for users to inspect). For more details, see the `Data` chapter in
+Hadley Wickham's [R Packages][r-pkgs].
+
+[r-pkgs]: https://r-pkgs.org/data.html
 
 ## Other {#other-doc}
 
diff --git a/package-data.Rmd b/package-data.Rmd
index 6b50325..2e519e4 100644
--- a/package-data.Rmd
+++ b/package-data.Rmd
@@ -83,8 +83,9 @@ another package or the hubs as previously stated.
 
 However, if this is not applicable, raw data files should be included in the
 `inst/extdata` directory. Files of these type are often accessed utilizing
-`system.file()`. _Bioconductor_ requires documentation on these files in an
-`inst/scripts/` directory. See [data documentation](#doc-inst-scripts).
+`system.file()`. _Bioconductor_ requires documentation of these files in either
+`inst/scripts/` or `data-raw` directory. See [inst/scripts](#doc-inst-scripts)
+and [data-raw](#doc-data-raw) documentation.
 
 ### Internal data
 