opensciPackR is an R package that aims to promote the principles of open science by providing a streamlined approach to creating R packages specifically for research projects, e.g., for sharing data and code related to a published peer-reviewed paper. It is designed to ease the distribution, management, and analysis of structured research data.
The primary purpose of opensciPackR is to streamline the research reproduction process. By allowing researchers to install a package from GitHub that includes raw data, code for data preparation, analysis, and visualization, as well as comprehensive documentation, opensciPackR facilitates faster and more accurate research reproduction compared to downloading separate files and documentation from repositories such as OSF.
The key features of the package include:
- Templated Structure: opensciPackR provides a templated structure for creating new R packages. This structure is designed in accordance with the current R package design principles introduced by Hadley Wickham and Jennifer Bryan and helps in organizing, documenting, and sharing research data and analysis. The function opensciPackR::create_openscipkg() prompts the user to enter basic information about the package and the author, creates custom directories, outlines documentation for data, functions, and the README.md, and generates core R scripts for data preparation and analysis.
- Data Uploading and Conversion: The function opensciPackR::upload_data() allows users to upload datasets in various formats (csv, xlsx, xls, dat, sav, dta, RData, rda) and converts them to CSV format for archiving. It prompts the user to assign a new name to the dataset, adhering to good naming conventions. Additionally, it enables users to enter descriptions for their dataset and automatically generates the appropriate data documentation, which can be retrieved using the help() function.
- Interactive Sandboxes: opensciPackR integrates with the learnr package to create interactive sandboxes and tutorials for data exploration through R Markdown documents. These documents can be easily enhanced to include Shiny UI elements. This allows peer reviewers and other researchers to actively engage with the data and understand the analysis process step by step, as proposed by Tso et al. (2022). Note: This feature is still under development.
- Easy Hosting on GitHub: The R packages created by opensciPackR can be hosted and installed through GitHub, which makes them easily accessible to other researchers. Researchers can fork the packages and build additional analysis plans or data comparisons on top of the existing work.
- Standardized Functions: Every R package created by opensciPackR provides the same two primary functions, prepare_data() and analyze_data(). The prepare_data() function allows for quick reproduction of the data preparation process, including cleaning, transformation, and pre-processing. The analyze_data() function allows for quick reproduction of comprehensive data analysis, including summary statistics, statistical methods, and data visualization.
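To make the division of labor between the two standardized functions more concrete, here is a minimal sketch in base R. The function bodies, the example data frame, and its column names are assumptions made for illustration only; the templates that opensciPackR actually generates may look different.

```r
# Hypothetical raw data; in a generated package this would live in /data-raw.
raw <- data.frame(
  id    = 1:5,
  score = c(10, NA, 14, 18, 12),
  group = c("a", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

# Assumed shape of prepare_data(): all cleaning and transformation in one place.
prepare_data <- function(data) {
  # Drop rows that contain missing values.
  cleaned <- data[stats::complete.cases(data), , drop = FALSE]
  # Encode the grouping variable as a factor.
  cleaned$group <- factor(cleaned$group)
  # Reset row names after filtering.
  rownames(cleaned) <- NULL
  cleaned
}

# Assumed shape of analyze_data(): mean score per group as a summary statistic.
analyze_data <- function(data) {
  stats::aggregate(score ~ group, data = data, FUN = mean)
}

prepared <- prepare_data(raw)
results  <- analyze_data(prepared)
print(results)
```

Keeping preparation and analysis in separate functions means a reader can rerun either stage on its own, for example to inspect the cleaned data before any statistics are computed.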
You can install the most recent development version of opensciPackR from GitHub with:
# install.packages("devtools")
devtools::install_github("LKobilke/opensciPackR")
Please note that opensciPackR is currently under development.
Here's a basic example that demonstrates the main function of opensciPackR:
library(opensciPackR)
# Create a new opensciPackR project
create_openscipkg("MyResearchPackage")
After using the opensciPackR::create_openscipkg() function to create your package, there are several steps you should take to fully develop, document, and distribute it.
Tip: Refer to the book R Packages by Hadley Wickham and Jennifer Bryan as your primary reference whenever you are unsure how to proceed. It covers every aspect of setting up an R package, including writing proper documentation and running unit tests. Also, check out these handy cheat sheets for R package development in English (No. 1), English (No. 2), and German.
Here is a step-by-step guide for what to do next:
- Navigate to the Package Directory: Your new package has its own directory. Navigate to it using your file explorer and familiarize yourself with the structure. You'll find the R project file, named after your package, in the main directory. Additionally, you'll find raw data in the /data-raw subdirectory, processed data in the /data subdirectory, and the current data documentation in the /R/data.R file. Since the /R subdirectory is where all the functions of your new package are stored, the two primary functions prepare_data() and analyze_data() are already saved there and can be customized from there.
- Open the R Project File: If it didn't open automatically during the creation of your new package, open the project file in RStudio or a similar program.
- Update Data Documentation: Open /R/data.R and refine the data documentation that has been automatically generated by opensciPackR, e.g., by providing scales for the variables included in the dataset.
- Customize Functions: Customize the prepare_data() and analyze_data() functions in the /R directory according to your needs and preferences. Learn how to write and organize functions with the tidyverse style guide and by referring to programming with dplyr. This is an iterative process: edit your functions, then reload your entire package with the devtools::load_all() function (Shortcut: Ctrl/Cmd + Shift + L) to test whether they work as expected.
- Add More Scripts to the /R Directory: If necessary, add new R scripts that contain additional functions for your package. Use roxygen2 comments (#') to document your functions.
- Add Tests: Integrate tests to ensure that your package functions are working correctly. You can learn about using testthat for this purpose. Store your tests in the /tests directory (testthat expects test files under /tests/testthat).
- Document the Package: Use the devtools::document() function (Shortcut: Ctrl/Cmd + Shift + D) to process your roxygen2 comments and automatically generate the /man files, which store the documentation.
- Check and Build the Package: Use the devtools::check() function (Shortcut: Ctrl/Cmd + Shift + E) to run R CMD check and confirm that your package passes the checks CRAN applies. This is a good practice to follow, even if you are not submitting to CRAN.
- Host on GitHub: Push your package to a GitHub repository. This makes it easily accessible to other researchers. Update the README.md file in your repository to document how to install and use your new package.
- Maintain and Update: Keep your package up-to-date by fixing bugs, improving functionality, and responding to user feedback.
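The steps on adding documented functions and tests can be sketched as follows. The helper standardize_score() and the file names in the comments are hypothetical and only illustrate the pattern; they are not part of opensciPackR. The test is guarded so that the sketch still runs where testthat is not installed.

```r
# Hypothetical file: R/standardize_score.R

#' Standardize a numeric vector
#'
#' Centers a numeric vector on its mean and scales it by its standard deviation.
#'
#' @param x A numeric vector.
#' @return A numeric vector of the same length as `x`.
#' @export
standardize_score <- function(x) {
  stopifnot(is.numeric(x))
  (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
}

# Hypothetical file: tests/testthat/test-standardize_score.R
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("standardize_score() centers and scales", {
    z <- standardize_score(c(2, 4, 6))
    testthat::expect_equal(mean(z), 0)
    testthat::expect_equal(stats::sd(z), 1)
  })
}
```

In a real package, the two parts would live in separate files. Running devtools::document() would then turn the #' comments into a help page under /man, and devtools::test() or devtools::check() would pick up the test file.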
This project is licensed under the MIT License.