forked from hadley/r-pkgs
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathintro.rmd
129 lines (81 loc) · 8.29 KB
/
intro.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
---
layout: default
title: Introduction
output: bookdown::html_chapter
---
# Introduction
## Why build a package?
In R, the fundamental unit of shareable code is the package. A package bundles together code, data, documentation and tests, and it's easy to share with others. Packages actually make your life easier because they're a set of conventions for how to structure your code. The first convention is a no-brainer:
* Put your R code in a directory called `R/`.
The second is a little bit more work:
* Describe what your package needs to work in a `DESCRIPTION` file
If the package is to share, also describe what it does, who can use it
(the license) and who to contact if things go wrong.
These conventions are helpful because:
* They save you time --- you don't need to think about the best way to organise
a project, you can just follow a template.
* Standardised conventions lead to standardised tools --- if you buy into
the R's package conventions, you get many tools for free.
As Hilary Parker puts it in her [introduction to packages](http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/): "Seriously, it doesn't have to be about sharing your code (although that is an added benefit!). It is about saving yourself time."
A package doesn't need to be complicated. You can start with a minimal subset of useful features and slowly build up over time. While there are strict requirements if you want to publish a package to the world on CRAN (and many of those requirements are useful even for your own packages), most packages won't end up on CRAN. You can even use packages to structure your data analyses, as Robert M Flight discusses in a [series of blog posts](http://rmflight.github.io/posts/2014/07/analyses_as_packages.html). Packages are really easy to create and use once you have the right set of tools.
Anytime you create some reusable set of functions you should put it in a package. It's the easiest path because packages come with conventions: you don't need to figure them out for yourself. You'll start with just your R code in the `R/` directory, and over time you can flesh it out with documentation (in `man/`), vignettes (in `vignettes/`), compiled code (in `src/`), data sets (in `data/`), and tests (in `inst/tests`).
The most accurate resource for up-to-date details on package development is always the official [writing R extensions][r-ext] guide. However, it's rather hard to understand if you're not already familiar with the basics of packages. It's also exhaustive, covering every possible package component, rather than focussing on the most common and useful components as this book does. Writing R extensions is a useful resource once you've mastered the basics and need to check on more esoteric facts.
## Philosophy
This book espouses my philosophy of package development: anything that can be automated, should be automated. You will do as little as possible by hand, and instead rely functions to do as much work as possible. The goal is to let you spend your time thinking about what you want your package to do, not thinking about the minutiae of package structure.
This philosophy is realised primarily through the devtools package, which provides a suite of R functions to automate common development tasks. The goal of the devtools package is to make package development as painless as possible. Devtools takes what I know about package development best practices, and encodes them in functions so you don't have to remember, or even know about the best way to tackle a problem. As I discover better ways of doing things, all you need to do to reap the benefits is to upgrade devtools.
If you've developed packages without devtools, it may take a bit of getting used to. However, I strongly believe that mastering it will save you much much more time than it will take to learn. Devtools is a key component of my package development productivity.
Devtools works hand-in-hand with RStudio, which I believe is the best development environment for most R users. The only real competitor is [ESS](http://ess.r-project.org/), emacs speaks statistics, which is a rewarding environment if you're willing to put in the time to learn emacs and customise it to your needs.
## Getting started
To get started, make sure you have the latest version of R (at least 3.1.0, which was current when I wrote this book), then run the following code to get the packages you'll need:
```{r, eval = FALSE}
install.packages(c("devtools", "roxygen2", "testthat", "knitr"))
```
Make sure you have a recent version of RStudio. Since you're making the jump from an R user to a package developer, I recommend that you download the preview version from <http://www.rstudio.com/products/rstudio/download/preview/>. This will let you access the latest and greatest features, and only slightly increases your chances of finding a bug.
During the development of the book, you'll probably also want the development version of devtools. This will let you access new devtools functions as I develop them:
```{r, eval = FALSE}
devtools::install_github("hadley/devtools")
```
You'll also need to make sure you have a C compiler and other need command line tools. If you're on windows or the mac, Rstudio will install these for you if you don't have them already. Otherwise:
* On Windows, download and install [Rtools](http://cran.r-project.org/bin/windows/Rtools/). NB: this is not an R package!
* On Mac, make sure you have either XCode (free, available in the app store)
or the ["Command Line Tools for Xcode"](http://developer.apple.com/downloads)
(needs a free apple id).
* On Linux, make sure you've installed not only R, but the R development
devtools. This is a Linux package called something like `r-base-dev`.
You can check you have everything installed and working by running this code:
```{r, eval = FALSE}
library(devtools)
has_devel()
#> '/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD SHLIB foo.c
#>
#> clang -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG
#> -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include
#> -fPIC -Wall -mtune=core2 -g -O2 -c foo.c -o foo.o
#> clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup
#> -single_module -multiply_defined suppress -L/usr/local/lib -o foo.so foo.o
#> -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework
#> -Wl,CoreFoundation
[1] TRUE
```
It will print out some compilation code (this is needed to help diagnose problems), but you're only interested in whether it returns `TRUE` (everything's ok) or an error (which you need to investigate further).
## Package basics
This book is arranged according to the structure of an R package. It starts with the most commonly used components, and then gets more esoteric.
* `R/`: where your R code lives in `.R` files.
* `DESCRIPTION`: metadata about the package
* `man/`: function documentation
* `vignettes/`: long-form documentation which show how to combine multiple
parts of your package to solve real problems.
* `NAMESPACE`: ensures that your package plays nicely with others.
* `tests/`: stores unit tests that ensure that your package is operating as
designed.
* `data/`: sample datasets (or other R objects)
* `src/`: compiled C, C++ and fortran source code
## Conventions
Throughout this book I use `f()` to refer to functions, `g` to refer to variables and function parameters, and `h/` to paths.
Larger code blocks intermingle input and output. Output is commented so that if you have an electronic version of the book, e.g., <http://r-pkgs.had.co.nz>, you can easily copy and paste examples into R. Output comments look like `#>` to distinguish them from regular comments.
## Colophon
This book was written in [Rmarkdown](http://rmarkdown.rstudio.com/) inside [RStudio](http://www.rstudio.com/ide/). [knitr](http://yihui.name/knitr/) and [pandoc](http://johnmacfarlane.net/pandoc/) converted the raw Rmarkdown to html and pdf. The [website](http://adv-r.had.co.nz) was made with [jekyll](http://jekyllrb.com/), styled with [bootstrap](http://getbootstrap.com/), and automatically published to Amazon's [S3](http://aws.amazon.com/s3/) by [travis-ci](https://travis-ci.org/). The complete source is available from [github](https://github.com/hadley/r-pkgs).
```{r}
devtools::session_info()
```
[r-ext]:http://cran.r-project.org/doc/manuals/R-exts.html#Creating-R-packages