-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathModule_14_Final_Report_Tips.Rmd
239 lines (152 loc) · 8.41 KB
/
Module_14_Final_Report_Tips.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
---
title: 'Module 14: Final Report Tips'
author: "Brian P Steves"
date: "`r Sys.Date()`"
output: html_document
---
# Setting Up Your Report
## Git
Use Git and or Github to store your project.
This will allow you better version control and a built in off site backup.
It will also allow you to upload your data and a URL to your repository as your final report.
If you're worried about your data, add the data file names to your .gitignore file and they won't be included in your repository on GitHub. Just be sure to add the data to your D2L final report dropbox.
## Use Relative Links
Be sure to use links relative to your projects working directory.
For example.. dat <- read.csv("data/mydata.csv") will retrieve the csv file from the "data" sub-directory of your project folder. When I attempt to reproduce your final report this should work just fine.
However, if you use absolute links, things are likely to not work as well.
For example... dat <- read.csv("C:/Users/yourname/yourproject/data/mydata.csv") isn't going to work on my computer.
## Code Style
Be consistent with your R code style. Here are a couple links to Google and Hadley Wickam's R code style guidelines.
https://google.github.io/styleguide/Rguide.xml
http://adv-r.had.co.nz/Style.html
## Comment, Comment, Comment
It never hurts to add comments.
Generally I comment on why I am doing something.
## Use "echo=FALSE" in your RMarkdown chunks
Generally speaking, you don't want to show your R code in your report. If you were writing an article about your new R package, well then show the code. If you add "echo=FALSE" to your code chunk, your commands won't show up in your output.
## Project Chunk Options.
There are many code chunk options.
http://yihui.name/knitr/options/
You can set chunk option defualts using the opts_chunk() function. This is useful so you don't have to repeat yourself along the way with things like standard figure sizes and turning warnings and messages off. If you do this, make it the first chunk of your markdown file. Note that you will need to reference the knitr library.
```{r, comment=NA, echo=FALSE}
cat("```{r echo=FALSE}")
cat("library(knitr)")
cat("opts_chunk$set(fig.width=4, fig.height=4, fig.path='figure/', ")
cat(" echo=FALSE, warning=FALSE, message=FALSE) ")
cat("```")
```
This particular set will load the knitr library and set chunk option defaults for figure size, figure location as well as turn echoes, warnings, and messages off.
## One or Many Scripts
Decide on whether you will be using one single script or multiple scripts for your project.
A single r markdown file can be used to hold all of your R code and keep everything in one nice tidy package. All you need to do is click the "Knit <format>" button in R Studio and everthing will run.
However, I generally use multiple scripts to keep things logically separated. I might have a script for any special functions, a script for data import, one for data cleaning, one for analysis, one for figure generation, and finally a markdown file to create the report. Sometimes I might combine the import and cleaning steps into a single script and the analysis and figures into another.
If you use multiple scripts, can't just source() them in to your markdown file and expect them work. For some reason the sourced in R files and the R Markdown file create different environments (i.e. locations that the variables get stored)
There are two work arounds to this.
1.) Run your scripts first and from within your R scripts, save all of the important objects (variables, tables, figures)
```{r, eval=FALSE}
# from within your R file... "project.R"
save(x,y,z,fig1,table1, file="project.Rdata")
```
then from in your R markdown file, source the R file and then load the saved objects and do something with them.
```{r, eval=FALSE}
source("project.R")
load("project.Rdata")
fig1
```
This has the added benefit of keeping all of those annoying package loading messages from appearing in your report.
2.) The second option is to render/knit your final report from a command in your R script rather than using the Knit button in R Studio. Just be sure to include the rmarkdown library in your R code.
```{r, eval=FALSE}
# from within your R file...
# A bunch of R code.
a <- 1:20
plot(a)
library(rmarkdown)
render("myReport.Rmd", "html_document")
render("myReport.Rmd", "pdf_document")
render("myReport.Rmd", "all")
```
If you have multiple R files and a final Rmd file. It might be worth creating a single R file that can be used as the master R file to source everything in the proper order.
```{r, eval=FALSE}
library(rmarkdown)
source("import.R")
source("clean.R")
source("analysis.R")
source("plots.R")
render("report.Rmd", "pdf_document")
```
Then to create your report you simply source this master R file.
## Required Packages
Chances are not everyone is going to have the same packages installed. If you know which packages are needed, you can include the following bit of code to test for installed packages, install missing ones, and then load them.
Using the chunk option of "include=FALSE" will allow the code to be run (i.e. install and load the libraries) without any of the annoying package messages that don't seem to understand "message=FALSE" and would otherwise show up in your report output.
```{r, eval=TRUE, include=FALSE }
list.of.packages <- c("ggplot2", "tidyr", "maptools")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
lapply(list.of.packages, library, character.only=TRUE, quietly=TRUE)
```
## Saving Plots as Objects
When you use ggplot2, you may have noticed that it is common to assign a ggplot to an object and then call that object to get the ggplot to show. If you use informative object names than each figure in your report can be called at will.
```{r}
library(ggplot2)
mydata <- data.frame(x=1:10, y=1:10)
fig1<-ggplot(dat=mydata, aes(x,y)) + geom_point()
fig1
```
This is useful if you have all of your analysis and figure generation in an R script. You can save the figure objects and load them in to your R Markdown file where ever you want them.
```{r}
fig1
```
Most standard plots (i.e. not ggplot2 plots) don't save to objects directly. If you want to do this with a regular plot you can record your current plot to an object using recordPlot(). Once recorded, the object can be called back at any time.
```{r}
plot(mydata)
fig2 <- recordPlot()
plot.new() ## clean up device
fig2 # redraw
```
## Formatting Tables
<style type="text/css">
.table {
width: 40%;
}
</style>
If you just print your standard R data.frames in your report they will look ugly.
```{r}
n <- 100
x <- rnorm(n)
y <- 2*x + rnorm(n)
out <- lm(y ~ x)
summary(out)$coef
```
You can use the kable() function to wrap tables for prettier output in html or pdf.
```{r kable}
n <- 100
x <- rnorm(n)
y <- 2*x + rnorm(n)
out <- lm(y ~ x)
library(knitr)
kable(summary(out)$coef, digits=2)
```
Kable will create HTML or LaTeX tables depending on how you knit your markdown file within R Studio. If you are kniting to HTML output you can add CSS to your R markdown file to further control the output format.
## Embedded LaTeX
Sometimes you might need to add a mathematical equation to your report.
```{r, eval=FALSE}
$$ \sqrt{27} $$
$$ \alpha $$
$$ \displaystyle \int_0^{\infty} f(x) \; dx $$
$$ \frac{d}{dx}\sin x=\cos x $$
$$ \frac{n!}{r!(n-r)!} $$
$$ 2H_2 + O_2 \xrightarrow{n,m}2H_2O
```
$$ \sqrt{27} $$
$$ \alpha $$
$$ \displaystyle \int_0^{\infty} f(x) \; dx $$
$$ \frac{d}{dx}\sin x=\cos x $$
$$ \frac{n!}{r!(n-r)!} $$
$$ 2H_2 + O_2 \xrightarrow{n,m}2H_2O $$
LaTeX equation formatting is tricky. There are many references out there to help and I found at least one online equation editor. But even that isn't easy to use.
https://www.codecogs.com/latex/eqneditor.php
## Report Output Format (HTML, PDF, DOC)
Each of the knitr output options has it's advantages.
HTML is great if you're putting something on the web. I use this for putting stuff on to D2L. If you know a bit about style sheets (CSS) you can have great control on how the output looks.
DOC is good if you need to create a report that someone else will edit.
PDF gives you most control on output based on LaTeX, but there is a steep learning curve. If you're lucky someone will have created a LaTeX style you can incorporate into your report. Several publishers provide these styles for some of their journals.