-
Notifications
You must be signed in to change notification settings - Fork 11
/
Copy pathdata-visualization-exercises.Rmd
239 lines (144 loc) · 4.96 KB
/
data-visualization-exercises.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
---
title: "Fundamentals of Data Visualization Exercises"
author: "R for the Rest of Us"
output:
html_document:
css: slides/style.css
toc: true
toc_depth: 1
toc_float: true
df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
The exercises are part of the Fundamentals of R course. For more, see the [R for the Rest of Us website](https://rfortherestofus.com/courses/fundamentals/).
# Load Packages
Let's load the packages we need. These include `tidyverse` (especially the `dplyr` package) and `janitor`.
```{r}
# YOUR CODE HERE
```
# Import NHANES Data
Import your data into a data frame called NHANES. Use `clean_names` to make your variable names easy to work with.
```{r}
# YOUR CODE HERE
```
# Scatterplot
Make a scatterplot that shows weight on the x axis and height on the y axis.
```{r}
# YOUR CODE HERE
```
# Histogram
Make a histogram that shows the distribution of the weight variable.
```{r}
# YOUR CODE HERE
```
Copy your code from above, but adjust it so that there are 50 bins.
```{r}
# YOUR CODE HERE
```
# Bar Chart
## Bar Chart v1
Use the v1 approach to make a bar chart that shows a count of the number of people who say they smoke. Include NA responses.
```{r}
# YOUR CODE HERE
```
## Bar Chart v2
Create a new data frame called `sleep_by_gender` that shows the average amount of sleep per night that males and females report getting. Drop any NA (or NaN) responses from this data frame.
```{r}
# YOUR CODE HERE
```
Plot the average amount of sleep per night for males and females.
```{r}
# YOUR CODE HERE
```
Make the same graph as above, but use `geom_col` instead of `geom_bar`.
```{r}
# YOUR CODE HERE
```
# `color` and `fill`
Take your graph from above (the one with `geom_col`) and make the inside of each bar a different color.
```{r}
# YOUR CODE HERE
```
Make your scatterplot from before that shows weight on the x axis and height on the y axis again, but make the dots show up in different colors based on the `phys_active` variable.
```{r}
# YOUR CODE HERE
```
# Scales
## color
Take your scatterplot that you just made and add a scale using `scale_color_brewer`. Take a look at the help docs and choose a palette other than the default (hint: look at the `palette` argument).
```{r}
# YOUR CODE HERE
```
Do nearly the same thing to change the color of the last bar chart you made (the one about sleep and gender).
```{r}
# YOUR CODE HERE
```
## x and y axes
Copy the graph you just made and change the y axis so it goes from 0 to 8.
```{r}
# YOUR CODE HERE
```
Copy the last code chunk. Then adjust the breaks on the y axis so that it shows 0 to 8 by 1 (i.e. 0, 1, 2, etc).
```{r}
# YOUR CODE HERE
```
# Text and Labels
Copy your last code chunk. Then do the following:
1. Add text labels to each bar.
2. Use the `round` argument to just show one decimal place in each label.
3. Use the `vjust` argument to have them show up at the inner edge of the bars.
4. Make the labels all white.
```{r}
# YOUR CODE HERE
```
Do the same thing as above, but use `geom_label` instead of `geom_text`. Also, do the following:
1. Use the `vjust` argument to have them show up at the outer edge of the bars.
2. Don't show the legend (hint: you'll have to add code in two places to make this happen).
```{r}
# YOUR CODE HERE
```
# Plot Labels
Use the code chunk from above with `geom_text` (not the last one with `geom_label`). Do the following:
1. Add a title
2. Add a better y axis label
3. Remove the x axis label
4. Remove the legend (hint: use the `show.legend` argument again)
```{r}
# YOUR CODE HERE
```
# Themes
Install and load the [`hrbrthemes` package](https://hrbrmstr.github.io/hrbrthemes/index.html). It's a package that provides some great default themes.
It's not available on CRAN, where we normally install packages from, which means you have to install it slightly differently. You'll use the `devtools` package and then use this to install the `hrbrthemes` package. Use the code below.
```{r}
# install.packages("devtools")
# devtools::install_github("hrbrmstr/hrbrthemes")
library(hrbrthemes)
```
Then add the `theme_ipsum` to your plot.
```{r}
# YOUR CODE HERE
```
# Facets
I've created a data frame called `sleep_by_gender_by_age` for you. Run the code chunk below to load the data frame.
```{r}
sleep_by_gender_by_age <- nhanes %>%
group_by(gender, age_decade) %>%
summarize(avg_sleep = mean(sleep_hrs_night, na.rm = TRUE)) %>%
drop_na()
```
Let's take a look at `sleep_by_gender_by_age`.
```{r}
sleep_by_gender_by_age
```
Now, see if you can recreate this plot. Much of the code will be the same from your previous plots using the `sleep_by_gender` data frame so just make some small changes.
![](plots/sleep-by-gender-by-age.png)
```{r}
# YOUR CODE HERE
```
# Save Plots
Save your last plot to a PNG that is 8 inches wide and 5 inches high. Put it in the plots directory and call it "my-sleep-plot.png"
```{r}
# YOUR CODE HERE
```