Merge pull request #8 from chrbknudsen/main
questions, objectives and keypoints
chrbknudsen authored Dec 6, 2023
2 parents 58c0bef + d70e614 commit 3f2150b
Showing 3 changed files with 128 additions and 93 deletions.
140 changes: 82 additions & 58 deletions episodes/03-dplyr-tidyr.Rmd
@@ -1,15 +1,23 @@
---
title: "Data Wrangling with dplyr and tidyr"
keypoints:
- Use the `dplyr` package to manipulate dataframes.
- Use `select()` to choose variables from a dataframe.
- Use `filter()` to choose data based on values.
- Use `group_by()` and `summarize()` to work with subsets of data.
- Use `mutate()` to create new variables.
- Use the `tidyr` package to change the layout of dataframes.
- Use `pivot_wider()` to go from long to wide format.
- Use `pivot_longer()` to go from wide to long format.
objectives:

teaching: 20
exercises: 10

---

:::: questions

Check warning on line 9 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag


- How can I select specific rows and/or columns from a dataframe?
- How can I combine multiple commands into a single command?
- How can I create new columns or remove existing columns from a dataframe?
- How can I reformat a dataframe to meet my needs?

::::


:::: objectives

Check warning on line 19 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag


- Describe the purpose of an R package and the **`dplyr`** and **`tidyr`** packages.
- Select certain columns in a dataframe with the **`dplyr`** function `select`.
- Select certain rows in a dataframe according to filtering conditions with the **`dplyr`**
@@ -27,16 +35,8 @@ objectives:
- Reshape a dataframe from long to wide format and back with the `pivot_wider` and
`pivot_longer` commands from the **`tidyr`** package.
- Export a dataframe to a csv file.
questions:
- How can I select specific rows and/or columns from a dataframe?
- How can I combine multiple commands into a single command?
- How can I create new columns or remove existing columns from a dataframe?
- How can I reformat a dataframe to meet my needs?
teaching: 20
exercises: 10
source: Rmd
---

::::

```{r, include = FALSE}
library(dplyr)
@@ -236,22 +236,28 @@ interviews_ch
Note that the final dataframe (`interviews_ch`) is the leftmost part of this
expression.
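A minimal sketch of that reading order, assuming a small made-up data frame (`toy`, with invented values) standing in for `interviews`; the names `toy` and `toy_ch` are hypothetical. The pipeline runs from top to bottom, and the name on the far left of `<-` receives whatever comes out of the last step.

```r
library(dplyr)

# Hypothetical stand-in for the interviews data; all values are invented.
toy <- tibble(
  village   = c("Chirodzo", "God", "Chirodzo", "Ruaca"),
  no_membrs = c(3, 7, 10, 4)
)

# The steps run top to bottom; the finished result is stored in the
# name on the far left of the expression.
toy_ch <- toy %>%
  filter(village == "Chirodzo") %>%
  select(village, no_membrs)

toy_ch
```

Reading the expression aloud as "take `toy`, then filter, then select, and store the result in `toy_ch`" can help when the assignment and the data flow seem to point in opposite directions.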

> ## Exercise
>
> Using pipes, subset the `interviews` data to include interviews
> where respondents were members of an irrigation association
> (`memb_assoc`) and retain only the columns `affect_conflicts`,
> `liv_count`, and `no_meals`.
>
> > ## Solution
> >
> > ```{r}
> > interviews %>%
> >     filter(memb_assoc == "yes") %>%
> >     select(affect_conflicts, liv_count, no_meals)
> > ```
> {: .solution}
{: .challenge}

:::: challenge

Check warning on line 240 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag



## Exercise

Using pipes, subset the `interviews` data to include interviews
where respondents were members of an irrigation association
(`memb_assoc`) and retain only the columns `affect_conflicts`,
`liv_count`, and `no_meals`.

:::: solution

Check warning on line 250 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag

## Solution

```{r}
interviews %>%
    filter(memb_assoc == "yes") %>%
    select(affect_conflicts, liv_count, no_meals)
```

::::

::::


### Mutate

@@ -270,29 +276,32 @@ interviews %>%



:::: challenge

Check warning on line 279 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag


> ## Exercise
>
> Create a new dataframe from the `interviews` data that meets the following
> criteria: contains only the `village` column and a new column called
> `total_meals` containing a value that is equal to the total number of meals
> served in the household per day on average (`no_membrs` times `no_meals`).
> Only the rows where `total_meals` is greater than 20 should be shown in the
> final dataframe.
>
> **Hint**: think about how the commands should be ordered to produce this data
> frame!
>
> > ## Solution
> >
> > ``` {r}
> > interviews_total_meals <- interviews %>%
> >     mutate(total_meals = no_membrs * no_meals) %>%
> >     filter(total_meals > 20) %>%
> >     select(village, total_meals)
> > ```
> {: .solution}
{: .challenge}
## Exercise

Create a new dataframe from the `interviews` data that meets the following
criteria: contains only the `village` column and a new column called
`total_meals` containing a value that is equal to the total number of meals
served in the household per day on average (`no_membrs` times `no_meals`).
Only the rows where `total_meals` is greater than 20 should be shown in the
final dataframe.

**Hint**: think about how the commands should be ordered to produce this data
frame!

:::: solution

Check warning on line 293 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag


## Solution

``` {r}
interviews_total_meals <- interviews %>%
    mutate(total_meals = no_membrs * no_meals) %>%
    filter(total_meals > 20) %>%
    select(village, total_meals)
```

::::

::::
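The ordering hint in the exercise can be illustrated with a rough sketch on invented data (`toy` is a hypothetical stand-in for `interviews`). A column must exist before a later step can use it, so `mutate()` has to come before `filter()`, and `select()` comes last so that the columns needed for the calculation are still available.

```r
library(dplyr)

# Hypothetical stand-in for the interviews data; all values are invented.
toy <- tibble(
  village   = c("A", "B", "C"),
  no_membrs = c(3, 8, 12),
  no_meals  = c(2, 3, 3)
)

# Works: total_meals is created before it is used, and select() comes last.
ok <- toy %>%
  mutate(total_meals = no_membrs * no_meals) %>%
  filter(total_meals > 20) %>%
  select(village, total_meals)

# Fails: filter() runs before mutate(), so total_meals does not exist yet.
# try() captures the error so the rest of the script keeps running.
bad <- try(
  toy %>%
    filter(total_meals > 20) %>%
    mutate(total_meals = no_membrs * no_meals),
  silent = TRUE
)

inherits(bad, "try-error")
```

Selecting `village` first would fail for the same reason: `no_membrs` and `no_meals` would already be gone when `mutate()` needs them.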

### Split-apply-combine data analysis and the summarize() function

@@ -395,6 +404,10 @@ interviews %>%
count(village, sort = TRUE)
```
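As a sketch of what `count()` does here, the example below compares it with the longer `group_by()`, `summarize()`, `arrange()` chain on invented data (`toy` is hypothetical). Under that assumption the two approaches should give the same table.

```r
library(dplyr)

# Hypothetical data: one row per interviewed household.
toy <- tibble(village = c("A", "B", "A", "C", "A", "B"))

# Short form: count rows per village, largest group first.
by_count <- toy %>%
  count(village, sort = TRUE)

# Long form: the same steps written out explicitly.
by_hand <- toy %>%
  group_by(village) %>%
  summarize(n = n()) %>%
  arrange(desc(n))

all.equal(by_count, by_hand)
```

`count()` is mostly a convenience; the long form is worth knowing because it extends to summaries other than a simple row count.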





> ## Exercise
>
> How many households in the survey have an average of
@@ -476,4 +489,15 @@ if (!dir.exists("../data_output")) dir.create("../data_output")
write_csv(interviews, "../data_output/interviews_plotting.csv")
```
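The export pattern above can be tried without touching the project folders. The sketch below assumes a made-up data frame and writes to a temporary directory; `toy`, `out_dir`, and `out_file` are hypothetical names, not part of the lesson.

```r
library(readr)

# Hypothetical data and a temporary output folder, so nothing in the
# project directory is modified.
toy <- data.frame(village = c("A", "B"), no_meals = c(2, 3))
out_dir <- file.path(tempdir(), "data_output")

# Create the folder only if it is missing, then write the file.
if (!dir.exists(out_dir)) dir.create(out_dir)
out_file <- file.path(out_dir, "toy.csv")
write_csv(toy, out_file)

# Read the file back to confirm that the export worked.
check <- read_csv(out_file, show_col_types = FALSE)
check
```

Checking `dir.exists()` first avoids the warning that `dir.create()` gives when the folder is already there.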

{% include links.md %}
:::: keypoints

Check warning on line 492 in episodes/03-dplyr-tidyr.Rmd

View workflow job for this annotation

GitHub Actions / Build Full Site

check for the corresponding close tag


- Use the `dplyr` package to manipulate dataframes.
- Use `select()` to choose variables from a dataframe.
- Use `filter()` to choose data based on values.
- Use `group_by()` and `summarize()` to work with subsets of data.
- Use `mutate()` to create new variables.
- Use the `tidyr` package to change the layout of dataframes.
- Use `pivot_wider()` to go from long to wide format.
- Use `pivot_longer()` to go from wide to long format.

::::
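The last two keypoints can be seen side by side in a small round trip. This is only a sketch on invented data; the column names `household`, `item`, and `owned` are assumptions and do not come from the lesson's dataset.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: one row per household and item.
long <- tibble(
  household = c(1, 1, 2, 2),
  item      = c("radio", "bicycle", "radio", "bicycle"),
  owned     = c(TRUE, FALSE, TRUE, TRUE)
)

# Long to wide: each distinct item becomes its own column.
wide <- long %>%
  pivot_wider(names_from = item, values_from = owned)

# Wide to long: the item columns are stacked back into two columns.
back <- wide %>%
  pivot_longer(cols = c(radio, bicycle),
               names_to = "item", values_to = "owned")

wide
back
```

In this sketch `wide` has one row per household and `back` has the same rows as `long`, which is a quick way to check that a reshape has not lost data.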
44 changes: 25 additions & 19 deletions episodes/04-functions-plots.Rmd
@@ -2,27 +2,24 @@
title: "A couple of plots. And making our own functions"
teaching: 80
exercises: 35
questions:
- "How do I create scatterplots, boxplots, and barplots?"
- "How can I define my own functions?"

objectives:
- "Produce scatter plots and boxplots using Base R."
- "Write your own function"
- "Write loops to repeat calculations"
- "Use logical tests in loops"

keypoints:
- "Boxplots are useful for visualizing the distribution of a continuous variable."
- "Barplots are useful for visualizing categorical data."
- "Functions allow you to repeat the same set of operations again and again."
- "Loops allow you to apply the same function to lots of data."
- "Logical tests allow you to apply different calculations on different sets of data."

source: Rmd
---

:::: questions

- "How do I create scatterplots, boxplots, and barplots?"
- "How can I define my own functions?"

::::


:::: objectives

- "Produce scatter plots and boxplots using Base R."
- "Write your own function"
- "Write loops to repeat calculations"
- "Use logical tests in loops"

::::

We start by loading the **`tidyverse`** package.

@@ -317,4 +314,13 @@ interviews_plotting %>%
The plot looks different, and we get a note about `binwidth`. `geom_histogram()`
chooses 30 bins by default, which is rarely the right number for a given dataset,
so it is better to set `binwidth` (or `bins`) explicitly.
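A minimal sketch of setting the bin width yourself, assuming simulated counts in a made-up data frame (`toy`, with a hypothetical `no_membrs` column) rather than the lesson's data. With whole-number data a `binwidth` of 1 gives one bar per value.

```r
library(ggplot2)

# Hypothetical data: simulated household sizes, not the lesson's dataset.
set.seed(1)
toy <- data.frame(no_membrs = rpois(100, lambda = 7))

# Setting binwidth explicitly replaces the default of 30 bins.
p <- ggplot(toy, aes(x = no_membrs)) +
  geom_histogram(binwidth = 1)

p
```

The right width depends on the scale of the variable, so it is worth trying a few values and comparing the resulting shapes.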

{% include links.md %}
:::: keypoints

- "Boxplots are useful for visualizing the distribution of a continuous variable."
- "Barplots are useful for visualizing categorical data."
- "Functions allow you to repeat the same set of operations again and again."
- "Loops allow you to apply the same function to lots of data."
- "Logical tests allow you to apply different calculations on different sets of data."

::::
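The keypoints about functions, loops, and logical tests can be combined in one short base R sketch. Everything here is an assumption made for illustration: the function `meals_per_day()`, the data frame `households`, and the threshold of 20 are hypothetical.

```r
# Function: wraps a calculation so it can be reused.
meals_per_day <- function(no_membrs, no_meals) {
  no_membrs * no_meals
}

# Hypothetical data; all values are invented.
households <- data.frame(no_membrs = c(3, 8, 12), no_meals = c(2, 3, 3))

# Loop: apply the function to each row in turn.
labels <- character(nrow(households))
for (i in seq_len(nrow(households))) {
  total <- meals_per_day(households$no_membrs[i], households$no_meals[i])
  # Logical test: choose a label depending on the result.
  if (total > 20) {
    labels[i] <- "large"
  } else {
    labels[i] <- "small"
  }
}

labels
```

Pre-allocating `labels` with `character(nrow(households))` keeps the loop from growing a vector one element at a time.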

37 changes: 21 additions & 16 deletions episodes/05-whats-next.Rmd
@@ -2,24 +2,22 @@
title: "What is the next step?"
teaching: 10
exercises: 0
questions:
- "What do I do now?"
- "What is the next step?"

objectives:
- "Present suggestions for further reading."
- "Give tips on problems to work on for practice."

keypoints:
- "Practice is important!"
- "Working on data that YOU find interesting is a really good idea."
- "The amount of resources online is immense."
- "KUB Datalab is there for you."

source: Rmd
---


:::: questions

- "What do I do now?"
- "What is the next step?"

::::

:::: objectives

- "Present suggestions for further reading."
- "Give tips on problems to work on for practice."

::::


## Great sites
@@ -57,4 +55,11 @@ Our mail: [email protected]



{% include links.md %}
:::: keypoints
- "Practice is important!"
- "Working on data that YOU find interesting is a really good idea."
- "The amount of resources online is immense."
- "KUB Datalab is there for you."

::::
