```{r setup}
library(tidyverse)
```
## Exercise 01: select
Use [diamonds](https://ggplot2.tidyverse.org/reference/diamonds.html) example table from `ggplot2` package.
Select only `price`, `cut`, and `color` columns. Use piping and `select`.
```{r exercise 01-1}
```
Select only `price`, `cut`, and `color` columns. Use base R (no piping or select).
```{r exercise 01-2}
```
Select all columns except for `x`, `y`, and `z`. Use negative selection and piping.
```{r exercise 01-3}
```
Take a look at the [manual](https://dplyr.tidyverse.org/reference/select.html). Use `:` notation to select columns from `carat` till `clarity`.
```{r exercise 01-4}
```
## Exercise 02: conditions 1
Examine the code and decide on whether a condition will be `TRUE` or `FALSE`. Write down the answer and run the chunk to check the answer.
**Answer 1:**
```{r exercise 02-1}
x <- 4
(x > 3)
```
**Answer 2:**
```{r exercise 02-2}
x <- 4
y <- 5
(x != y)
```
## Exercise 03: conditions 2
Examine the code and decide on whether a condition will be `TRUE` or `FALSE`. Write down the answer and run the chunk to check the answer.
**Answer 1:**
```{r exercise 03-1}
x <- 4
!(x < 3)
```
**Answer 2:**
```{r exercise 03-2}
x <- 4
y <- 5
!(x != y)
```
## Exercise 04: conditions 3
Examine the code and decide on whether a condition will be `TRUE` or `FALSE`. Write down the answer and run the chunk to check the answer.
**Answer 1:**
```{r exercise 04-1}
x <- 4
(x < 3) | (x >=4)
```
**Answer 2:**
```{r exercise 04-2}
x <- 4
y <- 6
(x == y) | ((x < 5) & (y > 5))
```
**Answer 3:**
```{r exercise 04-3}
x <- 3
y <- 3
(x != y) & ((x <= 3) & (y >= 3))
```
**Answer 4:**
```{r exercise 04-4}
x <- 3
y <- 3
(x != y) | ((x <= 3) & (y >= 3))
```
## Exercise 05: conditions
Examine the code and decide on whether a condition will be `TRUE` or `FALSE` for _each_ value in the vector. Write down the answer and run the chunk to check the answer.
**Answer 1:**
```{r exercise 05-1}
x <- seq(-4, 4, 2)
(x == 0)
```
**Answer 2:**
```{r exercise 05-2}
x <- seq(-4, 4, 2)
y <- 1:5
(x + y <= 0)
```
**Answer 3:**
```{r exercise 05-3}
x <- seq(-4, 12, 4)
y <- 1:5
(x > y)
```
**Answer 4:**
```{r exercise 05-4}
x <- seq(-4, 12, 4)
y <- 1:5
(x > y) & ((x < 10) | (y < 3))
```
## Exercise 06: logical indexing
Use logical indexing to values of `x` that are above 3.
```{r exercise 06-1}
x <- -3:7
# Your code goes here
```
Use logical indexing to extract values of `x` when it is higher than value in `y`
```{r exercise 06-2}
x <- -3:7
y <- seq(5, -2, length.out = 11)
# Your code goes here
```
Use logical indexing to extract values of `x` that are greater than the mean value of x (avoid using intermediate variables).
```{r exercise 06-3}
x <- -3:7
# Your code goes here
```
## Exercise 07: filter
Use [msleep](https://ggplot2.tidyverse.org/reference/msleep.html) example table from `ggplot2` package.
Filter the table to keep only `"domesticated"` animals (variable name is `conservation`).
```{r exercise 07-1}
```
Filter the table to keep only the omnivores (`vore`) that sleep more than 10 hours in total.
```{r exercise 07-2}
```
Filter the table to _exclude_ animals with body weight above `50`.
```{r exercise 07-3}
```
Filter the table to exclude animals with body weight below [median](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/median.html) body weight. Hint, you do not need a temporary variable.
```{r exercise 07-4}
```
## Exercise 08: arrange
Use [presidential](https://ggplot2.tidyverse.org/reference/presidential.html) example table from _ggplot2_ package.
Arrange the table by party.
```{r exercise 08-1}
```
Arrange the table by start data in the descending order.
```{r exercise 08-2}
```
Arrange the table by party (descending order) and president's name (ascending order).
```{r exercise 08-3}
```
## Exercise 09: order
Repeat exercises 8-1 and 8-2 but using `order` instead of `arrange`.
```{r exercise 09-1}
```
```{r exercise 09-2}
```
## Exercise 10: mutate
Use [msleep](https://ggplot2.tidyverse.org/reference/msleep.html) example table from `ggplot2` package.
Compute new variable / column with sleep to body-weight ratio.
```{r exercise 10-1}
```
Compute new variable / column `is_big` of logical value that are `TRUE` if body_weight is above `50`.
```{r exercise 10-2}
```
Compute three new variables / columns. First, compute the _mean_ and _standard deviation_ of total sleep variable. You will get the same value in all the rows, make sure that your columns are _not_ called `mean` and `sd`. Then compute a z-score of the total sleep: $Z(x) = (x - mean(x)) / sd(x)$. Put all computations into a single mutate code.
```{r exercise 10-3}
```
Compute z-score for body weight variable, do not store mean and standard deviation in extra columns but use them directly.
```{r exercise 10-4}
```
## Exercise 11: group_by / summarize / mutate
Use [msleep](https://ggplot2.tidyverse.org/reference/msleep.html) example table from `ggplot2` package.
Group by `vore` variable and compute 1) average total sleep, 2) standard deviation of the total sleep (see [sd](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/sd.html) function or type `?sd` in console).
```{r exercise 11-1}
```
Group by `vore` and compute z-score for body weight for each animal. Do you need `summarize` or `mutate` for this?
```{r exercise 11-2}
```
Group by `vore`, compute z-score for total sleep for each animal and when summarize to compute an average sleep z-score per `vore`? Do numbers make sense?
```{r exercise 11-3}
```
## Exercise 12
Use [diamonds](https://ggplot2.tidyverse.org/reference/diamonds.html) table from `ggplot2` package.
1. Filter out diamonds with `"Fair"` `cut`.
2. Select only `price`, `carat`, and `color` columns.
3. Compute price-per-carat for each diamond.
4. Compute average price-per-carat for each color.
5. Arrange table by descending price-per-carat.
Use a **single** pipe that performs the entire computation. Keep it organized with one verb per line.
```{r exercise 12}
```