Assignment 8: Tidyverse, data visualization
To do yourself
O’Donoghue, Seán I., Benedetta Frida Baldi, Susan J. Clark, Aaron E. Darling, James M. Hogan, Sandeep Kaur, Lena Maier-Hein, et al. “Visualization of Biomedical Data.” Annual Review of Biomedical Data Science 1, no. 1 (July 20, 2018) - Visualization best practices (the use of length, area, color, etc.)
RStudio Webinars - Code and slides for RStudio webinars
Data visualization in R by Data Carpentry
ggplot2 tutorial/slides/code examples/references by Jenny Bryan
Interactive plots in R by Dave Tang
The R Graph Gallery, all graphs with code
To submit on Blackboard, due 10-22-2020, 5:00 pm
We will start with a very basic ggplot. Read in the mtcars dataset that comes with R. Produce a histogram of the mpg variable for each value of the cyl (cylinders) variable (there are 3) in a 1 row by 3 column grid. You may notice that the plots appear ‘sideways’. If so, correct it using a ggplot command. Give the plot an informative title.
Read in the diamonds dataset in R. In one long string of pipe operations, keep only those observations with price less than the 80th percentile of the price variable, group observations by clarity type, and create a summary data frame that holds the median price in one column and the median x value in another column. Using this summary data frame, create a ggplot scatterplot with x on the x-axis and price on the y-axis. Label the points with the values of clarity and adjust the labels slightly so that they do not overlap the points.
Again using the diamonds dataset, start from the full dataset. First remove all diamonds with an ‘Ideal’ cut. Then make a ggplot of the data that displays the density estimate of price for each of the (now 4) values of cut, separately, arranged in a 4 row by 1 column grid. Fill in the area below the density curve in each plot with some color. Give the plot a title.
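The building blocks these questions rely on (a piped filter/group/summarise, nudged text labels, faceted grids, filled density curves) can be sketched on a different dataset. The following is a minimal illustration rather than a solution: it assumes the mpg dataset that ships with ggplot2, and the chosen variables (hwy, displ, class, drv), object names, titles, and nudge amount are picked only for demonstration.

```r
library(dplyr)
library(ggplot2)

# One pipe: keep rows below the 80th percentile of hwy,
# then compute two medians per vehicle class.
class_summary <- mpg %>%
  filter(hwy < quantile(hwy, 0.8)) %>%
  group_by(class) %>%
  summarise(
    median_hwy   = median(hwy),
    median_displ = median(displ)
  )

# Scatterplot of the summary; nudge_y shifts each label upward
# so the text does not sit on top of its point.
p_labels <- ggplot(class_summary,
                   aes(x = median_displ, y = median_hwy, label = class)) +
  geom_point() +
  geom_text(nudge_y = 0.4) +
  ggtitle("Median highway mpg vs. median displacement, by class")

# One histogram per drive type, laid out in a single row.
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 15) +
  facet_wrap(~ drv, nrow = 1) +
  ggtitle("Highway mpg by drive type")

# Drop one group, then draw a filled density per remaining group,
# stacked in a single column.
p_dens <- mpg %>%
  filter(drv != "r") %>%
  ggplot(aes(x = hwy)) +
  geom_density(fill = "steelblue", alpha = 0.6) +
  facet_wrap(~ drv, ncol = 1) +
  ggtitle("Density of highway mpg, rear-wheel drive excluded")
```

Printing any of the plot objects (for example, `p_hist`) renders it. Note that the quantile is computed inside `filter()` before `group_by()`, so the cutoff is taken over the whole dataset rather than within each group.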