Assignment 8: Tidyverse, data visualization
To do yourself
O’Donoghue, Sen I., Benedetta Frida Baldi, Susan J. Clark, Aaron E. Darling, James M. Hogan, Sandeep Kaur, Lena Maier-Hein, et al. “Visualization of Biomedical Data.” Annual Review of Biomedical Data Science 1, no. 1 (July 20, 2018) - Visualization best practices (the use of length, area, color etc.)
RStudio Webinars - Code and slides for RStudio webinars
Data visualization in R by Data Carpentry
ggplot2 tutorial/slides/code examples/references by Jenny Bryan
Interactive plots in R by Dave Tang
The R Graph Gallery, all graphs with code
To submit on Blackboard, due 10-22-2020, 5:00pm
We will start with a very basic ggplot. Read in the mtcars dataset that comes with R. Produce a histogram of the MPG variable for each value of the cylinders variable (there are 3) in a 1 row by 3 column grid. You may notice that the plots appear ‘sideways’. If so, correct it using a ggplot command. Give the plot an informative title.
Read in the diamonds dataset in R. In one long string of pipe operations, keep only those observations with price less than the 80th percentile of the price variable, group observations by Clarity type, and create a summary dataframe that summarizes the median price in one column and summarizes the median x value in another column. Using this dataset, create a ggplot scatterplot with x on the x-axis and price on the y-axis. Label the plot with values of Clarity and adjust the labels slightly so that they do not overlap the points.
Again using the diamonds dataset, starting from the full dataset. First remove all diamonds with an ‘Ideal’ cut. Then make a ggplot of the data that displays the density estimate of price for each of the (now 4) values of cut, separately, arranged in a 4 row by 1 column grid. Fill in the area below the density curve in each plot with some color. Give the plot a title.
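As a starting point for the diamonds summary task, the filter, group, and summarize chain might look roughly like the sketch below. It is one possible shape, not a reference solution. It assumes the diamonds data from the ggplot2 package, where the column is spelled clarity in lowercase. The names clarity_summary, median_price, and median_x, the plot title, and the vjust value are illustrative choices that do not come from the assignment text.

```r
library(dplyr)    # filter(), group_by(), summarise(), and the %>% pipe
library(ggplot2)  # provides the diamonds dataset and the plotting functions

# One pipe chain: keep rows below the 80th percentile of price,
# group by clarity, then compute one median per column of interest.
clarity_summary <- diamonds %>%
  filter(price < quantile(price, 0.80)) %>%
  group_by(clarity) %>%
  summarise(
    median_price = median(price),  # hypothetical column name
    median_x     = median(x)       # hypothetical column name
  )

# Scatterplot of the summary, one point per clarity level.
# vjust = -1 lifts each label above its point so text and marker do not overlap.
p <- ggplot(clarity_summary,
            aes(x = median_x, y = median_price, label = clarity)) +
  geom_point() +
  geom_text(vjust = -1) +
  labs(
    title = "Median price vs. median x by clarity (price below 80th percentile)",
    x = "Median x",
    y = "Median price"
  )

print(p)
```

The same pattern carries over to the other two tasks: facet_wrap() with its nrow or ncol argument controls the grid layout, and geom_density() accepts a fill argument for coloring the area under the curve.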