Assignment 8: Tidyverse, data visualization

Due by 05:00 PM on Thursday, October 22, 2020

To do yourself

To submit on Blackboard, due 10-22-2020, 5:00pm

  • We will start with a very basic ggplot. Load the mtcars dataset that comes with R. Produce a histogram of the MPG variable (mpg) for each value of the cylinders variable (cyl; there are 3 values) in a 1 row by 3 column grid. You may notice that the plots appear ‘sideways’. If so, correct it using a ggplot command. Give the plot an informative title.

  • Load the diamonds dataset in R (it ships with ggplot2). In one chain of pipe operations, keep only those observations with price below the 80th percentile of the price variable, group observations by clarity, and create a summary data frame with the median price in one column and the median x value in another. Using this summary data frame, create a ggplot scatterplot with median x on the x-axis and median price on the y-axis. Label each point with its clarity value and adjust the labels slightly so that they do not overlap the points.

  • Again using the diamonds dataset, start from the full dataset. First, remove all diamonds with an ‘Ideal’ cut. Then make a ggplot that displays the density estimate of price separately for each of the (now 4) values of cut, arranged in a 4 row by 1 column grid. Fill in the area below the density curve in each plot with a color. Give the plot a title.
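
As a starting point, the three tasks above might be approached roughly as follows. This is a minimal sketch, not a reference solution: the object names (p1, p2, p3, clarity_summary), the column names median_price and median_x, the bin count, the fill color, the label offset, and the plot titles are all illustrative assumptions, and other ggplot2 functions (for example facet_grid() in place of facet_wrap()) would work equally well.

```r
library(dplyr)
library(ggplot2)

# Task 1: histogram of mpg, one panel per value of cyl, in 1 row x 3 columns.
# If the bars come out horizontal (e.g. because mpg was mapped to y),
# adding coord_flip() swaps the axes back.
p1 <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +          # bin count is an arbitrary choice
  facet_wrap(~ cyl, nrow = 1) +
  labs(title = "Distribution of MPG by number of cylinders",
       x = "Miles per gallon", y = "Count")

# Task 2: one pipe chain that filters, groups, and summarises.
clarity_summary <- diamonds %>%
  filter(price < quantile(price, 0.80)) %>%   # keep prices below the 80th percentile
  group_by(clarity) %>%
  summarise(median_price = median(price),     # hypothetical column names
            median_x = median(x))

# Scatterplot of the summary; nudge_y shifts each label above its point.
p2 <- ggplot(clarity_summary,
             aes(x = median_x, y = median_price, label = clarity)) +
  geom_point() +
  geom_text(nudge_y = 40, vjust = 0) +        # offset chosen by eye
  labs(title = "Median price vs. median x by clarity",
       x = "Median x", y = "Median price")

# Task 3: drop 'Ideal' cuts, then one filled density panel per remaining cut.
# facet_wrap() drops the now-empty 'Ideal' level by default.
p3 <- diamonds %>%
  filter(cut != "Ideal") %>%
  ggplot(aes(x = price)) +
  geom_density(fill = "steelblue", alpha = 0.6) +
  facet_wrap(~ cut, ncol = 1) +
  labs(title = "Density of price by cut (Ideal excluded)",
       x = "Price", y = "Density")
```

Printing p1, p2, or p3 at the console (or calling print() on them in a script) draws the corresponding plot. The label offset in geom_text() will likely need tuning to the y-axis scale of your own summary.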