install.packages("remotes")
remotes::install_github("mgimond/tukeyedar")
Exploratory Data Analysis in R
Preface
This book is a compilation of lecture notes used in an Exploratory Data Analysis in R course taught to undergraduates at Colby College. The course, first taught in Spring 2015, assumes little to no background in quantitative analysis or computer programming. It introduces students to data manipulation in R, data exploration (in the spirit of John Tukey’s EDA) and the R markdown language. Many of the visualization techniques are adopted from William Cleveland’s book Visualizing Data.
The base R plotting environment and the ggplot2 ecosystem are used throughout this book. While a chapter is dedicated to the lattice plotting package, its functions are not used outside of that chapter given that ggplot2 offers much of lattice’s functionality.
While great effort is made to adopt a consistent plotting environment throughout this book (ggplot2, for the most part), a few topics (including the q-q plot and the median polish) benefit from custom plotting functions available in the tukeyedar package. The package can be installed from GitHub with the remotes package by running remotes::install_github("mgimond/tukeyedar").
Code that uses functions from the tukeyedar package is shown in a peach/pink code block, as opposed to the default light yellow block used for all other code. For example, if tukeyedar’s eda_qq function is used, the code block will take on the following appearance:
library(tukeyedar)
eda_qq(Tenor, Bass)
The tukeyedar functions are built on base R graphics and require R version 4.1 or greater.
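One way to confirm that an R installation meets this requirement before installing the package is sketched below. It uses only base R. The variable names (required, current, version_ok) are illustrative and do not come from tukeyedar, and writing the threshold as "4.1.0" is an assumption about how the stated 4.1 minimum should be expressed.

```r
# Minimum R version stated in the text for the tukeyedar functions
# (written here as "4.1.0"; the exact patch level is an assumption)
required <- "4.1.0"

# getRversion() returns the running R version as a comparable object
current <- getRversion()

# TRUE if the running version is at least the required one
version_ok <- current >= required

if (version_ok) {
  message("R ", as.character(current), " meets the ", required, " requirement.")
} else {
  message("R ", as.character(current), " is older than ", required,
          "; consider updating R before installing tukeyedar.")
}
```

If version_ok is FALSE, updating R first is likely to be simpler than troubleshooting package errors afterwards.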
Manuel “Manny” Gimond