eda_qqmat
Generates a matrix of empirical QQ plots
Usage
eda_qqmat(
dat,
x,
fac,
p = 1L,
tukey = FALSE,
q.type = 5,
diag = TRUE,
xylim = NULL,
resid = FALSE,
stat = mean,
plot = TRUE,
grey = 0.6,
pch = 21,
p.col = "grey40",
p.fill = "grey60",
size = 1,
text.size = 1,
tail.pch = 21,
tail.p.col = "grey70",
tail.p.fill = NULL,
tic.size = 0.7,
alpha = 0.8,
q = FALSE,
tails = TRUE,
med = TRUE,
inner = 0.75,
...
)
Arguments
- dat
Data frame.
- x
Continuous variable.
- fac
Categorical variable.
- p
Power transformation to apply to the continuous variable.
- tukey
Boolean determining if a Tukey transformation should be adopted (FALSE adopts a Box-Cox transformation).
- q.type
An integer between 1 and 9 selecting one of the nine quantile algorithms (see the quantile function).
- diag
Boolean determining if both the upper and lower triangular matrices should be plotted. If set to FALSE, only the lower triangular matrix is plotted.
- xylim
X and Y axes limits.
- resid
Boolean determining if residuals should be plotted. Residuals are computed using the stat parameter.
- stat
Statistic to use if residuals are to be computed. Currently mean (default) or median.
- plot
Boolean determining if plot should be generated.
- grey
Grey level to apply to plot elements (0 to 1 with 1 = black).
- pch
Point symbol type.
- p.col
Color for point symbol.
- p.fill
Point fill color passed to bg (only used for pch ranging from 21-25).
- size
Point symbol size (0-1).
- text.size
Size for category text in diagonal box.
- tail.pch
Tail-end point symbol type (see tails).
- tail.p.col
Tail-end color for point symbol (see tails).
- tail.p.fill
Tail-end point fill color passed to bg (only used for tail.pch ranging from 21-25).
- tic.size
Size of tic labels (defaults to 0.7).
- alpha
Point transparency (0 = transparent, 1 = opaque). Only applicable if rgb() is not used to define point colors.
- q
Boolean determining if a grey box highlighting the inner region should be displayed.
- tails
Boolean determining if points outside of the inner region should be symbolized differently. Tail-end points are symbolized via the tail.pch, tail.p.col and tail.p.fill arguments.
- med
Boolean determining if median lines should be drawn.
- inner
Fraction of mid-values to highlight in q or tails. Defaults to the inner 75% of values.
- ...
Not used.
Value
Returns a list with the following components:
- data: List with input x and y values for each group. Values may be interpolated to the smallest quantile batch if batch sizes don't match. Values reflect the power transformation defined in p.
- p: Transformation applied to original values.
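The interpolation mentioned for data can be pictured with a short base R sketch. It assumes that both batches are evaluated at the f-values of the smaller batch, (i - 0.5) / n, using the quantile type given by q.type. The helper qq_pair is hypothetical, written for illustration only, and is not part of the package.

```r
# Hypothetical helper (not part of the package): pair two batches of unequal
# size by evaluating both at the f-values of the smaller batch.
qq_pair <- function(x, y, q.type = 5) {
  n <- min(length(x), length(y))
  f <- (seq_len(n) - 0.5) / n          # assumed f-values: (i - 0.5) / n
  list(
    x = unname(quantile(x, probs = f, type = q.type)),
    y = unname(quantile(y, probs = f, type = q.type))
  )
}

small <- c(1, 3, 5, 7)
large <- c(2, 4, 6, 8, 10, 12, 14, 16)
pair  <- qq_pair(small, large)

pair$x  # with type 5 the smaller batch is returned as its sorted values
pair$y  # the larger batch is interpolated down to four values
```

With quantile type 5, the f-values (i - 0.5) / n coincide with the order statistics of the smaller batch, so only the larger batch is actually interpolated.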
Details
The function generates an empirical QQ plot matrix from a
data frame of continuous values and matching categories. It is
designed to place emphasis on the mid portion of the data. The mid portion
range is defined by inner (the inner fraction of the data). By default,
the points outside of the mid portion of the data are symbolized differently.
You can also highlight the mid region in light grey by setting q = TRUE.
The medians of both batches are shown as vertical and horizontal dashed
lines. For a plain vanilla QQ plot matrix, you can remove all guides by
setting tails = FALSE and med = FALSE.
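As a rough illustration of how inner could translate into a mid region, the base R sketch below assumes the region is bounded by the quantiles at (1 - inner) / 2 and 1 - (1 - inner) / 2, so that inner = 0.75 spans the 12.5th to 87.5th percentiles. The variable names are illustrative, and the package may compute the bounds differently.

```r
inner <- 0.75
probs <- c((1 - inner) / 2, 1 - (1 - inner) / 2)   # lower and upper f-values

x <- iris$Petal.Length[iris$Species == "setosa"]
bounds  <- quantile(x, probs = probs, type = 5)     # assumed mid-region limits
is_tail <- x < bounds[1] | x > bounds[2]            # candidates for the tail.* symbols

probs
table(is_tail)
```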
The QQ plot matrix is most effective in comparing residuals after the data
are fitted by the mean or median. To plot the residuals, set resid = TRUE.
By default, the mean is used. You can change the statistic to the median by
setting stat = median.
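The residuals described here can be sketched in base R as each value minus its group's statistic. This assumes a plain subtraction of the group mean (or median); group_resid is a hypothetical helper and only approximates what the function does internally.

```r
# Hypothetical helper: subtract each group's statistic from its values
# (assumed residual definition).
group_resid <- function(x, fac, stat = mean) {
  x - ave(x, fac, FUN = stat)
}

res_mean   <- group_resid(iris$Petal.Length, iris$Species)
res_median <- group_resid(iris$Petal.Length, iris$Species, stat = median)

# Each group's residuals are centered on zero for the chosen statistic
tapply(res_mean, iris$Species, mean)
tapply(res_median, iris$Species, median)
```

Removing the group-level statistic lets the QQ plots compare the spread and shape of the batches rather than their locations.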
The function also allows for batch transformation of values via the p
argument. The transformation is applied to the data prior to computing the
residuals.
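For reference, the two families selected by the tukey argument are commonly written as x^p (Tukey) and (x^p - 1) / p (Box-Cox), with the log used when p = 0. The sketch below follows those textbook forms and shows the transform-then-residual order described above. power_tr is a hypothetical helper, and the package's own transformation code may handle edge cases differently.

```r
# Hypothetical helper illustrating the textbook power transformations
power_tr <- function(x, p = 1, tukey = FALSE) {
  if (p == 0) return(log(x))            # both families use the log at p = 0
  if (tukey) x^p else (x^p - 1) / p     # Tukey vs. Box-Cox
}

x     <- iris$Petal.Length
x_log <- power_tr(x, p = 0)                                # transform first
res   <- x_log - ave(x_log, iris$Species, FUN = mean)      # then compute residuals
```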
References
John M. Chambers, William S. Cleveland, Beat Kleiner, Paul A. Tukey. Graphical Methods for Data Analysis (1983)
Examples
# Default output
singer <- lattice::singer
eda_qqmat(singer, height, voice.part)
# Limit to lower triangular matrix
eda_qqmat(singer, height, voice.part, diag = FALSE)
# Plot residuals after fitting mean model
eda_qqmat(singer, height, voice.part, resid = TRUE)
# Generate plain vanilla QQ plot matrix
eda_qqmat(mtcars, mpg, cyl, resid = TRUE, tails = FALSE, med = FALSE)
# Log transform the data, then plot the residuals after fitting the mean model
eda_qqmat(iris, Petal.Length, Species, resid = TRUE, p = 0)
#> Note that a power transformation of 0 was applied to the data before they were processed for the plot.
# Fit the median model instead of the mean
eda_qqmat(iris, Petal.Length, Species, resid = TRUE, p = 0, stat = median)
#> Note that a power transformation of 0 was applied to the data before they were processed for the plot.
# Fill inner region with grey boxes
eda_qqmat(iris, Petal.Length, Species, resid = TRUE, q = TRUE, p = 0)
#> Note that a power transformation of 0 was applied to the data before they were processed for the plot.
# Change tail point symbol
eda_qqmat(iris, Petal.Length, Species, resid = TRUE, p = 0, tail.pch = 3)
#> Note that a power transformation of 0 was applied to the data before they were processed for the plot.
# Change inner region point symbols to dark orange and reduce size of all
# point symbols
eda_qqmat(iris, Petal.Length, Species, resid = TRUE, p = 0, size = 0.8,
          tail.pch = 3, p.fill = "darkorange")
#> Note that a power transformation of 0 was applied to the data before they were processed for the plot.