knitr | rmarkdown |
---|---|
1.45 | 2.25 |
29 R markdown document
29.1 Introduction
An R markdown document is a text file usually ending with an .Rmd
extension. It allows one to embed R code chunks and their output into a comprehensive report thus eliminating the possibility of loading the wrong figure into the document, or forgetting to update a statistical summary in the text when the original data file was revised.
Creating an R markdown output from an Rmd file requires knitting the file as opposed to running the code as you would an R script. The RStudio interface provides you with a knit button at the top of its interface. The knit button also allows you to choose the output format (HMTL, Word or PDF). You can also knit an Rmd file in R using the render
function from the rmarkdown
package. For example,
::render("HW16.Rmd") rmarkdown
29.2 The YAML header
The YAML header controls the look and feel of your document. At the very least, your R markdown document should contain the following YAML header sandwiched between two sets of ---
:
--- title: "Your document title" author: "ES 218" output: html_document: default ---
Make sure that the html_document: default
line is indented at least two spaces. If you intend on creating a Word document, substitute html_document
with word_document
.
--- title: "Your document title" author: "ES 218" output: word_document: default ---
The YAML header can take on several parameters. For example, to add the current date, add:
date: '`r format(Sys.time(), "%d %B, %Y")`'
The above chunk makes use of an inline code chunk that will be discussed later in this tutorial. Note the mix of single quotes and back ticks that wrap the inline code. The %d
, %B
and %y
parameters specify the date format. You can read more on date formats here.
To have the document automatically generate a table of contents add toc: true
to the html_document
or word_document
header. Make sure that the toc
parameter is indented at least two spaces from the xxx_document
header:
... output: html_document: toc: true
The above generates a static TOC. If you want to generate a floating TOC, add toc_float: true
.
... output: html_document: toc: true toc_float: true
29.3 Code folding
Rmarkdown offers the option to interactively collapse the code chunks in a knitted document. This may not be an option to have in a final report, but it may prove useful for a technical document where both code and output are to be shared. Code folding option is set with code_folding: ...
. The options are hide
to collapse the code chunks by default and show
to reveal the code chunks by default.
... output: html_document: toc: true toc_float: true code_folding: hide
29.4 Section headers
You can add section headers to your document by preceding the header with one or more hashtags. Each hashtag represents one heading level. For example, the top heading level is #
and the third heading level is ###
.
The top header hashtag is usually avoided because its default font size tends to be too big. It’s not uncommon to see R markdown files assign the top level to ##
.
## Use this as a top section level Some text ### Use this as the second section level Some text #### Use this as the third section level etc...
Use this as a top section level
Some text
Use this as the second section level
Some text
Use this as the third section level
etc…
29.5 Text formats
The markdown language has several built-in text formatting options. A brief summary of some their syntax follows:
- Italic: To italicize text, wrap it in asterisks as in
*this is italicized*
. Note that you do not want spaces between the asterisks and the text. - Bold: To bold text, wrap it with a pair of asterisks
**this is bold**
. - Web links: To create web links wrap the text with
[ ]
followed by the web link wrapped with( )
as in[ES 218 website](https://mgimond.github.io/ES218)
. Make sure that there are no spaces between[]
and()
. - Lists: To create lists in your document, precede each list item with an asterisk followed by a space. For example:
* First list element * Second list element * Third list element
- First list element
- Second list element
- Third list element
- Block equations: You can embed Latex block equations using double dollar signs,
$$ y_{edu} = \frac{1 + x}{x} + \varepsilon $$
which generates,
\[ y_{edu} = \frac{1 + x}{x} + \varepsilon \]
- Inline equations: You can also add inline Latex equations using single dollar signs,
The equation $x(1 + x)$ can be re-written as $x + x^2$.
which generates,
The equation \(x(1 + x)\) can be re-written as \(x + x^2\).
29.6 Code chunks
To embed a code chunk, simply wrap the code between ```{r}
and ```
.
```{r} plot(hp ~ mpg, mtcars) ```
Code chunks can take on many options. Examples of a few common options follow:
echo
: If you don’t want the code chunks to appear in the ouput, setecho=FALSE
.include
: If you want neither the code chunk nor its ouput displayed in the output, setinclude=FALSE
.fig.width
andfig.height
: These parameters control a figure’s height and width (in inches).warning
andmessage
: Some functions will output warnings or messages, most of which you probably do not want in your output document. To hide these, setwarning
andmessage
toFALSE
.
An example of a code chunk with a few of the aformentioned parameter follows:
```{r message=FALSE, warning=FALSE, echo=TRUE, fig.width=3, fig.height=2} plot(hp ~ mpg, mtcars) ```
Here’s the output (note that echo
was set to TRUE
in this example):
plot(hp ~ mpg, mtcars)
29.7 Document wide code chunk options
You can apply document wide code chunk options. For example, to avoid adding message=FALSE
and warning=FALSE
to each chunk of code, you can add this single chunk of code to the beginning of your Rmd file.
```{r include=FALSE} knitr::opts_chunk$set(message=FALSE, warning=FALSE) ```
29.8 Inline code chunks
If statistical summaries or data derived vectors are to populate text in your document, it’s best to do so as inline code chunks. For example, you might want to embed the mean mpg
value from the mtcars
dataset in your text. The traditional approach is to compute the mean in an R console as follows,
mean(mtcars$mpg)
[1] 20.09062
then transcribe this value into your document as follows:
The mean miles per gallon is 20.1 mpg.
An automated way to do this is to embed the code chunk directly into your text by wrapping it between `r (backtic followed by the letter r) and another backtick.
The mean miles per gallon is `r mean(mtcars$mpg)` mpg.
which generates:
The mean miles per gallon is 20.090625 mpg.
To control the precision, you can wrap the output with round
,
The mean miles per gallon is `r round(mean(mtcars$mpg), 1)` mpg.
The mean miles per gallon is 20.1 mpg.
If the code chunk becomes too long and unwieldy to embed in your text, you can create an object from that code in a separate chunk of code, then reference that object inline. For example:
```{r include=FALSE} M <- lm(mpg ~ hp, mtcars) r.sq <- round(summary(M)$r.square, 3) ``` The modeled r-square between miles-per-gallon and engine horsepower is `r r.sq`.
The code chunk is hidden from the output, but the object r.sq
is created nonetheless and converted to its value in the inline code chunk. The output thus looks like:
The modeled r-square between miles-per-gallon and engine horsepower is 0.602.
29.9 Tables
You can create two types of tables: static tables where you manually populate the cell values, and dynamic tables which are populated with R data tables.
29.9.1 Static tables
Here’s an example of a static table syntax:
column 1 Column 2 column3 ----------- ----------- ------------ val1 2.3 apple val2 5 orange val3 0.34 kiwi
column 1 | Column 2 | column3 |
---|---|---|
val1 | 2.3 | apple |
val2 | 5.0 | orange |
val3 | 0.34 | kiwi |
Note how the left and right adjusted columns in the output reflect the left and right adjusted columns in the above syntax. It’s important that the column elements not extend beyond the dashed line extents.
29.9.2 Dynamic tables
There are many R packages that specialize in table output formats such as xtable
and stargazer
. However, decent tables can be created with knitr
’s kable
function in conjunction with kableExtra
. Note that this requires the magrittr
package (if the pipe %>%
is used). However, if dplyr
is used elsewhere in the Rmd document, the magrittr
package can be omitted. Here’s an example:
library(magrittr) knitr::kable( head(mtcars), format="html" ) %>% kableExtra::kable_styling(bootstrap_options = "striped", full_width = FALSE, position = "left")
mpg | cyl | disp | hp | |
---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 |
Datsun 710 | 22.8 | 4 | 108 | 93 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 |
Valiant | 18.1 | 6 | 225 | 105 |
If the output file format is a Word document, substitute format = "html"
with format = "pandoc"
.
For more kableExtra
options, visit its website.
29.10 A complete example
Here’s what a complete Rmd file might look like:
--- title: "A simple example" author: "ES 218" output: html_document: toc: true editor_options: chunk_output_type: console --- ```{r include=FALSE} knitr::opts_chunk$set(message = FALSE, warning = FALSE, echo = TRUE) ``` ## A basic plot ```{r fig.width = 3, fig.height = 2.5} library(ggplot2) ggplot(mtcars, aes(mpg, hp)) + geom_point() + geom_smooth(method = lm, se = FALSE) ``` ## Here's a glimpse of the data table ```{r echo = FALSE} knitr::kable(head((mtcars), format = "html")) ``` ## A basic analysis ```{r include = FALSE} M <- lm(mpg ~ hp, mtcars) r.sq <- round(summary(M)$r.square, 2) ``` The modeled r-square between miles-per-gallon and engine horsepower is `r r.sq`.
29.11 Additional resources
- Here’s a sample Buoy data report Rmd file
- Rstudio’s website has additional R markdown configuration options.