Project Reporting with RMarkdown

Introduction

RMarkdown provides an authoring system for project and data science reporting. It is a core component of the RStudio IDE and braids together narrative text with embedded chunks of R code. The R code serves to demonstrate the model concepts described in the text. RMarkdown produces elegantly formatted document output, including publication-quality data plots and tables.

RMarkdown integrates several applications into an easy-to-use framework. The core components are the R programming language and Markdown, a lightweight markup language for text creation. RMarkdown relies on the knitr package to call the R interpreter and to produce model results, report tables and charts. The pandoc application is called next and serves as a document conversion tool that renders the text in different output formats. pandoc can also render LaTeX commands for state-of-the-art typesetting, equation editing and document control.

Benefits of RMarkdown

The benefits of combining text authoring with scientific programming are significant:

  • Reported results are immediately reproducible given embedded code and data objects;
  • Code will often detail incremental calculations that the academic text might overlook given the need to simplify text content and equations;
  • Reports can be easily updated as data or code objects are updated, with no copy/paste effort across files or applications;
  • Collaboration and communication are greatly enhanced within and across teams;
  • Dozens of output formats are supported, catering to a wide variety of authoring needs.

Perhaps most important, the RStudio IDE delivers integrated code and text authoring so the entire process is easy to implement and use.

Dependencies

The following applications need to be in place, regardless of the operating system used:

  • R programming language
  • RStudio
  • TeX (markup and typesetting system for high-quality LaTeX output)
  • Compilers for other programming languages (optional)

The rmarkdown package also needs to be installed; doing so installs dependent packages such as knitr, among others.
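
A minimal installation sketch from the R console follows. The commented-out tinytex lines are an assumption on our part: they install a lightweight TeX distribution for PDF output and can be skipped when TeX is already present.

```r
# Install rmarkdown from CRAN; knitr and the other dependencies come with it.
install.packages("rmarkdown")

# Optional (assumption: no TeX distribution is installed yet).
# tinytex provides a small TeX installation that is sufficient for PDF output.
# install.packages("tinytex")
# tinytex::install_tinytex()
```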

How It Works

The following sequence defines the standard RMarkdown workflow:

  1. Open a new .Rmd file in RStudio with File ▶ New File ▶ RMarkdown. Use the wizard that opens to pre-populate the file with template options;
  2. Write the document by editing the template with text narration and code chunks (examples below);
  3. Knit the document to create the report, either with the point-and-click Knit button or with the render() command in the console;
  4. Preview the document output in the IDE window;
  5. Publish the file output to a web server (optional);
  6. Examine the build log in the RMarkdown console in case of errors;
  7. Use the output files (such as *.tex, *.pdf, *.docx, etc.) that are created by pandoc and saved with the original *.Rmd file.
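
The knit step can also be run from the console. The sketch below shows three common forms of the render() call; the file name report.Rmd is a placeholder, and the calls only run when that file exists.

```r
library(rmarkdown)

# "report.Rmd" is a placeholder file name, not a file from this tutorial.
input_file <- "report.Rmd"

if (file.exists(input_file)) {
  # Render with the first output format declared in the file's header.
  render(input_file)

  # Render to one specific format, overriding the header.
  render(input_file, output_format = "html_document")

  # Render every output format listed in the header.
  render(input_file, output_format = "all")
}
```

The output files are written next to the input file unless the output_dir argument says otherwise.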

RMarkdown Example

The plain text below is an RMarkdown file with the extension *.Rmd:
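
A minimal file of this kind can be sketched as follows. To keep the example runnable as is, the lines of the file are written out from R to a temporary folder; the title, headings and chunk labels are placeholders chosen for illustration.

```r
# Lines of a minimal .Rmd file; the title and chunk contents are placeholders.
rmd_lines <- c(
  "---",
  'title: "Example Report"',
  "output: html_document",
  "---",
  "",
  "## Summary",
  "",
  "The **cars** data set ships with R.",
  "",
  "```{r cars-summary}",
  "summary(cars)",
  "```",
  "",
  "```{r cars-plot, echo=FALSE}",
  "plot(cars)",
  "```"
)

# Write to a temporary folder so nothing in the project is overwritten.
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)

# Print the file as it would appear in the RStudio editor.
cat(readLines(rmd_path), sep = "\n")
```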

The plain text file contains three important types of content:

  1. A header section surrounded by --- with core document controls;
  2. Blocks of R code surrounded by ```; and
  3. Text narration mixed with simple text formatting like ## sub-heading and **bold**.

The example above is very simple in nature with only minimal formatting of text, data and plots.  The rendered output looks like this:

Document control and quality are explored next.

Text Formatting

Some basic text formatting commands appear below:

Syntax                            Description
*italics* or _italics_            italic text
**bold** or __bold__              bold text
superscript^2^                    superscript text
~~strikethrough~~                 strikethrough text
[link](www.google.com)            hyperlink
# Header 1                        headers with bold text in various font sizes
## Header 2
### Header 3
#### Header 4
##### Header 5
###### Header 6
$A = \pi * r^{2}$                 inline equation
![](path/to/file.png)             image insertion
* item1                           bullet list
* item2
    + sub-itemA
    + sub-itemB
1. item1                          ordered list
2. item2
    + sub-itemA
    + sub-itemB
```{r}                            code chunk
paste("Hello", "World")
```
`r paste("Hello", "World")`       inline code

Top-quality document formatting can also be achieved by embedding LaTeX commands in the RMarkdown file. An introduction to LaTeX and many common commands can be found here.

Global Document Options

Document output formats are controlled by options in the RMarkdown header. The following example shows an expanded header section. The output section is now set to produce both a PDF file and an HTML file for website use. Hence, multiple output documents can be produced simultaneously:
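
A header along these lines can be sketched as below. Every value in it (title, margins, the selection of LaTeX packages) is an assumption made for illustration. The header is held in an R raw string, which requires R 4.0 or later, written to a temporary file, and read back with yaml_front_matter() to confirm how rmarkdown interprets it.

```r
library(rmarkdown)

# Header text; all values below are placeholders chosen for illustration.
header_text <- r"(---
title: "Project Report"
subtitle: "Quarterly Update"
output:
  pdf_document:
    keep_tex: true
    number_sections: true
    fig_caption: true
  html_document: default
geometry: margin=2.5cm
papersize: a4
header-includes:
  - \usepackage{amsmath}
  - \usepackage{float}
  - \usepackage{xcolor}
  - \usepackage{pdflscape}
---

Report text starts here.)"

# Write the header to a temporary file and read it back as an R list.
rmd_path <- file.path(tempdir(), "header-example.Rmd")
writeLines(header_text, rmd_path)
header <- yaml_front_matter(rmd_path)

names(header$output)  # the output formats declared in the header
header$papersize      # the paper size setting
```

Listing two formats under output does not render both automatically from the console; render() with output_format = "all" is one way to produce every listed format in a single call.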

Looking closer at the header details above, several observations are worth sharing:

  • The output options will save the LaTeX *.tex file created by pandoc to render the PDF file. This can be useful for debugging large markdown documents (either in RStudio or in TeX, where more detailed debugging options are available);
  • The PDF output has been configured to include section and figure numbering, which is typical of more formal documents;
  • The header-includes section now attaches a large number of LaTeX packages for high-quality document control. The included packages enable equation editing, figure insertion and layout placement, non-standard font colors, footnoting, improved hyperlinks, and the ability to switch page layouts from portrait to landscape within a single document;
  • Finally, the header defines a sub-title, page geometry or dimensions and specifies a European paper size.

For large documents, the header section expands during document development as new options are required. As a result, some headers can be substantial.

Document Format Types

The following output formats are available to use with RMarkdown.  The supplied links provide detailed format instructions for setting up the header of the RMarkdown file:

Documents

Presentations (slides)

More

You can also build books, websites, and interactive documents with RMarkdown. There are also package solutions that use RMarkdown to format text to the specifications of different academic journals.

Code Chunk Options

The following table defines options that control the behavior of code chunks in the RMarkdown file:

Option                       Default           Description
EVALUATION
child                        NULL              A character vector of filenames. knitr will knit the files and place them into the main document.
code                         NULL              Set to R code. knitr will replace the code in the chunk with the code in the code option.
engine                       'R'               knitr will evaluate the chunk in the named language, e.g. engine = 'python'. Run names(knitr::knit_engines$get()) to see supported languages.
eval                         TRUE              If FALSE, knitr will not run the code in the code chunk.
include                      TRUE              If FALSE, knitr will run the chunk but not include the chunk in the final document.
purl                         TRUE              If FALSE, knitr will not include the chunk when running purl() to extract the source code.
RESULTS
collapse                     FALSE             If TRUE, knitr will collapse all the source and output blocks created by the chunk into a single block.
echo                         TRUE              If FALSE, knitr will not display the code in the code chunk above its results in the final document.
results                      'markup'          If 'hide', knitr will not display the code's results in the final document. If 'hold', knitr will delay displaying all output pieces until the end of the chunk. If 'asis', knitr will pass through results without reformatting them (useful if results return raw HTML, etc.).
error                        TRUE              If FALSE, knitr will not display error messages generated by the code and will stop knitting at the first error. TRUE is the knitr default; rmarkdown sets this option to FALSE.
message                      TRUE              If FALSE, knitr will not display any messages generated by the code.
warning                      TRUE              If FALSE, knitr will not display any warning messages generated by the code.
CODE FORMAT
comment                      '##'              A character string. knitr will prepend the string to each line of results in the final document.
highlight                    TRUE              If TRUE, knitr will highlight the source code in the final output.
prompt                       FALSE             If TRUE, knitr will add > to the start of each line of code displayed in the final document.
strip.white                  TRUE              If TRUE, knitr will remove white spaces that appear at the beginning or end of a code chunk.
tidy                         FALSE             If TRUE, knitr will tidy code chunks for display with the tidy_source() function in the formatR package.
CHUNKS
opts.label                   NULL              The label of options set in knitr::opts_template to use with the chunk.
R.options                    NULL              Local R options to use with the chunk. Options are set with options() at the start of the chunk. Defaults are restored at the end.
ref.label                    NULL              A character vector of labels of the chunks from which the code of the current chunk is inherited.
CACHE
autodep                      FALSE             If TRUE, knitr will attempt to figure out dependencies between chunks automatically by analyzing object names.
cache                        FALSE             If TRUE, knitr will cache the results to reuse in future knits. knitr will reuse the results until the code chunk is altered.
cache.comments               NULL              If FALSE, knitr will not rerun the chunk if only a code comment has changed.
cache.lazy                   TRUE              If TRUE, knitr will use lazyLoad() to load objects in the chunk. If FALSE, knitr will use load() to load objects in the chunk.
cache.path                   'cache/'          A file path to the directory to store cached results in. The path should begin in the directory that the .Rmd file is saved in.
cache.vars                   NULL              A character vector of object names to cache if you do not wish to cache each object in the chunk.
ANIMATION
aniopts                      'controls,loop'   Extra options for animations (see the LaTeX animate package).
interval                     1                 The number of seconds to pause between animation frames.
PLOTS
dev                          'png'             The R function name that will be used as a graphical device to record plots, e.g. dev='CairoPDF'.
dev.args                     NULL              Arguments to be passed to the device, e.g. dev.args=list(bg='yellow', pointsize=10).
dpi                          72                A number for knitr to use as the dots per inch (dpi) in graphics (when applicable).
external                     TRUE              If TRUE, knitr will externalize tikz graphics to save LaTeX compilation time (only for the tikzDevice::tikz() device).
fig.align                    'default'         How to align graphics in the final document. One of 'left', 'right', or 'center'.
fig.cap                      NULL              A character string to be used as a figure caption in LaTeX.
fig.env                      'figure'          The LaTeX environment for figures.
fig.ext                      NULL              The file extension for figure output, e.g. fig.ext='png'.
fig.height, fig.width        7                 The width and height to use in R for plots created by the chunk (in inches).
fig.keep                     'high'            If 'high', knitr will merge low-level changes into high-level plots. If 'all', knitr will keep all plots (low-level changes may produce new plots). If 'first', knitr will keep the first plot only. If 'last', knitr will keep the last plot only. If 'none', knitr will discard all plots.
fig.lp                       'fig:'            A prefix to be used for figure labels in LaTeX.
fig.path                     'figure/'         A file path to the directory where knitr should store the graphics files created by the chunk.
fig.pos                      ''                A character string to be used as the figure position arrangement in LaTeX.
fig.process                  NULL              A function to post-process a figure file. Should take a filename and return a filename of a new figure source.
fig.retina                   1                 Dpi multiplier for displaying HTML output on retina screens.
fig.scap                     NULL              A character string to be used as a short figure caption.
fig.subcap                   NULL              A character string to be used as captions in sub-figures in LaTeX.
fig.show                     'asis'            If 'hide', knitr will generate the plots created in the chunk, but not include them in the final document. If 'hold', knitr will delay displaying the plots created by the chunk until the end of the chunk. If 'animate', knitr will combine all of the plots created by the chunk into an animation.
fig.showtext                 NULL              If TRUE, knitr will call showtext::showtext_begin() before drawing plots.
out.extra                    NULL              A character string of extra options for figures to be passed to LaTeX or HTML.
out.height, out.width        NULL              The width and height to scale plots to in the final output. Can be in units recognized by the output, e.g. '.8\\linewidth' or '50px'.
resize.height, resize.width  NULL              The width and height to resize tikz graphics in LaTeX, passed to \resizebox{}{}.
sanitize                     FALSE             If TRUE, knitr will sanitize tikz graphics for LaTeX.
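
Options from the table can be set for a single chunk in its header, or once for all chunks with knitr::opts_chunk$set(). The sketch below shows the second form; the particular values and the chunk label in the comment are arbitrary choices for illustration.

```r
library(knitr)

# Defaults for every chunk that follows; usually placed in a setup chunk
# near the top of the .Rmd file.
opts_chunk$set(
  echo = FALSE,        # hide source code in the report
  warning = FALSE,     # hide warnings
  message = FALSE,     # hide messages
  fig.width = 6,       # plot width in inches
  fig.height = 4,      # plot height in inches
  fig.align = "center"
)

# A single chunk can still override these defaults in its own header, e.g.
#   {r model-fit, echo=TRUE, fig.width=8}

# Inspect the current default for one option.
opts_chunk$get("echo")
```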

Use of Other Programming Languages

Code chunks from other languages can also be integrated into RMarkdown documents. The support for multiple languages comes from the knitr package, which has a large number of language engines. For example, there are 50 programming languages supported by knitr:
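
The engines registered in a given installation can be listed from the console. This is a short sketch; the number of engines returned depends on the installed knitr version.

```r
library(knitr)

# Names of all language engines registered with the installed knitr version.
engines <- names(knit_engines$get())

length(engines)          # how many engines are available
head(sort(engines), 10)  # the first few names in alphabetical order
```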

To use a different language engine, simply change the language name in the chunk header from R to the engine name:

Shell scripts like bash (for the Linux and OSX operating systems) can also be run in RMarkdown as follows:
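
The engine name takes the place of r in the chunk header. The sketch below is a minimal illustration that assumes bash is installed: the chunk is passed to knitr as text so that the example runs without a separate .Rmd file, and it is knitted only when bash is found on the system. Replacing bash with python in the header would select the Python engine instead.

```r
library(knitr)

# Chunk source as it would appear in an .Rmd file; the header names the
# engine (bash here) instead of r.
chunk <- c(
  "```{bash}",
  "echo 'Hello from bash'",
  "```"
)

# Knit the chunk only when bash is available on this system.
if (nzchar(Sys.which("bash"))) {
  cat(knit(text = chunk, quiet = TRUE))
}
```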
