Review
What is literate programming? Why is it important for reproducible research?
Introduction to Markdown
Introduction to R Markdown.
Simple webpages
PDF papers
Presentations
22 February 2016
Example files are at:
Due: Midnight 4 March
Learning objectives: develop your understanding of
file structures,
version control,
basic R data structures and descriptive statistics.
Each pair will create a new public GitHub repository
Must be fully documented, including with a descriptive README.md file. Your code must be human readable and clearly commented.
Include R source code files that:
Access at least two core R data sets
Illustrate the data sets' distributions using a variety of relevant descriptive statistics
Two files must be dynamically linked
Another pair makes a pull request, which is then discussed and merged.
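A minimal sketch of what the R source files might contain, assuming that "dynamically linked" means one file calling another with source(). The file name describe.R and the helper describe() are hypothetical, and mtcars and swiss are just two of the core data sets you could pick.

```r
# Hypothetical helper file. In a real repository this would be a separate,
# committed script; it is written to a temporary folder here only so that
# the example is self-contained.
helper_path <- file.path(tempdir(), "describe.R")
writeLines(c(
  "# describe(): central tendency and spread of a numeric vector",
  "describe <- function(x) {",
  "  c(mean = mean(x), median = median(x), sd = sd(x),",
  "    min = min(x), max = max(x))",
  "}"
), helper_path)

# "Main" analysis file: link to the helper dynamically instead of copying it.
source(helper_path)

# Two core R data sets from the built-in datasets package.
data(mtcars)
data(swiss)

mpg_stats <- describe(mtcars$mpg)
fertility_stats <- describe(swiss$Fertility)

print(round(mpg_stats, 2))
print(round(fertility_stats, 2))
```

In a real repository a relative path such as source("describe.R") would replace the temporary path, so that the link keeps working on a collaborator's computer.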
In R, what is the difference between a matrix and a data frame?
What is the assignment operator?
What is the component selector?
What two things do you need to describe the distribution of a continuous variable?
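As a quick refresher, the questions above can be explored in a few lines of R. This is only a sketch: the object names are made up for illustration, and it takes the "two things" to be central tendency and variability, which is an assumption about the intended answer.

```r
# The assignment operator is <-
m <- matrix(1:6, nrow = 2)  # a matrix: every cell has the same type

# Mixing numbers and text in a matrix coerces everything to character.
mixed <- cbind(1:3, c("a", "b", "c"))

# A data frame: each column can have its own type.
df <- data.frame(id = 1:3, group = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# The component selector is $
ids <- df$id

# Describing a continuous variable: central tendency and variability.
x <- mtcars$mpg
centre <- mean(x)
spread <- sd(x)

str(m)
str(mixed)
str(df)
```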
With a partner:
What is the difference between relative and absolute file paths?
What is a commit?
What does it mean to pull and push a repo?
Donald Knuth (1992): explanation of a program using natural language interspersed with code snippets that are compilable by a computer.
This produces two representations of the program:
A formatted, easily human-readable document (e.g. a paper).
Source code that can be compiled by a computer.
Creates better programs. Programmers have to state their thinking explicitly and, in doing so, find flaws.
Clear documentation so that others can understand and build on the program more easily.
Quantitative social science is computer programming.
You are creating a program that gathers and analyses data.
You then advertise this work (a paper) in a way that is completely understandable to others.
Added benefit: allows you to automatically update documents when there are changes.
In addition to the computer language, we need:
Natural language part formatted using a markup language. Markup language: typesetting instructions. E.g. Markdown, \(\LaTeX\), HTML.
A way to tangle or weave the computer language part into the natural language part.
In R you can use Yihui Xie's knitr package.
Language dependent:
.Rmd → .html (using Markdown)
.Rnw → .pdf (using \(\LaTeX\))
Note: to use knitr with .Rnw files in RStudio, you need to go to Preferences > Sweave > Weave Rnw files using: knitr
Two parts:
Natural language part written in intended markup language.
R code (or almost any other language on your system) written in code chunks.
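The two parts, and the two representations they produce, can be sketched with knitr directly. This assumes the knitr package is installed; the document text and the chunk label mean-example are invented for illustration. knit() weaves prose, code, and results together, while purl() tangles out the code alone.

```r
library(knitr)

# Three backticks, built programmatically so this example stays readable.
fence <- strrep("`", 3)

# A tiny R Markdown source: one line of prose and one code chunk.
rmd_source <- paste(
  "The mean of the numbers one to ten is computed below.",
  "",
  paste0(fence, "{r mean-example}"),
  "mean(1:10)",
  fence,
  sep = "\n"
)

# Weave: Markdown containing the prose, the code, and its output.
woven <- knit(text = rmd_source, quiet = TRUE)

# Tangle: only the R code from the chunks.
tangled <- purl(text = rmd_source, quiet = TRUE)

cat(woven, "\n")
cat(tangled, "\n")
```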
Most of the focus is on RStudio's R Markdown.
Directly builds on knitr (Yihui works at RStudio now).
But uses Pandoc to be more output agnostic.
Originally created by John Gruber to be an easy way to:
write HTML files
that are human readable as text files.
HTML:
<h1>A header</h1>
<p>This is some text with a <a href="http://www.example.com">link</a></p>
<p>Here is some <strong>bold</strong> text.</p>
Markdown:
# A header
This is some text with a [link](http://www.example.com).
Here is some **bold** text.
# Header 1
## Header 2
### Header 3
And so on.
Horizontal lines:
---
Bold text:
**bold**
Italics:
*italics*
Links:
[link](http://www.example.com)
Images:
![alt text](path/to/image.png)
Unordered Lists:
- An item
- An item
- An item
Ordered Lists:
1. Item one
2. Item two
3. Item three
| Name   | Something |
| ------ | --------- |
| Stuff  | Things    |
| Things | Stuff     |
Name | Something |
---|---|
Stuff | Things |
Things | Stuff |
R Markdown from RStudio supports MathJax. So, you can write any \(\LaTeX\) math with R Markdown.
Inline equations have one dollar sign $s^2 = \frac{\sum(x - \bar{x})^2}{n - 1}$.
Inline equations have one dollar sign \(s^2 = \frac{\sum(x - \bar{x})^2}{n - 1}\).
Display equations have two dollar signs:
$$s^2 = \frac{\sum(x - \bar{x})^2}{n - 1}$$
\[s^2 = \frac{\sum(x - \bar{x})^2}{n - 1}\]
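To connect the equation to R, here is a small worked example of the sample variance formula, checked against the built-in var(). The numbers in x are made up for illustration.

```r
# Illustrative data (not from the lecture).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# s^2 = sum((x - x_bar)^2) / (n - 1)
x_bar <- mean(x)
s2 <- sum((x - x_bar)^2) / (n - 1)

# The hand-computed value should agree with R's var().
print(s2)
print(all.equal(s2, var(x)))
```

Here the mean is 5 and the squared deviations sum to 32, so the sample variance is 32 / 7.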
You can include any HTML syntax in a Markdown document. You can also change the formatting by adding a custom CSS file (just like a website).
However, this will only render in HTML output.
Likewise, you can include \(\LaTeX\) syntax (beyond math) in your R Markdown document, but it will only render in PDF output.
To format text as inline code, surround it with single backticks: `like this`.
To make an inline chunk knit-able, start it with a backtick followed by r.
For example:
Two plus two equals `r 2 + 2`.
Produces:
Two plus two equals 4.
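You can try the inline example without creating a file by passing the text straight to knitr. This is a sketch that assumes the knitr package is installed; the variable names are hypothetical.

```r
library(knitr)

# The Markdown source, with a knit-able inline chunk.
inline_source <- "Two plus two equals `r 2 + 2`."

# knit() evaluates the inline R code and substitutes the result.
rendered <- knit(text = inline_source, quiet = TRUE)

cat(rendered, "\n")
```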
Use three ticks (```) to start and end a code chunk that is not run.
To create a knit-able code chunk, begin the chunk with ```{r} and end it with three backticks.
You can turn any matrix or data frame into a well-formatted table with the knitr function kable.
knitr::kable(mtcars)
Make sure that the code chunk option results='asis' is set.
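To see what kable() actually hands to Markdown, you can run it in the console. A small sketch, assuming knitr is installed; the subset of mtcars and the explicit format = "markdown" argument are choices made here for illustration.

```r
library(knitr)

# First six rows and three columns, to keep the table small.
small_cars <- head(mtcars[, 1:3])

# Ask explicitly for a Markdown table so the result is the same
# whether or not the code is run inside a knitted document.
tab <- kable(small_cars, format = "markdown")

# The result is plain text: one pipe-delimited line per table row.
print(tab)
```

Because the result is already Markdown, it only needs to be passed through to the output document unchanged.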
This R Markdown file can be compiled to PDF (via \(\LaTeX\)) or MS Word with RStudio.
Change how R Markdown chunks behave with options. Place options in the chunk head: ```{r echo=FALSE, error=FALSE}
Option | What it Does |
---|---|
echo=FALSE | Does not print the code, only the output |
error=FALSE | Stops knitting when a chunk throws an error, rather than printing the error in the output |
include=FALSE | Does not include the code or output, but does run the code |
fig.width | Sets figure width |
cache=TRUE | Caches the chunk. It is only re-run when its contents change. |
Many others at http://yihui.name/knitr/options
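Options can also be set once for every chunk rather than in each chunk head. The sketch below uses knitr's opts_chunk object for this, which goes beyond what these slides cover; it assumes knitr is installed, and the chosen values are arbitrary.

```r
library(knitr)

# Remember the current settings so they can be restored afterwards.
old_opts <- opts_chunk$get(c("echo", "fig.width"))

# Set defaults for all chunks that follow (typically done in the first chunk).
opts_chunk$set(echo = FALSE, fig.width = 5)

echo_now <- opts_chunk$get("echo")
width_now <- opts_chunk$get("fig.width")
print(echo_now)
print(width_now)

# Restore the previous settings.
opts_chunk$set(old_opts)
```

An option written in an individual chunk head still overrides the global default for that chunk.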
These lecture slides are created using R Markdown.
All of the syntax is the same, except:
## does not mean Header 2. It creates a new slide and its title.
You can create a slide with no title using ---.
The header lets you make changes to the whole document.
This presentation's head is:
---
title: 'MPP-E1180 Lecture 4: Intro to Markup Lang. & Literate Programming (1)'
author: "Christopher Gandrud"
date: "22 February 2016"
output:
  ioslides_presentation:
    css: https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css
    logo: https://raw.githubusercontent.com/christophergandrud/Hertie_Collab_Data_Science/master/img/HertieCollaborativeDataLogo_v1.png
  beamer_presentation: default
---
We'll look at headers more next lecture.
For a really good R Markdown cheatsheet see: https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf.
Convert the work you have done so far on Pair Assignment 1 to R Markdown and output it to multiple formats.
Create a basic R Markdown presentation.
Begin trying to find a partner for the Collaborative Research Project.
Discuss topics you might be interested in researching.