25 November 2016

Objectives for the week

  • Collaborative Research Project

  • Next Class

  • Review

  • Static maps with ggmap

  • Dynamic results presentation

    • Static website hosting with gh-pages

Collaborative Research Project (1)

Purposes: Pose an interesting research question and try to answer it using data analysis and standard academic practices. Effectively communicate your results to a variety of audiences in a variety of formats.

Deadline:

  • Presentation: In-class 2 December

  • Website/Paper: 16 December 2016

Collaborative Research Project (2)

The project can be thought of as a 'dry run' for your thesis with multiple presentation outputs.

Presentation: 10 minutes maximum. Engagingly present your research question and key findings to a general academic audience (fellow students).

Paper: 5,000 words maximum. Standard academic paper, properly cited laying out your research question, literature review, data, methods, and findings.

Website: An engaging website designed to convey your research to a general audience.

Collaborative Research Project (3)

Project total: 50% of your final mark.

  • 10% presentation

  • 10% website

  • 30% paper

Collaborative Research Project (4)

As always, you should submit one GitHub repository with all of the materials needed to completely reproduce your data gathering, analysis, and presentation documents.

Note: Because you've had two assignments already to work on parts of the project, I expect high quality work.

Collaborative Research Project (5)

Find one other group to be a discussant for your presentation.

The discussants will provide a quick (max 2 minute) critique of your presentation–ideas for things you can improve on your paper–pose questions.

Office hours

I will have normal office hours this week and next week.

Please take advantages of this opportunity to improve your final project.

Be prepared.

Review

  • What is the basic R syntax for a regression model?

  • What is a model function? What two parts do GLM model functions have?

  • How do you find a 95% confidence interval for a parameter point estimate (both mathematically and in R)?

  • What are some more effective ways to present results from a logistic regression to both statistical and general audiences?

ggmap

Earlier we didn't have time to cover mapping with ggmap.

We've already seen how ggmap can be used to find latitude and longitude.

library(ggmap)

places <- c('Bavaria', 'Seoul', '6 Pariser Platz, Berlin')

geocode(places)
##         lon      lat
## 1  11.49789 48.79045
## 2 126.97797 37.56654
## 3  13.37854 52.51701

ggmap: get the map

qmap(location = 'Berlin', zoom = 15)

Plot Houston crime with ggmap

Example from: Kahle and Wickham (2013)

Use crime data set that comes with ggmap

names(crime)
##  [1] "time"     "date"     "hour"     "premise"  "offense"  "beat"    
##  [7] "block"    "street"   "type"     "suffix"   "number"   "month"   
## [13] "day"      "location" "address"  "lon"      "lat"

Clean data

# find a reasonable spatial extent
qmap('houston', zoom = 13) # gglocator(2) see in RStudio

Clean data

# only violent crimes
violent_crimes <- subset(crime,
    offense != "auto theft" & offense != "theft" &
    offense != "burglary")

# order violent crimes
violent_crimes$offense <- factor(violent_crimes$offense,
    levels = c("robbery", "aggravated assault", "rape", "murder"))

# restrict to downtown
violent_crimes <- subset(violent_crimes,
                    -95.39681 <= lon & lon <= -95.34188 &
                    29.73631 <= lat & lat <= 29.78400)

Plot crime data

# Set up base map
HoustonMap <- qmap("houston", zoom = 14,  
                   source = "stamen", maptype = "toner")

# Add points
FinalMap <- HoustonMap +
                geom_point(aes(x = lon, y = lat, colour = offense),
                data = violent_crimes) +
                xlab('') + ylab('') +
                theme(axis.ticks = element_blank(),
                      axis.text.x = element_blank(),
                      axis.text.y = element_blank()) +
                guides(size = guide_legend(title = 'Offense'),
                       colour = guide_legend(title = 'Offense'))

print(FinalMap)

Interactive visualisations

When your output documents are in HTML, you can create interactive visualisations.

Potentially–though not always–more engaging and could let users explore data on their own.

Interactive visualisations

Big distinction:

Client Side: Plots are created on the user's (client's) computer. Often JavaScript in the browser. You simply send them static HTML/JavaScript needed for their browser to create the plots.

Server Side: Data manipulations and/or plots (e.g. with Shiny Server) are done on a server in R. Browsers don't come with R built in.

Hosting

There are lots of free services (e.g. GitHub Pages) for hosting webpages for client side plot rendering.

You usually have to use a paid service for server side data manipulation plotting.

Server Side Applications

You can use R to (relatively) easily create server side web applications with R.

To do this use Shiny.

We are not going to cover Shiny in the class as it usually requires a paid service to host.

Set up for Creating Websites with Client Side Visualisations

You already know how to create HTML documents with R Markdown.

results='asis' in code chunk head (not needed for some packages).

There is a growing set of tools for interactive plotting, e.g.:

Caveat



These packages simply create an interface between R and (usually) JavaScript.

Debugging often requires some knowledge of JavaScript and the DOM.

In sum: usually simple, but can be mysteriously difficult without a good knowledge of JavaScript/HTML.

 ggplot2 and plotly

The plotly package allows you to convert (most) ggplot2 plots to JavaScript.

Simply create your ggplot2 object, then pass it to ggplotly.

Using an example from last class:

mort_plot <- ggplot(data = MortalityGDP, aes(x = InfantMortality,
        y = GDPperCapita)) + geom_point()

Then . . .

library(plotly)
ggplotly(mort_plot)

Or use only plotly

plot_ly(MortalityGDP, x = ~InfantMortality, y = ~GDPperCapita,
        mode = 'markers')

Google Plots with googleVis

The googleVis package can create Google plots from R.

# Create fake data
fake_compare <- data.frame(
                country = c('2010', '2011', '2012'),
                US = c(10,13,14),
                GB = c(23,12,32))

(Example modified from googleVis Vignettes.)

googleVis simple example

library(googleVis)
line_plot <- gvisLineChart(fake_compare)
print(line_plot, tag = 'chart')

Note: To show in interactive R use plot instead of print and don't include tag = 'chart'.

Choropleth map with googleVis

library(WDI)
co2 <- WDI(indicator = 'EN.ATM.CO2E.PC', start = 2010, end = 2010)
co2 <- co2[, c('iso2c','EN.ATM.CO2E.PC')]

# Clean
names(co2) <- c('iso2c', 'CO2 Emissions per Capita')
co2[, 2] <- round(log(co2[, 2]), digits = 2)

# Plot
co2_map <- gvisGeoChart(co2, locationvar = 'iso2c',
                      colorvar = 'CO2 Emissions per Capita',
                      options = list(
                          colors = "['#fff7bc', '#d95f0e']"
                          ))

CO2 Emissions (metric tons per capita)

print(co2_map, tag = 'chart')

More Examples

Hosting a website on GitHub Pages

Set Up GitHub Pages

First create a new branch in your repository called gh-pages:

Set Up GitHub Pages

Then sync your branch with the local version of the repository.


Finally switch to the gh-pages branch.

GitHub Pages and R Markdown

You can use R Markdown to create the `index.html page.

Simply place a new .Rmd file in the repository called index.Rmd and knit it to HTML. Then sync it.

Your website will now be live.

Every time you push to the gh-pages branch, the website will be updated.

Note




Note branches in git repositories can have totally different files from one another.

Example: networkD3

Interactive dashboards with flexdashboard

You can create interactive 'dashboards' for displaying an information overview using the flexdashboard package.

flexdashboard syntax

flexdashboard builds on R Markdown.

To set a .Rmd file as a flexdasboard, in the header use:

output: flexdashboard::flex_dashboard

Each element of the dashboard is delimited with the Markdown third level header: ###.

You can create different columns and rows with:

Column
-------------------------------------
Row
-------------------------------------

flexdashboard example

Seminar

Begin to create a website for your project with RMarkdown and graphics (either static or interactive).

If relevant include:

  • A table of key results

  • A googleVis map

  • A bar or line chart with plotly or other package

  • A simulation plot created with Zelig or other tool showing key results from your analysis.

Push to the gh-pages branch.