Principles of Reactivity

Joe Cheng <joe@rstudio.com>

#ShinyDevConf — January 30, 2016

Welcome!

Warm up: Side effects

A function can be executed for one or both of the following reasons:

You want its return value.
You want it to have some other effect.

Effects of the second kind are (a bit misleadingly) called side effects. Any effect that is not the return value is a side effect.

Functions with side effects

write.csv(...)

plot(cars)

print(x)

httr::POST(...)

alarm()

More side effects

# Sets a variable in a parent environment
value <<- 10

# Loads into global env by default
source("functions.R")

# Modifies the global search list
library(dplyr)

# Only if foo is an env, ref class, or R6
foo$bar <- TRUE

NOT side effects (when inside a function)

# Modifying *local* variables
value <- 10

# Creating most kinds of objects
list(a = 1, b = 2)

# Data frames are pass-by-value in R so this is OK
dataset <- dataset %>% filter(count > 3)

# Most calculations
a + 1
summary(pressure)
lm(speed ~ dist, data = cars)
predict(wfit, interval = "prediction")

Ehhh… Not side effects

# Reading from disk
readLines("data.csv")

# Making HTTP GET requests
httr::GET("https://api.example.com/data.json")

# Reading global variables
.Random.seed

# Modifying the random seed... ehhhhhh...
runif(10)

If executing your function/expression leaves the state of the world a little different than before it executed, it has side effects.

But if “what happens in func, stays in func” (besides the return value), then it doesn’t have side effects.
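To make the distinction concrete, here is a small sketch in base R. The names add_one, add_one_and_count, and call_count are made up for illustration. Both functions return the same value, but only the second leaves the world different than it found it.

```r
# Hypothetical names, for illustration only.
call_count <- 0

# No side effects: the only observable result is the return value.
add_one <- function(x) {
  x + 1
}

# Side effect: also modifies a variable outside the function.
add_one_and_count <- function(x) {
  call_count <<- call_count + 1
  x + 1
}

a <- add_one(1)            # a is 2, nothing else changed
b <- add_one_and_count(1)  # b is 2, and call_count is now 1
b2 <- add_one_and_count(1) # same answer, but call_count is now 2
```

Calling add_one twice is indistinguishable from calling it once; calling add_one_and_count twice is not.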

Side effect quiz

For each function, write Yes if it has side effects, and No if not.

Question 1

function(a, b) {
  (b - a) / a
}

Question 2

function(values) {
  assign("values", values, envir = globalenv())
  values
}

Question 3

function() {
  options(digits.secs = 6)
  as.character(Sys.time())
}

Question 4

function(df) {
  df$foo <- factor(df$foo)
  df
}

Question 5

function() {
  readLines("~/data/raw.txt")
}

Question 6

function(values) {
  hist(values, plot = TRUE)
}

Question 7

function() {
  # Create temp file, and delete when function exits
  filePath <- tempfile(fileext = ".png")
  on.exit(unlink(filePath))

  # Plot to the temp file as PNG image
  png(filePath); plot(cars); dev.off()

  # Return the contents of the temp file
  readBin(filePath, "raw", n = file.info(filePath)$size)
}

Answers

1. No
2. Yes
3. Yes
4. No
5. No
6. Yes
7. Mostly no
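The answer to Question 4 tends to surprise people coming from other languages, so here is a quick check. This is a sketch with made-up data; to_factor and original are hypothetical names. Because R copies a data frame when a function modifies it, the caller's object is untouched.

```r
# Hypothetical function and data, for illustration only.
to_factor <- function(df) {
  df$foo <- factor(df$foo)  # modifies the function's local copy only
  df
}

original <- data.frame(foo = c("a", "b", "a"), stringsAsFactors = FALSE)
converted <- to_factor(original)

class(original$foo)   # still "character"
class(converted$foo)  # "factor"
```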

Side effects make code harder to reason about, since order of execution of different side-effecty functions can matter (in non-obvious ways).

But we still need them. Without side effects, our programs are useless! (If a program executes but has no observable interactions with the world, you may as well have not executed it at all!)

Reactive programming

Reactivity can be your best friend—or your worst enemy. If you follow some rules of the road, and trust them, then you’ll end up moving in the right direction.

We haven’t been very upfront about these rules; mostly I’ve disseminated them in replies to shiny-discuss threads. So even if you’ve been following Shiny development pretty closely, it’s quite likely that some of the things I’ll discuss today will be news to you.

One of my top priorities in 2016 is to get the message out there about how to use reactivity properly, and it starts right here, at this conference, in this tutorial. So your feedback is most welcome after the tutorial.

You ignore these principles at your peril! The temptation is especially strong among smart, experienced programmers. Resist it—at least until you’ve tried to do it the right way first. These aren’t rules that people say but don’t expect anyone to completely follow, like “write unit tests for every function”, “floss after every meal”, etc. These are more like, “bring your car to a stop when you come to a stop sign”.

If you’ve tried to do it the right way and still really want to break these rules, email me at joe@rstudio.com and let’s talk about it. But please, do that before sinking weeks or months into your app, while I can still help you!

Ladder of Enlightenment

1. Made it halfway through the tutorial. Has used output and input.
2. Made it entirely through the tutorial. Has used reactive expressions (reactive()).
3. Has used observe() and/or observeEvent(). Has written reactive expressions that depend on other reactive expressions. Has used isolate() properly.
4. Can say confidently when to use reactive() vs. observe(). Has used invalidateLater.
5. Writes higher-order reactives (functions that have reactive expressions as input parameters and return values).
6. Understands that reactive expressions are monads.

I’d like to propose a ladder of Shiny reactivity “enlightenment”.

Take a moment to read this list, then discuss with the people around you where you currently rank. Don’t be shy or embarrassed if you’re at level one or two, we’re all here to learn! Go ahead, I’ll give you two minutes.

How many of you feel like you’re at levels one or two?

How many are at level three?

How many are at level four?

Anyone besides Hadley and Winston at five or six?

So at level three, you can write quite complicated applications. And many of you have. This is a dangerous zone. Your apps generally work, but sometimes you struggle with why things are executing too much, or not enough. Each new feature you add to your app seems to increase the overall complexity superlinearly.

Our goal today is to get everyone, or at least most of you, to level four. When you have a firm grasp on the reactive primitives we’ve built into Shiny, you can build complicated networks of reactive expressions and observers, with confidence. Combine that knowledge with the new modules feature, which Garrett will talk about tomorrow, and you’ve got all the tools you need to write large yet maintainable Shiny apps.

Level five or six is where the real fun begins. We won’t get there today, but if you’re interested in learning more, please let me know! I’d love to talk to you. Maybe we can organize a group vchat or webinar or something, and eventually spin that into an article or three.

Exercise 0

Open Exercise_00.R and complete the server function. Make the plot output show a simple plot of the first nrows rows of a built-in dataset.

You have 5 minutes!

Hint: plot(head(cars, nrows))

We’ll get started with a really basic example app, just to get the juices flowing a little bit.

Open up Exercise_00.R; it should be in your Files pane. You should see the beginnings of a Shiny app. The UI definition is complete, but the server function is blank. I want you to fill in that server function. Make the plot output show a simple plot of the first nrows rows of a built-in dataset of your choice. If you can’t think of any, use cars.

So basically, make the Shiny equivalent of this: plot(head(cars, nrows))

I’ll give you five minutes. That might be way too much time for some of you, but it’ll give us a chance to shake out any technical issues. If you need help, talk to your neighbors, or flag down one of the TAs or myself. If you have extra time, get to know your neighbors a little more.

Solution

output$plot <- renderPlot({
  plot(head(cars, input$nrows))
})

Anti-solution

observe({
  df <- head(cars, input$nrows)
  output$plot <- renderPlot(plot(df))
})

output$plot1 <- renderPlot(...)

DOESN’T mean: “Go update the output "plot1" with the result of this code.”
DOES mean: “This code is the recipe that should be used to update the output "plot1".”

Historically, we’ve asked you to take it on faith that whenever input$nrows changes, any dependent outputs, reactive expressions, and observers will do the right thing. But how does Shiny know how the code is related? How does it know which outputs depend on which inputs, reactives, etc.?

There are really two possibilities: static analysis, where we’d examine your code, looking for reactive-looking things; and runtime analysis, where we’d execute your code and see what happens.

We do the latter. Shiny just executes your code and sees what happens. It eavesdrops to see what reactive values (like input) or reactive expressions your output reads, and whatever it reads is considered a “dependency”. Any changes to one of those dependencies means the output is considered out-of-date, or “invalidated”, and might need to be re-executed.
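Here is a rough sketch of what that eavesdropping implies, run at the top level of an R session rather than inside an app. The names use_a, a, b, chosen, and runs are invented for this example, and the runs counter is instrumentation only (a side effect you would not put in a real reactive). isolate() is used simply to read a reactive outside of an app. Because dependencies are discovered by running the code, a value that was never read is not a dependency.

```r
library(shiny)

runs <- 0  # instrumentation only, to see when the code executes

use_a <- reactiveVal(TRUE)
a <- reactiveVal(1)
b <- reactiveVal(100)

chosen <- reactive({
  runs <<- runs + 1
  if (use_a()) a() else b()   # b() is not read while use_a() is TRUE
})

first <- isolate(chosen())   # executes the code: reads use_a and a
runs_after_first <- runs     # 1

b(200)                       # b was never read, so nothing is invalidated
second <- isolate(chosen())  # served from cache
runs_after_b <- runs         # still 1

a(2)                         # a was read, so chosen is invalidated
third <- isolate(chosen())   # executes again
runs_after_a <- runs         # 2
```

A static analysis of the code would have to assume that chosen depends on b; the runtime approach knows that, so far, it does not.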

Takeaway

Know the difference between telling Shiny to do something, and telling Shiny how to do something.

Reactive expressions

Expressions that are reactive (obviously)

Expression: Code that produces a value
Reactive: Detects changes in anything reactive it reads

function(input, output, session) {
  # When input$min_size or input$max_size change, large_diamonds
  # will be notified about it.
  large_diamonds <- reactive({
    diamonds %>%
      filter(carat >= input$min_size) %>%
      filter(carat < input$max_size)
  })
  
  # If that happens, large_diamonds will notify output$table.
  output$table <- renderTable({
    large_diamonds() %>% select(carat, price)
  })


  # Reactive expressions can use other reactive expressions.
  mean_price <- reactive({
    mean(large_diamonds()$price)
  })
  
  # large_diamonds and mean_price will both notify output$message
  # of changes they detect.
  output$message <- renderText({
    paste0(nrow(large_diamonds()), " diamonds in that range, ",
      "with an average price of $", mean_price())
  })
}

function(input, output, session) {
  
  # This DOESN'T work.
  large_diamonds <- diamonds %>%
    filter(carat >= input$min_size) %>%
    filter(carat < input$max_size)
  
  output$table <- renderTable({
    large_diamonds %>% select(carat, price)
  })
}

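large_diamonds would be calculated at most once, as the session starts (i.e. as the page first loads in a browser). In practice Shiny does not even get that far: reading input$min_size outside of a reactive expression, output, or observer raises an error.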

Exercise 1

Open up the file Exercise_01.R.

There’s a new tableOutput("table") in ui.R. Have it show the same data frame that is being plotted, using renderTable.

Make sure that the head() operation isn’t performed more than once for each change to input$nrows.

You have 5 minutes.

Solution

function(input, output, session) {

  df <- reactive({
    head(cars, input$nrows)
  })
  
  output$plot <- renderPlot({
    plot(df())
  })
  
  output$table <- renderTable({
    df()
  })
}
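If you want to convince yourself that head() really runs once per change, one option is to count calls under shiny::testServer(). This is a sketch, not part of the exercise files: head_calls is a hypothetical counter added purely as instrumentation, req() guards against the input not being set yet, and a renderText output stands in for the plot so the sketch runs without a graphics device.

```r
library(shiny)

head_calls <- 0  # instrumentation only; hypothetical name

server <- function(input, output, session) {
  df <- reactive({
    req(input$nrows)
    head_calls <<- head_calls + 1
    head(cars, input$nrows)
  })

  # Two outputs share the same reactive expression.
  output$table <- renderTable(df())
  output$count <- renderText(nrow(df()))  # stand-in for the plot
}

testServer(server, {
  session$setInputs(nrows = 5)
  stopifnot(head_calls == 1)   # two outputs, one head() call

  session$setInputs(nrows = 10)
  stopifnot(head_calls == 2)   # one more call for the new value
})
```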

Anti-solution 1

function(input, output, session) {

  values <- reactiveValues(df = cars)
  observe({
    values$df <- head(cars, input$nrows)
  })
  
  output$plot <- renderPlot({
    plot(values$df)
  })
  
  output$table <- renderTable({
    values$df
  })
}

Anti-solution 2

function(input, output, session) {

  df <- cars
  observe({
    df <<- head(cars, input$nrows)
  })
  
  output$plot <- renderPlot({
    plot(df)
  })
  
  output$table <- renderTable({
    df
  })
}

Let’s forget about that last one, since it doesn’t work. What about the previous two? Let’s talk about what they do. The first one uses a reactive expression to store the calculation. The second one creates a reactive values object and uses an observer to keep the value up-to-date. Who prefers the first approach? Who prefers the second?

So we mostly agree that the first approach is superior. But why? It might feel like I’m just setting up strawmen, but I see this kind of code all the time on the shiny-discuss mailing list. It seems obvious when we lay it bare with a minimal example like this, but in the context of a more complicated app, it can be much trickier.

We shouldn’t take the second approach, but why not? What’s the first-principles reason to avoid this kind of code? We need some first principles to build from, and by the end of the tutorial you should be able to answer these questions with confidence.

Takeaway

Prefer using reactive expressions to model calculations, over using observers to set (reactive) variables.

Exercise 2

Open up the file Exercise_02.R.

This is a working app–you can go ahead and run it. You choose variables from the iris (yawn) data set, and on various tabs it shows information about the selected variables and fits a linear model.

The problem right now is that each of the four outputs contains copied-and-pasted logic for selecting out your chosen variables, and for building the model. Can you refactor the code so it’s more maintainable and efficient?

You have 5 minutes.

Solution

selected <- reactive({
  iris[, c(input$xcol, input$ycol)]
})

model <- reactive({
  lm(paste(input$ycol, "~", input$xcol), selected())
})

Here’s what we’ve got: two reactive expressions, one of which depends on the other (model calls selected()).

This is the cool thing about reactive expressions: they compose.

(show diagram)

If you think of Shiny apps as network graphs, then reactive values (like inputs) form one kind of leaf node; outputs and observers form another kind of leaf node; and reactive expressions are the nodes in the middle that can form arbitrarily deep links.

(Ctrl-F3 to show the reactive log)

Once we have this beautiful (ok, not that beautiful, we’re going to work on it…) graph, Shiny can use it to optimize its calculations. It’s a little-known fact that Shiny outputs generally “know” when they’re not visible on the page, and suspend themselves. When they do that, they no longer cause reactive expressions to execute (because reactive expressions are lazy).

In this case, we don’t have to perform the model fitting unless and until a different tab is selected.

Anti-solution

  # Don't do this!
  
  # Introduce reactive value for each calculated value
  values <- reactiveValues(selected = NULL, model = NULL)
  
  # Use observers to keep the values up-to-date
  observe({
    values$selected <- iris[, c(input$xcol, input$ycol)]
  })
  
  observe({
    values$model <- lm(paste(input$ycol, "~", input$xcol), values$selected)
  })
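One way to see what the anti-solution costs is to count how often each style fits the model when nothing is asking for it. The sketch below is an assumption-laden toy, not the exercise code: eager_runs and lazy_runs are hypothetical counters used only as instrumentation, the model is fit on all of iris for brevity, and shiny::testServer() plays the role of the browser.

```r
library(shiny)

eager_runs <- 0  # instrumentation only
lazy_runs <- 0   # instrumentation only

server <- function(input, output, session) {
  # Anti-solution style: an observer keeps a reactive value up to date.
  values <- reactiveValues(model = NULL)
  observe({
    req(input$xcol, input$ycol)
    eager_runs <<- eager_runs + 1
    values$model <- lm(paste(input$ycol, "~", input$xcol), iris)
  })

  # Solution style: a reactive expression that nobody reads (yet).
  model <- reactive({
    req(input$xcol, input$ycol)
    lazy_runs <<- lazy_runs + 1
    lm(paste(input$ycol, "~", input$xcol), iris)
  })
}

testServer(server, {
  session$setInputs(xcol = "Sepal.Length", ycol = "Sepal.Width")
  session$setInputs(xcol = "Petal.Length")
})

eager_runs  # 2: the observer refit the model on every change
lazy_runs   # 0: the reactive expression was never asked for a value
```

The observer does the work whether or not any output needs it; the reactive expression waits until somebody asks.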

Takeaway

Seriously, prefer using reactive expressions to model calculations, over using observers to set (reactive) variables.

Observers

Observers are blocks of code that perform actions.

They’re executed in response to changing reactive values/expressions.

They don’t return a value.

observe({
  cat("The value of input$x is now ", input$x, "\n")
})

Observers come in two flavors

Implicit: Depend on all reactive values/expressions encountered during execution.
observe({...})
Explicit: Just depend on specific reactive value/expression; ignore all others. (Also known as “event handler”.)
observeEvent(eventExpr, {...})

function(input, output, session) {

  # Executes immediately, and repeats whenever input$x changes.
  observe({
    cat("The value of input$x is now ", input$x, "\n")
  })
  
  # Only executes when input$upload_button is pushed. Any reactive
  # values/expressions encountered in the code block are treated
  # as non-reactive values/expressions.
  observeEvent(input$upload_button, {
    httr::POST(server_url, body = jsonlite::toJSON(dataset()))
  })
}
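To watch the two flavors behave differently, here is a small sketch driven by shiny::testServer(). The input names x and go and the events log are invented for this example; req() just keeps the implicit observer quiet until x has a value.

```r
library(shiny)

events <- character()  # hypothetical log of what ran, in order

server <- function(input, output, session) {
  # Implicit: depends on everything it reads (here, input$x).
  observe({
    req(input$x)
    events <<- c(events, paste("observe: x =", input$x))
  })

  # Explicit: depends only on input$go. Reading input$x in the
  # handler does not make the handler depend on it.
  observeEvent(input$go, {
    events <<- c(events, paste("observeEvent: x =", input$x))
  })
}

testServer(server, {
  session$setInputs(x = 1)   # only the implicit observer runs
  session$setInputs(x = 2)   # only the implicit observer runs
  session$setInputs(go = 1)  # only the event handler runs
})

events
# [1] "observe: x = 1"      "observe: x = 2"      "observeEvent: x = 2"
```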

Exercise 3

Open Exercise_03.R.

Add server logic so that when the input$save button is pressed, the data is saved to a CSV file called "data.csv" in the current directory.

You have 5 minutes!

Solution

# Use observeEvent to tell Shiny what action to take
# when input$save is clicked.
observeEvent(input$save, {
  write.csv(df(), "data.csv")
})

Reactive expressions vs. observers

`reactive()`

It can be called and returns a value, like a function. Either the last expression, or return().
It’s lazy. It doesn’t execute its code until somebody calls it (even if its reactive dependencies have changed). Also like a function.
It’s cached. The first time it’s called, it executes the code and saves the resulting value. Subsequent calls can skip the execution and just return the value.
It’s reactive. It is notified when its dependencies change. When that happens, it clears its cache and notifies its dependents.

function(input, output, session) {
  reactive({
    # This code will never execute!
    cat("The value of input$x is now ", input$x, "\n")
  })
}

r1 <- function() { runif(1) }
r1()
# [1] 0.8403573
r1()
# [1] 0.4590713
r1()
# [1] 0.9816089

r2 <- reactive({ runif(1) })
r2()
# [1] 0.5327107
r2()
# [1] 0.5327107
r2()
# [1] 0.5327107

The fact that reactive expressions are lazy and cached is critical.

It’s hard to reason about when reactive expressions will execute their code—or whether they will be executed at all.

All Shiny guarantees is that when you ask a reactive expression for an answer, you get an up-to-date one.
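The sketch below puts lazy, cached, and reactive together in one place. It runs at the top level of an R session, using isolate() to read the reactive outside of an app. The names x, doubled, and runs are made up, and the runs counter is instrumentation only.

```r
library(shiny)

runs <- 0  # instrumentation only, to see when the code executes

x <- reactiveVal(1)
doubled <- reactive({
  runs <<- runs + 1
  x() * 2
})

# Lazy: changing the dependency does not execute anything by itself.
x(2)
x(3)
runs_before_read <- runs          # 0

# First call executes the code; you get an up-to-date answer.
first <- isolate(doubled())       # 6
# Cached: the second call reuses the saved value.
second <- isolate(doubled())      # 6
runs_after_reads <- runs          # 1

# Reactive: a change clears the cache, so the next call re-executes.
x(10)
third <- isolate(doubled())       # 20
runs_after_change <- runs         # 2
```

Notice that the intermediate value x = 2 was never doubled at all. That is the sense in which you can't count on a reactive expression's code executing.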

`observe()` / `observeEvent()`

It can’t be called and doesn’t return a value. The value of the last expression will be thrown away, as will values passed to return().
It’s eager. When its dependencies change, it executes right away.
(Since it can’t be called and doesn’t have a return value, there’s no notion of caching that applies here.)
It’s reactive. It is notified when its dependencies change, and when that happens it executes (not right at that instant, but ASAP).

`reactive()`	`observe()`
Callable	Not callable
Returns a value	No return value
Lazy	Eager
Cached	N/A

reactive() is for calculating values, without side effects.
observe() is for performing actions, with side effects.

A calculation is a block of code where you don’t care about whether the code actually executes—you just want the answer. Safe for caching. Use reactive().

An action is where you care very much that the code executes, and there is no answer (return value), only side effects. Use observe()/observeEvent().

(What if you want both an answer AND you want the code to execute? Refactor into two code chunks–separate the calculation from the action.)
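As a sketch of that refactoring: suppose you want both the first few rows of a data set (an answer) and a CSV file on disk (an action). The names top_rows, out_file, and the inputs nrows and save are assumptions for this example. The calculation lives in a reactive expression, and the side effect lives in an observer that calls it.

```r
library(shiny)

out_file <- tempfile(fileext = ".csv")  # hypothetical output location

server <- function(input, output, session) {
  # Calculation: no side effects, safe to cache.
  top_rows <- reactive({
    req(input$nrows)
    head(cars, input$nrows)
  })

  # Action: the side effect, run only when the button is pressed.
  observeEvent(input$save, {
    write.csv(top_rows(), out_file, row.names = FALSE)
  })
}

testServer(server, {
  session$setInputs(nrows = 3)
  stopifnot(!file.exists(out_file))  # nothing written yet
  session$setInputs(save = 1)
})

saved <- read.csv(out_file)  # 3 rows: speed and dist
```

Any output that needs the same rows can call top_rows() too, without triggering another write.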

	`reactive()`	`observe()`
Purpose	Calculations	Actions
Side effects?	Forbidden	Allowed

An easy way to remember

Keep your side effects
Outside of your reactives
Or I will kill you

—Joe Cheng

Takeaway

Use reactive expressions for calculations (no side effects). Use observers for actions (side effects).

Reactive values

A reactiveValues object is like an environment object or a named list: it stores name/value pairs. You get and set values using $ or [[name]].

rv <- reactiveValues(a = 10)

rv$a
# [1] 10

rv$a <- 20

rv[["a"]]
# [1] 20

input is one (read-only) example.
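One practical note if you try this at the R console rather than inside an app: by default, reading a reactive value outside of a reactive expression, output, or observer is an error, so wrap the reads in isolate(). Writes are fine as-is. A minimal sketch, with hypothetical variable names:

```r
library(shiny)

rv <- reactiveValues(a = 10)

first <- isolate(rv$a)        # 10
rv$a <- 20                    # setting needs no reactive context
second <- isolate(rv[["a"]])  # 20

# A bare read at the top level is refused by Shiny.
outside <- tryCatch(rv$a, error = function(e) "error")
```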

Exercise 4

Open file Exercise_04.R.

Modify the server function so that when the “rnorm” button is clicked, the plot shows a new batch of rnorm(100) values. When “runif” button is clicked, the plot should show a new batch of runif(100).

You have 15 minutes this time.

Solution

function(input, output, session) {
  v <- reactiveValues(data = runif(100))
  
  observeEvent(input$runif, {
    v$data <- runif(100)
  })
  
  observeEvent(input$rnorm, {
    v$data <- rnorm(100)
  })  
  
  output$plot <- renderPlot({
    hist(v$data)
  })
}

We’ve identified a number of cases where we should use a reactive expression instead of an observe(Event)/reactiveValues pairing. But there are cases where you simply must use the latter.

Essentially, there are cases where inputs, outputs, and reactive expressions aren’t powerful enough to natively express the computations you want to perform. So you have the “escape hatch” of observe/reactiveValues; you can do things that would otherwise be impossible, at the price of your code being harder to reason about and harder for the reactive framework to help you with.

Accumulating values over time, not just reacting to the latest one
Aggregating multiple reactive values/expressions into a single expression
Adding artificial latency into reactive values/expressions

In general, we want to stick to reactive expressions whenever possible. And when we really need to, break out the big guns of observe(Event)/reactiveValues.
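Accumulating values over time is probably the easiest of those cases to sketch. A reactive expression only ever sees the latest input$x, so remembering earlier ones needs state. The example below is a minimal illustration under assumed names (history, total, and an input called x), driven by shiny::testServer().

```r
library(shiny)

server <- function(input, output, session) {
  # State that outlives any single change to input$x.
  history <- reactiveValues(values = numeric())

  # The handler body is isolated, so reading history$values here
  # does not make the observer depend on its own output.
  observeEvent(input$x, {
    history$values <- c(history$values, input$x)
  })

  # Downstream of the escape hatch, plain reactive expressions again.
  total <- reactive(sum(history$values))
}

seen <- NULL
running_total <- NULL

testServer(server, {
  session$setInputs(x = 1)
  session$setInputs(x = 5)
  session$setInputs(x = 2)
  seen <<- history$values     # 1 5 2
  running_total <<- total()   # 8
})
```

Note that the escape hatch is kept as small as possible: one observer writes the state, and everything that uses it goes back to being a reactive expression.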

Takeaway

When necessary, you can use observers and reactive values together to escape the usual limits of reactivity.

Exercise 5: Challenge!

Exercise_05.R contains a broken application. See if you can figure out how to fix it!

Read the comments in the file for more details.

Takeaways

Know the difference between telling Shiny to do something, and telling Shiny how to do something.
Prefer using reactive expressions to model calculations, over using observers to set (reactive) variables.
Seriously, prefer using reactive expressions to model calculations, over using observers to set (reactive) variables.
Use reactive expressions for calculations (no side effects). Use observers for actions (side effects).
When necessary, you can use observers and reactive values together to escape the usual limits of reactivity.
