robert.weyant@gmail.com

Overview

{magrittr} provides four special operators:

  • %>% - pipe operator
  • %T>% - tee operator
  • %$% - exposition operator
  • %<>% - compound assignment pipe operator

Other Special Operators in R

  • %*% - matrix multiplication, x %*% y
  • %in% - value matching
  • %% - modulus operator
  • %o% - outer product, outer()
  • %x% - Kronecker product, kronecker()
  • %/% - integer division
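
As a quick illustration of the base R operators listed above, here is a minimal sketch. The matrix A and vector v are made-up values chosen only for this example; no packages are needed.

```r
# Made-up inputs for illustration only.
A <- matrix(1:4, nrow = 2)          # filled by column: rows are (1, 3) and (2, 4)
v <- c(1, 1)

mat_prod  <- A %*% v                # matrix multiplication: entries 4 and 6
is_member <- 3 %in% c(1, 2, 3)      # value matching: TRUE
remainder <- 7 %% 3                 # modulus: 1
quotient  <- 7 %/% 3                # integer division: 2
outer_tab <- (1:2) %o% (1:3)        # outer product, same as outer(1:2, 1:3)
kron      <- diag(2) %x% A          # Kronecker product, same as kronecker(diag(2), A)
```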

The Problem

R code can get hard to read


sapply(iris[iris$Sepal.Length < mean(iris$Sepal.Length),-5],FUN = mean)

A (Possible) Solution - the pipe %>%

  • Similar to Unix pipe |
  • Code can be written in the order of execution, left to right
  • %>% will "pipe" information from one statement to the next
  • x %>% f is equivalent to f(x)
  • x %>% f(y) is equivalent to f(x,y)
  • x %>% f %>% g %>% h is equivalent to h(g(f(x)))
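
The equivalences above can be checked directly. This is a small sketch, assuming {magrittr} is installed; the vector x and the names r1 to r3 are made up for the example.

```r
library(magrittr)

x <- c(4, 9, 16)                     # made-up input

r1 <- x %>% sqrt                     # same as sqrt(x)
r2 <- x %>% paste("cm")              # same as paste(x, "cm")
r3 <- x %>% sqrt %>% sum %>% round   # same as round(sum(sqrt(x)))
```

Each piped form gives the same value as its nested counterpart, but reads left to right.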

What %>% is doing

%>% takes the output of the left-hand side and passes it as the first argument of the right-hand side, or to wherever a . placeholder appears in the right-hand side's arguments.
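
To see the effect of the . placeholder when it is not in the first position, consider this minimal sketch. It assumes {magrittr} is installed, and the strings are arbitrary.

```r
library(magrittr)

first_arg <- "a" %>% paste("b")       # no dot: becomes paste("a", "b")
dot_arg   <- "a" %>% paste("b", .)    # dot as an argument: becomes paste("b", "a")
```

Here first_arg is "a b" and dot_arg is "b a".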

Example using head(x, ...)

library(magrittr)
mtcars %>% head(.,2)  # same as using head(mtcars,2)
##               mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
## Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
mtcars %>% head(2)     # same as using head(mtcars,2)
##               mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
## Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4

Slightly more complicated example

library(ggplot2)
mtcars %>%
  xtabs(~gear+carb,data=.) %>% 
  as.data.frame %>% 
  ggplot(.,aes(x=gear,y=carb,size=Freq)) +
  geom_point()

Even more complicated example

# Generate some sample data.
df <-
    data.frame(
        Price    = 1:100 %>% sample(replace = TRUE),
        Quantity = 1:10  %>% sample(replace = TRUE),
        Type     =
            0:1 %>%
            sample(replace = TRUE) %>%
            factor(labels = c("Buy", "Sell"))
    ) 

Source

The combination of %>% with {dplyr}

  • filter()
  • summarize()
  • arrange()
  • mutate()
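
A short sketch of how the four verbs above chain together with %>%. It assumes {dplyr} is installed; the column hp_per_wt and the object names are invented for this example.

```r
library(magrittr)
library(dplyr)

fast <- mtcars %>%
  filter(cyl == 4) %>%               # keep only the 4-cylinder cars
  mutate(hp_per_wt = hp / wt) %>%    # add a derived column (hypothetical name)
  arrange(desc(hp_per_wt))           # sort, highest ratio first

summary_row <- fast %>%
  summarize(n = n(), avg_mpg = mean(mpg))   # collapse to a single row
```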

sapply(iris[iris$Sepal.Length < mean(iris$Sepal.Length),-5],FUN = mean)
## Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
##      5.19875      3.13375      2.46250      0.66375
library(dplyr)
iris %>% 
  mutate(avg.length=mean(Sepal.Length)) %>% 
  filter(Sepal.Length<avg.length) %>% 
  select(-Species,-avg.length) %>%
  summarise(across(everything(), mean))  # replaces the deprecated summarise_each(funs(mean))
##   Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1      5.19875     3.13375       2.4625     0.66375

%$% The exposition operator

  • Similar to with() or attach()
  • Useful for functions that don't take a data parameter
  • Can execute several statements by wrapping them in {}
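
The points above can be sketched as follows, assuming {magrittr} is installed. cor() has no data argument for the default method, so it is a natural fit; the names r_single, r_block, and heavy are made up for the example.

```r
library(magrittr)

# Single expression: columns of mtcars are visible by name.
r_single <- mtcars %$% cor(mpg, wt)    # same as with(mtcars, cor(mpg, wt))

# Several statements wrapped in {}; the value of the last one is returned.
r_block <- mtcars %$% {
  heavy <- wt > mean(wt)               # temporary helper, local to the block
  mean(mpg[heavy])
}
```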

library(datasets)
table(CO2$Treatment,CO2$Type)
##             
##              Quebec Mississippi
##   nonchilled     21          21
##   chilled        21          21
# with(CO2,table(Treatment,Type))
CO2 %$% table(Treatment,Type)
##             Type
## Treatment    Quebec Mississippi
##   nonchilled     21          21
##   chilled        21          21

%T>% The Tee Operator

  • Allows a "break" in the pipe.
  • Executes the right-hand side of %T>% for its side effect, then passes the left-hand side value on to the next statement
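
A minimal sketch of the difference between %T>% and %>%, assuming {magrittr} is installed. cat() is used here because it prints its input but returns NULL, which makes the difference visible; the object names are made up.

```r
library(magrittr)

# Tee: cat() prints the vector, then the original vector is passed on to sum().
with_tee <- c(1, 2, 3) %T>% cat("\n") %>% sum

# Plain pipe: sum() receives the return value of cat(), which is NULL.
without_tee <- c(1, 2, 3) %>% cat("\n") %>% sum
```

with_tee is 6, while without_tee is 0 because sum(NULL) is 0.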

iris %>%
  filter(Species != 'virginica') %>% 
  select(Sepal.Width,Sepal.Length) %T>%
  plot %>%  # Make scatterplot and keep going
  colMeans

##  Sepal.Width Sepal.Length 
##        3.099        5.471

%<>% The Compound Assignment Operator

  • Combines a pipe and an assignment operator
  • Think x += z from the C family, Python, Ruby, etc. (or i++ in the C family)
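
In its simplest form, the compound assignment pipe looks like this. The sketch assumes {magrittr} is installed, and the vector x is made up.

```r
library(magrittr)

x <- c(3, 1, 2)
x %<>% sort        # shorthand for x <- x %>% sort
```

After this, x holds the sorted values 1, 2, 3.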

df <- rexp(5,.5) %>% data.frame(col1=.)
df
##        col1
## 1 3.6982916
## 2 0.3231815
## 3 1.0181722
## 4 1.9074081
## 5 1.6484035
df %<>% arrange(col1)
df
##        col1
## 1 0.3231815
## 2 1.0181722
## 3 1.6484035
## 4 1.9074081
## 5 3.6982916

Other things to be aware of

  • RStudio shortcut for %>%: Ctrl+Shift+M (Cmd+Shift+M on macOS)
  • ?%>% does not work, use ?'%>%'
  • df %>% .something. %>% system.time does not work
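
One possible way around the system.time limitation, offered as a sketch rather than the only fix: wrap the whole pipeline in system.time() instead of piping into it. The pipeline and the names timing and result are made up for the example; {magrittr} is assumed to be installed.

```r
library(magrittr)

# Time the entire pipeline; the assignment inside keeps the result as well.
timing <- system.time(
  result <- 1:100000 %>% sqrt %>% sum
)
```

timing then holds the usual user, system, and elapsed times, and result holds the pipeline's value.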

Links

Thank You