My R code is slow

Slides: [jumpingrivers.com]

Who am I

Dr Colin Gillespie
- Originally from the 1990 City of Culture (Glasgow)
- Senior Statistics Lecturer, Newcastle University
- Consultant at Jumping Rivers
- twitter: csgillespie
- linkedin

Jumping Rivers

Statistical and R consultancy
R, Scala, python, & Stan training
Predictive analytics
Dashboard development
Questionnaires

My R code is slow

Use the byte compiler!

Byte compiler

The compiler package has been part of R since version 2.13.0
- It translates R functions into another language that can be interpreted by a very fast interpreter
Since R 2.14.0, all of the standard functions and packages in R are pre-compiled into byte-code

Byte compiler: the mean() function

#> function (x, ...) 
#> UseMethod("mean")
#> <bytecode: 0x73098b8>
#> <environment: namespace:base>

Note the <bytecode: ...> line: it shows that the function has been byte-compiled

Byte compiler

We can compile our own R functions and obtain a byte code version that may run faster.

Example: Bad mean

mean_r = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}

Compiled version

library("compiler")
cmp_mean_r = cmpfun(mean_r)
cmp_mean_r  
#> function(x) {
#>   total = 0
#>   n = length(x)
#>   for(i in 1:n)
#>     total = total + x[i]/n
#>   total
#> }
#> <bytecode: 0x5608fa8>

Benchmarks

# Generate some data
x = rnorm(1000)
microbenchmark::microbenchmark(times = 10, unit = "ms", # milliseconds
          mean_r(x), cmp_mean_r(x), mean(x))
#> Unit: milliseconds
#>           expr   min    lq  mean median    uq  max neval cld
#>      mean_r(x) 0.358 0.361 0.370  0.363 0.367 0.43    10   c
#>  cmp_mean_r(x) 0.050 0.051 0.052  0.051 0.051 0.07    10  b 
#>        mean(x) 0.005 0.005 0.008  0.007 0.008 0.03    10 a
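
Before comparing timings it is worth confirming that the three versions return the same answer. The following is a minimal sketch; the seed is an arbitrary choice for reproducibility and is not part of the original slides.

```r
# Same deliberately naive mean as in the slides
mean_r = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}
cmp_mean_r = compiler::cmpfun(mean_r)

set.seed(1)  # arbitrary seed, assumption for reproducibility
x = rnorm(1000)

# The results should agree up to floating point error
same_as_base = isTRUE(all.equal(mean_r(x), mean(x)))
same_compiled = isTRUE(all.equal(mean_r(x), cmp_mean_r(x)))
c(same_as_base, same_compiled)
```

The comparison uses all.equal() rather than identical() because mean() accumulates the sum differently from the loop, so the last few digits may differ.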

Compiling code

There are a number of ways to compile code.

Compile individual functions using cmpfun()
Enable just-in-time (JIT) compilation
- At the top of your R code add compiler::enableJIT(N), where \(N\) indicates the level of optimisation (\(0\) to \(3\))
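
As a rough sketch, JIT compilation can be switched on with enableJIT() from the compiler package. The choice of level 3 and the variable name old_level are illustrative assumptions.

```r
library("compiler")

# enableJIT() returns the previous JIT level, so it can be restored later
old_level = enableJIT(3)  # 3 is the highest optimisation level

# ... code that should run with JIT compilation goes here ...

# Restore whatever level was active before
enableJIT(old_level)
```

On R 3.4.0 and later the JIT compiler is already enabled at level 3 by default, so this call mainly matters on older versions or when JIT has been switched off.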

Compiling code

If you create a package, you can have it byte-compiled automatically on installation by adding ByteCompile: true to the DESCRIPTION file
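
The field in question is ByteCompile. One way to add it programmatically is sketched below, using a throwaway DESCRIPTION file; the package name and version are made up for illustration, and in practice you would simply edit the file by hand.

```r
# Create a minimal, hypothetical DESCRIPTION file in a temporary location
desc_path = tempfile()
writeLines(c("Package: mypkg", "Version: 0.1.0"), desc_path)

# Read the fields, add ByteCompile, and write the file back
desc = read.dcf(desc_path)
desc = cbind(desc, ByteCompile = "true")
write.dcf(desc, desc_path)

desc_lines = readLines(desc_path)
desc_lines
```

On R 3.5.0 and later, packages are byte-compiled on installation by default, so the field mostly matters for older versions of R.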

Most R packages installed using install.packages() are not compiled
- We can force packages to be compiled by starting R with the environment variable R_COMPILE_PKGS
- Add R_COMPILE_PKGS=3 to ~/.Renviron

Compiling packages

## Windows users need Rtools
install.packages("ggplot2", 
                 type = "source", 
                 INSTALL_opts = "--byte-compile")

My R code is slow

Change your BLAS library

Basic Linear Algebra Subprograms (BLAS)

R uses BLAS for linear algebra operations
- Anything involving matrices
By switching to a different BLAS library, it may be possible to speed-up your R code.
- Easy for Linux/Apple, but can be tricky for Windows users
Two open source alternative BLAS libraries are ATLAS and OpenBLAS.
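
To see which BLAS your session is actually using, and to get a feel for matrix speed before and after switching, something like the following may help. It is a sketch that assumes a reasonably recent version of R, where extSoftVersion() reports the BLAS in use and La_library() is available; the matrix size is arbitrary.

```r
# Which BLAS/LAPACK libraries is this R session linked against?
blas_path = unname(extSoftVersion()["BLAS"])
lapack_path = La_library()
print(blas_path)
print(lapack_path)

# A rough matrix timing to repeat after changing the BLAS library
set.seed(1)
m = matrix(rnorm(500 * 500), nrow = 500)
timing = system.time(crossprod(m))
timing["elapsed"]
```

Running the same timing under the reference BLAS and under OpenBLAS or ATLAS gives a quick indication of whether the switch is worthwhile for your workload.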

Issues

ATLAS and OpenBLAS use multiple cores
- Occasionally this can be a problem (if it's embedded in code that already runs in parallel)
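
One possible workaround, not covered in the slides, is to restrict the BLAS to a single thread before launching parallel workers. The sketch below assumes the RhpcBLASctl package from CRAN; limit_blas_threads() is a hypothetical helper name, and the function does nothing if the package is not installed.

```r
# Hypothetical helper: limit the BLAS thread count if RhpcBLASctl is available
limit_blas_threads = function(n_threads = 1) {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    message("RhpcBLASctl not installed; BLAS thread count left unchanged")
    return(invisible(FALSE))
  }
  RhpcBLASctl::blas_set_num_threads(n_threads)
  invisible(TRUE)
}

# Call this before starting your own parallel code
limited = limit_blas_threads(1)
```

The helper returns TRUE when the thread count was set and FALSE otherwise, so calling code can decide whether to warn the user.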

A quick break

install.packages("benchmarkme")

My R code is slow

Buy a better computer!

Or should you?

`benchmarkme`

library("benchmarkme")

get_ram()
#> 16.3 GB
get_cpu()
#> $vendor_id
#> [1] "GenuineIntel"
#> 
#> $model_name
#> [1] "Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz"
#> 
#> $no_of_cores
#> [1] 8

`benchmarkme`

library("benchmarkme") ## On CRAN
## Tests based on a script by
## Simon Urbanek & Douglas Bates
res = benchmark_std(runs = 3)

`benchmarkme`

library("benchmarkme")
res = benchmark_std(runs = 2)
# # Programming benchmarks (5 tests):
#     3,500,000 Fibonacci numbers calculation (vector calc): 0.52 (sec).
#     Grand common divisors of 1,000,000 pairs (recursion): 0.965 (sec).
#     Creation of a 3500x3500 Hilbert matrix (matrix calc): 0.306 (sec).
#     Creation of a 3000x3000 Toeplitz matrix (loops): 11.5 (sec).
#     Escoufier's method on a 60x60 matrix (mixed): 1.17 (sec).
# # Matrix calculation benchmarks (5 tests):
#    Creation, transp., deformation of a 5000x5000 matrix: 0.794 (sec).
#    2500x2500 normal distributed random matrix ^1000: 0.522 (sec).
#    Sorting of 7,000,000 random values: 0.598 (sec).
#    2500x2500 cross-product matrix (b = a' * a): 6.56 (sec).
#    Linear regr. over a 3000x3000 matrix (c = a \ b'): 4.5 (sec).
# # Matrix function benchmarks (5 tests):

`benchmarkme`

# Upload results +
# RAM, CPU, 
# OS, byte-compile, BLAS
upload_results(res)

`benchmarkme`

plot(res)

Uploaded results

Hardware: RAM

Results: Programming benchmarks

And the winner is….

Results: Matrix benchmarks

Intel CPU Differences (relative times)

Input/output `benchmark_io`

Adding benchmarkme to your package

upload_results() takes a five-column matrix
- Columns 1 to 3: system.time() output
- Columns 4 and 5: benchmark labels
Easy to add to your own package
- Results will be automatically incorporated in future benchmarkme releases
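
A results object of that shape could be built roughly as follows. This is a sketch only: time_my_test() is a hypothetical helper, the column names (user, system, elapsed, test, test_group) are assumptions, and a data frame is used because an R matrix cannot hold numbers and labels together without coercion. Check the benchmarkme documentation for the exact structure upload_results() expects before uploading anything.

```r
# Hypothetical helper: time a function several times and label the results
time_my_test = function(f, runs = 3, test = "my_test", test_group = "my_pkg") {
  # Each run contributes the first three system.time() values:
  # user, system and elapsed time
  timings = t(replicate(runs, system.time(f())[1:3]))
  data.frame(user = unname(timings[, 1]),
             system = unname(timings[, 2]),
             elapsed = unname(timings[, 3]),
             test = test,
             test_group = test_group,
             stringsAsFactors = FALSE)
}

# Example: time sorting 100,000 random numbers, twice
res = time_my_test(function() sort(runif(1e5)), runs = 2)
res
```

Each row is one run, so repeated runs of the same test share the same labels in the last two columns.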

Summary

Upgrade hardware
Byte-compiling and BLAS are easy optimisations
- No-one byte compiles!
Network drives are slow
