Slides: [jumpingrivers.com]

Who am I

• Dr Colin Gillespie

Jumping Rivers

• Statistical and R consultancy
• R, Scala, python, & Stan training
• Predictive analytics
• Dashboard development
• Questionnaires

Byte compiler

• The compiler package has been part of R since version 2.13.0
• It translates R functions into byte code, a lower-level representation that can be executed by a very fast interpreter
• Since R 2.14.0, all of the standard functions and packages in R are pre-compiled into byte code

Byte compiler: the mean() function

#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x73098b8>
#> <environment: namespace:base>

Note the <bytecode: ...> line: it shows that mean() has been byte-compiled
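A rough way to look for that line programmatically is to capture the printed function and search for the <bytecode marker. This is only a sketch: the helper name shows_bytecode is made up for illustration and is not part of any package.

```r
# Hypothetical helper: TRUE if printing f shows a <bytecode: ...> line
shows_bytecode = function(f) {
  any(grepl("<bytecode", capture.output(print(f)), fixed = TRUE))
}

# Base R functions such as mean() are pre-compiled
shows_bytecode(mean)

# A function we compile ourselves also carries the marker
f = function(x) x + 1
shows_bytecode(compiler::cmpfun(f))
```

A function you have just defined will usually not show the marker until it has been compiled, either explicitly or by the JIT.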

Byte compiler

• We can compile our own R functions to obtain a byte-code version that may run faster.

mean_r = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}

Compiled version

library("compiler")
cmp_mean_r = cmpfun(mean_r)
cmp_mean_r
#> function(x) {
#>   total = 0
#>   n = length(x)
#>   for(i in 1:n)
#>     total = total + x[i]/n
#>   total
#> }
#> <bytecode: 0x5608fa8>
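Compilation should change the speed, not the answer. A minimal sanity check, assuming the same mean_r() as above, is to compare the two versions on some test data:

```r
library("compiler")

mean_r = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}
cmp_mean_r = cmpfun(mean_r)

x = rnorm(1000)
# The compiled and uncompiled versions should give the same value
all.equal(mean_r(x), cmp_mean_r(x))
# Agreement with mean() is only up to floating-point rounding
all.equal(cmp_mean_r(x), mean(x))
```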

Benchmarks

# Generate some data
x = rnorm(1000)
microbenchmark::microbenchmark(times = 10, unit = "ms", # milliseconds
  mean_r(x), cmp_mean_r(x), mean(x))
#> Unit: milliseconds
#>           expr   min    lq  mean median    uq  max neval cld
#>      mean_r(x) 0.358 0.361 0.370  0.363 0.367 0.43    10   c
#>  cmp_mean_r(x) 0.050 0.051 0.052  0.051 0.051 0.07    10  b
#>        mean(x) 0.005 0.005 0.008  0.007 0.008 0.03    10 a  

Compiling code

There are a number of ways to compile code.

• Compile individual functions using cmpfun()
• Enable just-in-time (JIT) compilation with compiler::enableJIT(N), where $$N$$ indicates the level of optimisation ($$0$$ to $$3$$)
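A short sketch of switching the JIT level on and off with enableJIT() from the compiler package. The variable name old_level is just for illustration. Recent versions of R typically have the JIT enabled already, so the call may not change anything.

```r
library("compiler")

# enableJIT() returns the previous level, so it can be restored later
old_level = enableJIT(3)

# ... run the code you want compiled on the fly ...

# Restore the earlier setting
enableJIT(old_level)

# A negative value reports the current level without changing it
enableJIT(-1)
```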

Compiling code

• If you create a package, you can have it byte-compiled automatically on installation by adding ByteCompile: true to the DESCRIPTION file

• Most R packages installed using install.packages() are not compiled (this was the default before R 3.5.0; later versions byte-compile packages on installation)
• We can force packages to be compiled by starting R with the environment variable R_COMPILE_PKGS set
• Add R_COMPILE_PKGS=3 to ~/.Renviron

Compiling packages

## Windows users need Rtools
install.packages("ggplot2",
                 type = "source",
                 INSTALL_opts = "--byte-compile")

Basic Linear Algebra Subprograms (BLAS)

• R uses BLAS for linear algebra operations
• Anything involving matrices
• By switching to a different BLAS library, it may be possible to speed up your R code.
• Easy for Linux/Apple, but can be tricky for Windows users
• Two open source alternative BLAS libraries are ATLAS and OpenBLAS.
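Before switching, it helps to know which BLAS the current session uses and to have a timing to compare against. A minimal sketch: sessionInfo() reports the BLAS and LAPACK paths in R 3.4.0 and later (the entries may be missing on older versions), and the matrix size of 500 is an arbitrary choice.

```r
# Which BLAS/LAPACK libraries is this session linked against?
si = sessionInfo()
si$BLAS
si$LAPACK

# A rough timing dominated by BLAS work: the cross-product t(m) %*% m
set.seed(1)
m = matrix(rnorm(500 * 500), nrow = 500)
# <- is needed here: inside a call, = would be read as an argument name
system.time(cp <- crossprod(m))
dim(cp)
```

Running the same timing after changing the BLAS library gives a like-for-like comparison on your own machine.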

Issues

• ATLAS and OpenBLAS use multiple cores
• Occasionally this can be a problem (for example, when the code is already running inside a parallel job, so the BLAS threads compete for the same cores)
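One possible workaround is to restrict the BLAS library to a single thread before starting a parallel job. The sketch below assumes the optional RhpcBLASctl package from CRAN, which is not mentioned in these slides. It does nothing if the package is missing, and it may have no effect with a single-threaded BLAS.

```r
# Assumption: RhpcBLASctl is an optional extra, so check for it first
has_ctl = requireNamespace("RhpcBLASctl", quietly = TRUE)

if (has_ctl) {
  # Number of processors the BLAS library reports as available
  RhpcBLASctl::blas_get_num_procs()
  # Restrict BLAS to one thread in this R process
  RhpcBLASctl::blas_set_num_threads(1)
}
```

When workers run as separate R processes, the same call would need to be made inside each worker.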

A quick break

install.packages("benchmarkme")

benchmarkme

library("benchmarkme")
get_ram()
#> 16.3 GB
get_cpu()
#> $vendor_id
#> [1] "GenuineIntel"
#>
#> $model_name
#> [1] "Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz"
#>
#> $no_of_cores
#> [1] 8

benchmarkme

library("benchmarkme") ## On CRAN
## Tests based on a script by
## Simon Urbanek & Douglas Bates
res = benchmark_std(runs = 3)

benchmarkme

library("benchmarkme")
res = benchmark_std(runs = 2)
# # Programming benchmarks (5 tests):
#     3,500,000 Fibonacci numbers calculation (vector calc): 0.52 (sec).
#     Grand common divisors of 1,000,000 pairs (recursion): 0.965 (sec).
#     Creation of a 3500x3500 Hilbert matrix (matrix calc): 0.306 (sec).
#     Creation of a 3000x3000 Toeplitz matrix (loops): 11.5 (sec).
#     Escoufier's method on a 60x60 matrix (mixed): 1.17 (sec).
# # Matrix calculation benchmarks (5 tests):
#    Creation, transp., deformation of a 5000x5000 matrix: 0.794 (sec).
#    2500x2500 normal distributed random matrix ^1000: 0.522 (sec).
#    Sorting of 7,000,000 random values: 0.598 (sec).
#    2500x2500 cross-product matrix (b = a' * a): 6.56 (sec).
#    Linear regr. over a 3000x3000 matrix (c = a \ b'): 4.5 (sec).
# # Matrix function benchmarks (5 tests):

benchmarkme

# Upload results +
# RAM, CPU,
# OS, byte-compile, BLAS
upload_results(res)

benchmarkme

plot(res)

Input/output benchmark_io

• upload_results() takes a five-column matrix
• Columns 1 to 3: system.time output
• Columns 4 & 5 are benchmark labels
• Results will be automatically incorporated into future benchmarkme releases
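To make the five-column layout concrete, here is a sketch that times a toy task and arranges the result in that shape. A data frame stands in for the matrix, because the columns mix numbers and text. The column names and the labels "sqrt_loop" and "custom" are assumptions for illustration, not taken from the package.

```r
# Time a toy task; system.time() reports user, system and elapsed times
timing = system.time(for(i in 1:1e5) sqrt(i))

# Columns 1 to 3: timings; columns 4 & 5: benchmark labels (names assumed)
res_custom = data.frame(
  user = timing[["user.self"]],
  system = timing[["sys.self"]],
  elapsed = timing[["elapsed"]],
  test = "sqrt_loop",
  test_group = "custom",
  stringsAsFactors = FALSE
)
res_custom
```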