Speed
not always worth speeding up, but sometimes makes a huge difference
standard routines are general purpose; one size doesn't fit all
Flexibility
- reuse existing routines
Just for Fun?
- What is wrong with me?
11 November, 2015
Rcpp interface is clean, easy to use
if you can write R, you can write Rcpp
Please note: the GPL applies to your code!
rJava is Literally Just Writing R
Packaging System in R Makes This Stuff Easy and Fun
convention over configuration
Example Code
#include <R.h>
#include <Rinternals.h>

SEXP add(SEXP a, SEXP b) {
  SEXP result = PROTECT(allocVector(REALSXP, 1));
  REAL(result)[0] = asReal(a) + asReal(b);
  UNPROTECT(1);
  return result;
}
add <- function(a, b) { .Call("add", a, b) }
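To make this concrete, here is a minimal sketch of compiling and loading it by hand; the file name add.c and the resulting add.so are assumptions for the example.

# Assuming the C code above is saved as add.c, build a shared library from the shell:
#   R CMD SHLIB add.c        # produces add.so (add.dll on Windows)
dyn.load("add.so")           # load the compiled code into this R session
add(2, 3)                    # call it through the wrapper above; returns 5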
R is garbage collected, hence the PROTECT/UNPROTECT bookkeeping above
the C interface talks in SEXPs, which are actually different underlying objects for each R type
require(pryr)
## Loading required package: pryr
inspect(c(1,2,3))
## <REALSXP 0xceeaf8>
Lots of code to negotiate, coerce, etc.
error prone, boilerplate stuff!
and this is a simple case…
Very natural C++ interface to R
uses a type registry and some "sugar" to encapsulate the details and give the developer a better experience
C++ operator overloading
improved interface hides type conversion, boilerplate code
sugar access to R-like functions
probably reduces mistakes
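For example, a minimal sketch using cppFunction(); the function name demean and the toy data are made up for illustration.

library(Rcpp)
cppFunction('
NumericVector demean(NumericVector x) {
  // sugar: mean() and vector-scalar subtraction read like R, but compile to C++
  return x - mean(x);
}')
demean(c(1, 2, 3, 6))   # -2 -1  0  3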
Time-domain valid-region convolution: 1D filtering as a dot product slid along the signal
\[(s*k)[n]=\sum_{m=0}^{M} s[n + m]\,k[M - m]\]
This means that for two signals of length N, we have N multiplies and N - 1 adds for each output index n, performed on the order of N times
\[O(N^2)\]
Filtering in the Frequency Domain
\[O(N \log N)\]
convolution in the time domain is multiplication in the frequency domain: FFT both signals and multiply pointwise
then IFFT the product to get back to the time domain
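A quick sketch of that equivalence in plain R, on made-up toy signals:

s <- c(1, 2, 3, 4); k <- c(1, 1, 0.5)
N <- length(s) + length(k) - 1                  # full (linear) convolution length
# direct time-domain convolution
direct <- numeric(N)
for (n in seq_len(N)) {
  for (m in seq_along(k)) {
    i <- n - m + 1
    if (i >= 1 && i <= length(s)) direct[n] <- direct[n] + s[i] * k[m]
  }
}
# frequency-domain route: zero-pad, FFT, multiply, IFFT (R's inverse fft is unnormalized)
viaFFT <- Re(fft(fft(c(s, rep(0, N - length(s)))) *
                 fft(c(k, rep(0, N - length(k)))), inverse = TRUE)) / N
all.equal(direct, viaFFT)                       # TRUE, up to floating point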
First two images, aligned
typically done with cross-correlation of micro-patches
followed by warping
Often we have a signal of length T and a kernel of length K, with K shorter than T
Fastest Execution Depends
in the time domain we now have roughly T * K multiplies and T * (K - 1) adds
in the frequency domain, we need the two signals zero-padded to length N = T + K - 1
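As a rough worked comparison (the numbers here are illustrative): take T = 5000 and K = 50. The time domain costs about
\[T \cdot K = 5000 \cdot 50 = 250{,}000 \text{ multiplies}\]
while the frequency domain needs three FFTs of length N >= T + K - 1 = 5049 (padded to 8192) plus a pointwise complex multiply, roughly
\[3 \cdot N \log_2 N \approx 3 \cdot 8192 \cdot 13 \approx 320{,}000 \text{ operations}\]
so the two are in the same ballpark, and constants, padding, and memory traffic decide the winner.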
Affected by
lengths of signals (changes number of flops)
memory allocation and movement
convolve(signalA, signalB, type='filter')
implemented (always) as an FFT
also, what about R itself: does it pay a price for its flexibility?
inline directly with cfunction()
getting your hands dirty is easy, but also cumbersome
sourceCpp()
source an entire C++ file
compileAttributes(pathToPackage)
turns "sugar" into gold!
Rcpp.package.skeleton(name = "anRpackage", list = character(),
    environment = .GlobalEnv, path = ".", force = FALSE,
    code_files = character(), cpp_files = character(),
    example_code = TRUE, attributes = TRUE, module = FALSE,
    author = "Who wrote it",
    maintainer = if (missing(author)) "Who to complain to" else author,
    email = "yourfault@somewhere.net",
    license = "What Licence is it under ?")
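In practice you usually supply only a few of those arguments; a minimal call might look like this (the package name and file list are placeholders):

library(Rcpp)
Rcpp.package.skeleton("convotron", cpp_files = c("convotron.cpp"),
                      example_code = FALSE, attributes = TRUE)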
Generate a package
tweak meta-data, code
call compileAttributes()
require(Rcpp)
compileAttributes('./convotron')
//' Valid region cross-correlation of a vector with another vector.
//' This is equivalent to convolve(x, y, type='filter')
//'
//' @param x first input signal (length(x) >= length(y))
//' @param y second input signal (length(y) < length(x))
//' @return a vector that is of width length(x) - length(y) + 1
// [[Rcpp::export]]
NumericVector ctron_xcorr1(const NumericVector& x, const NumericVector& y) {
  int iT = x.size();
  int fT = y.size();
  // valid-region cross-correlation output
  NumericVector z(iT - fT + 1, 0.);
  int oT = z.size();
  for (int i = 0; i < oT; ++i) {
    for (int j = 0; j < fT; ++j) {
      z[i] += x[i + j] * y[j];
    }
  }
  return z;
}
//' Valid region cross-correlation of a vector with another vector using fftw.
//' This is equivalent to convolve(x, y, type='filter') when corr=TRUE
//'
//' @param x first input signal (length(x) >= length(y))
//' @param y second input signal (length(y) < length(x))
//' @return a vector that is of width length(x) - length(y) + 1
// [[Rcpp::export]]
NumericVector ctron_fftfilt1(const NumericVector& x, const NumericVector& y, bool corr) {
  int xsz = x.size();
  int ysz = y.size();
  // fft size
  int wide = nextPowerOf2(xsz + ysz - 1);
  // valid-region convolution/cross-correlation size
  int narrow = xsz - ysz + 1;
  NumericVector z(narrow);
  fftw_plan p;
  // Buffers for FFTs
  fftw_complex* xwide = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * wide);
  fftw_complex* ywide = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * wide);

  int mn = std::min<int>(xsz, ysz);
  // zero out xwide, ywide
  ...
  // copy x into a complex array
  for (int i = 0; i < xsz; ++i) { ... }
  // copy y into a complex array, forward or backward?
  if (corr) { ... } else { ... }
  fftfilt1(wide, p, xwide, ywide, corr);
  // cleanup
  ...
// 1D FFT filtering
void fftfilt1(int wide, fftw_plan& p, fftw_complex* xwide, fftw_complex* ywide, bool corr) {
  // FFTs
  p = fftw_plan_dft_1d(wide, xwide, xwide, FFTW_FORWARD, FFTW_ESTIMATE);
  fftw_execute(p);
  fftw_execute_dft(p, ywide, ywide);
  // conj, followed by complex multiply
  for (int i = 0; i < wide; i++) {
    ywide[i][1] = -ywide[i][1];
    double xwr = xwide[i][0];
    double xwi = xwide[i][1];
    xwide[i][0] = xwr * ywide[i][0] - xwi * ywide[i][1];
    xwide[i][1] = xwr * ywide[i][1] + xwi * ywide[i][0];
  }
  // IFFT
  p = fftw_plan_dft_1d(wide, xwide, xwide, FFTW_BACKWARD, FFTW_ESTIMATE);
  fftw_execute(p);
}
Generates wrappers
// This file was generated by Rcpp::compileAttributes
// Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#include <Rcpp.h>

using namespace Rcpp;

// ctron_xcorr1
NumericVector ctron_xcorr1(const NumericVector& x, const NumericVector& y);
RcppExport SEXP convotron_ctron_xcorr1(SEXP xSEXP, SEXP ySEXP) {
BEGIN_RCPP
    Rcpp::RObject __result;
    Rcpp::RNGScope __rngScope;
    Rcpp::traits::input_parameter< const NumericVector& >::type x(xSEXP);
    Rcpp::traits::input_parameter< const NumericVector& >::type y(ySEXP);
    __result = Rcpp::wrap(ctron_xcorr1(x, y));
    return __result;
END_RCPP
}
// ctron_conv1
...
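compileAttributes() also writes the matching R-level wrapper into R/RcppExports.R; for this function it looks roughly like the following (the exact form varies with the Rcpp version):

# This file was generated by Rcpp::compileAttributes
ctron_xcorr1 <- function(x, y) {
    .Call('convotron_ctron_xcorr1', PACKAGE = 'convotron', x, y)
}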
Using external libs?
here fftw3, so create src/Makevars and src/Makevars.win
now for the usual suspects: benchmarks
A length 3000 (60 rows of length 50), B length 300 (60 rows of length 5)
> A = matrix(runif(3000)*2-1, 60, 50)
> B = matrix(runif(300)*2-1, 60, 5)
> benchmark(loop.convR(A,B), loop.xcorr1(A,B), loop.fftfilt1(A,B), native.xcorr1(A,B), native.fftfilt1(A,B), replications=1000)
                   test replications elapsed relative user.self sys.self user.child sys.child
1      loop.convR(A, B)         1000   1.655    66.20     1.652    0.000          0         0
3   loop.fftfilt1(A, B)         1000   2.166    86.64     2.136    0.024          0         0
2     loop.xcorr1(A, B)         1000   0.297    11.88     0.296    0.000          0         0
5 native.fftfilt1(A, B)         1000   1.772    70.88     1.740    0.028          0         0
4   native.xcorr1(A, B)         1000   0.025     1.00     0.024    0.000          0         0
Native cross-correlation wins
A length 30k (6 rows of length 5000), B length 300 (6 rows of length 50)
> A = matrix(runif(30000)*2-1, 6, 5000)
> B = matrix(runif(300)*2-1, 6, 50)
> benchmark(loop.convR(A,B), loop.xcorr1(A,B), loop.fftfilt1(A,B), native.xcorr1(A,B), native.fftfilt1(A,B), replications=1000)
                   test replications elapsed relative user.self sys.self user.child sys.child
1      loop.convR(A, B)         1000   4.245    3.324     4.236    0.000          0         0
3   loop.fftfilt1(A, B)         1000   2.067    1.619     2.060    0.004          0         0
2     loop.xcorr1(A, B)         1000   2.037    1.595     2.036    0.000          0         0
5 native.fftfilt1(A, B)         1000   1.277    1.000     1.272    0.000          0         0
4   native.xcorr1(A, B)         1000   1.405    1.100     1.376    0.024          0         0
Native FFT wins
2D convolution with image A, size 60x50, kernel 5x6
> A = matrix(runif(3000)*2-1, 60, 50)
> B = matrix(runif(30)*2-1, 5, 6)
> benchmark(conv2(A,B), ctron_conv2mx(A, B), replications=1000)
                 test replications elapsed relative user.self sys.self user.child sys.child
1         conv2(A, B)         1000   0.340    4.928     0.340        0          0         0
2 ctron_conv2mx(A, B)         1000   0.069    1.000     0.068        0          0         0
Native time domain 2D conv wins
Search System
An inverted index makes searching text fast; see the toy sketch below
Constant-time lookup from word to documents
If you are scanning instead, it's a linear traversal of all docs
Also provides powerful, expressive grammars
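A toy inverted index in plain R, just to illustrate the idea (this is not how Lucene stores it):

docs <- c("the cat sat", "the dog ran", "a cat ran")
index <- list()
for (i in seq_along(docs)) {
  for (w in strsplit(docs[i], " ")[[1]]) {
    index[[w]] <- union(index[[w]], i)    # word -> vector of doc ids
  }
}
index[["cat"]]   # 1 3: straight to the matching docs, no scan
index[["ran"]]   # 2 3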
So I thought…
Achtung! Lucene's API is a bit thorny
General Flow for Indexing
Create an Analyzer (for tokenization of your doc)
Create an IndexWriter
For each row in our frame: create a Document, add a Field for each column we want to index, and addDocument() it to the writer
General Flow for Searching
Create an IndexSearcher and Query
Get results back and "hydrate"
Package or stand-alone
I recommend package!
Need Java installed first
Grab your dependency jars
dpressel@dpressel:~/dev/work/lucifR$ ls -R
.:
DESCRIPTION  inst  NAMESPACE  R  README.md

./inst:
java

./inst/java:
lucene-analyzers-common-5.3.1.jar  lucene-core-5.3.1.jar  lucene-queryparser-5.3.1.jar

./R:
lucifR.index.R     lucifR.newAnalyzer.R  lucifR.newSearcher.R  lucifR.search.R  onLoad.R
lucifR-internal.R  lucifR.newQuery.R     lucifR.queryParser.R  lucifR.tokr.R
# Runs when the package is loaded; .jpackage() starts the JVM and wires up the classpath
.onLoad <- function(libname, pkgname) {
  .jpackage(pkgname, lib.loc = libname)
}
.jpackage() finds your inst/java directory and adds the classpath dependencies automatically
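A tiny rJava warm-up outside of Lucene, just to show the moving parts (the string example is made up):

library(rJava)
.jinit()                                   # start the JVM (a package's .onLoad/.jpackage does this for you)
s <- .jnew("java/lang/String", "hello")    # instantiate a Java object
s$toUpperCase()                            # call an instance method via the high-level $ interface
J("java/lang/Integer")$parseInt("42")      # J() accesses a class; call a static method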
To do Lucene stuff, we need an Analyzer
# Default our argument to the path style name for the Java class
lucifR.newAnalyzer <- function(byName="org/apache/lucene/analysis/standard/StandardAnalyzer") {
  # J accesses the class
  StandardAnalyzer = J(byName);
  # new(X) allocates an X
  analyzer = new(StandardAnalyzer);
}
lucifR.tokr <- function(an, str) {
  CharTermAttribute = J("org/apache/lucene/analysis/tokenattributes/CharTermAttribute");
  # We call class methods like this!
  ts = an$tokenStream("dontcare", str);
  charTermAttribute = ts$addAttribute(CharTermAttribute$class);
  ts$reset();
  toks = c();
  # Traverse tokens
  while (ts$incrementToken()) {
    # And get the values out and put them in a list
    rstr = as.character(charTermAttribute$toString())
    toks = c(toks, rstr);
  }
  ts$end();
  ts$close();
  return(toks)
}
require(lucifR)
## Loading required package: lucifR ## Loading required package: rJava
an = lucifR.newAnalyzer() lucifR.tokr(an, "I like cats. They are pretty awesome!")
## [1] "i" "like" "cats" "pretty" "awesome"
Right??
Ok, on to the hard stuff!
Notice the hard part is navigating Lucene, not rJava!
lucifR.index <- function(aframe, dirname, fnames=names(aframe), cb=NULL) {
  ...
  # Just to show how to use .jnew instead of new(X)
  jfs = .jnew("java/io/File", dirname)
  jpath = jfs$toPath()
  iwc = new(IndexWriterConfig, analyzer)
  iwc$setOpenMode(IndexWriterConfig$OpenMode$CREATE);
  jdir = FSDirectory$open(jpath);
  writer = new(IndexWriter, jdir, iwc);
  # Go through all rows in a frame indexing any requested fname (column)
  invisible(sapply(1:nrow(aframe), function(i) {
    doc = new(Document);
    for (name in fnames) {
      val = aframe[i,][[name]]
      tf = new(TextField, name, as.character(val), Field$Store$YES);
      doc$add(tf);
    }
    doc$add(new(IntField, "docid", i, Field$Store$YES));
    writer$addDocument(doc);
    ...
  }));
  writer$close();
}
lucifR.search <- function(aframe, searcher, q, limit=10) {
  # rJava needs this to be the base class, not the derived one, otherwise the method won't be found
  qq = .jcast(q, new.class="org/apache/lucene/search/Query", check=TRUE)
  # Explicitly give the sig of this method, passing the cast query and explicitly coercing limit to int
  top = .jcall(searcher, "Lorg/apache/lucene/search/TopDocs;", "search", qq, as.integer(limit))
  hits = top$scoreDocs
  # Initialize an empty table with the same names as aframe, plus a score
  df = read.table(text="", col.names=c(names(aframe), "score"))
  for (hit in hits) {
    doc = searcher$doc(hit$doc);
    # Get the row back in the frame -- we stored this in the index
    docid = as.integer(doc$get("docid"));
    # bind the score to the aframe row, and append that into df
    df <- rbind(df, cbind(aframe[docid,], hit$score))
  }
  return (df)
}
require(lucifR)
p = 'Apple-Twitter-Sentiment-DFE.csv'
F = read.csv(p)
lucifR.index(F, 'apple-idx', c('text'))
searcher = lucifR.newSearcher('apple-idx')
qp = lucifR.queryParser(c('text'), lucifR.newAnalyzer())
q = lucifR.newQuery(qp, 'garbage OR crap OR "stopped working"')
hits = lucifR.search(F, searcher, q)
print(hits$text[1:5])
## [1] @SuzanneBoyd @Apple it stopped working? I would be screaming. I'm getting mine 12/24.
## [2] Happy Monday! My camera on my fancy @Apple #iPhone6Plus suddenly stopped working this weekend, so instead - I meme. http://t.co/ktImoxWESr
## [3] Happy Monday! My camera on my fancy @Apple #iPhone6Plus suddenly stopped working this weekend, so instead - I meme. http://t.co/XdySgcxUuv
## [4] Happy Monday! My camera on my fancy @Apple #iPhone6Plus suddenly stopped working this weekend, so instead - I meme. http://t.co/Mb067jv5SD
## [5] Happy Monday! My camera on my fancy @Apple #iPhone6Plus suddenly stopped working this weekend, so instead - I meme. http://t.co/lp9KPujmPg
## 3217 Levels: 0144 : #Sick : Being sick has a perk. Catching up on a million shows with @apple TV. ...
In the code, I demonstrate both the lower-level API and reflection
Reflected code is almost like Java!
Reflected methods are slower; avoid them where speed is a concern
Sometimes reflection can goof up, giving us the wrong method signature
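A small sketch of the difference, on a throwaway Java string:

library(rJava)
.jinit()
s <- .jnew("java/lang/String", "hello")
s$length()                   # reflected call: convenient, resolved at run time
.jcall(s, "I", "length")     # explicit JNI signature ("I" = int): no reflection lookup, faster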
R is "Good at" Accessing Other Languages
Good Reasons to Access Other Languages
Speed
Flexibility
Boredom