“The most exciting phrase to hear in science, the one that heralds the most discoveries, is not ‘Eureka!’ (I found it!) but ‘That’s funny…’” - Isaac Asimov
As your eyes get tired, you will subconsciously avoid reading details in code that might contain errors.
Code can be thought of as a language, literally. Programming languages have vocabulary, syntax, grammar, pragmatics, etymology, cultural conventions, and even word roots (prefixes and suffixes), just like natural languages.
It can be helpful to interpret programming semantics as their analogous counterparts in language.
Programming | Language |
---|---|
Scripts | Essays |
Sections | Paragraphs |
Line Breaks | Sentences |
Parentheses | Punctuation |
Functions | Verbs |
Classes | Adjectives |
Variables | Nouns |
Even your variable names can contain attributes similar to affixes and word roots.
Prefixes | Suffixes |
---|---|
Nic-, Ni : (Irish, Scottish) “daughter of” | -datter (Danish, Norwegian) “daughter (of)” |
M’, Mac, Mc, Mhic, Mic : (Irish, Scottish and Manx Gaelic) “son” | -son (English, Swedish, German, Norwegian, Icelandic) “son (of)” |
Every bit of forethought that goes into writing a persuasive essay can go into a script. Your code should endeavor to be an explanation as well as instructions.
The easier your code can be read, the easier it will be to debug.
Some things you should consider:
It can be really important to have foresight as to how the code you are writing can be utilized. One of the confusing things about R is that it is often used one time in a semi-disposable fashion.
This results in most R users not being accustomed to writing code using more modular, adjustable, extendable, robust, and adaptable methods.
Typical naming conventions in R suggest avoiding periods (unless assigning S3 methods), as well as using underscores and/or capitals to visually parse object name components. Considering this, it is rarely discussed how one might go about naming objects in the first place.
“There are only two hard things in Computer Science: cache invalidation and naming things.” - Phil Karlton
One method I sometimes utilize is to consider three components for every object:
An alternative might be:
CONST_leapyear_ivec / CONST_leapy_ivec / CST_leapy_vec / CST_lpy2k_v / CST_lpy_v / const_leapy_vec (local) / const_leapy (local) / etc.
output_results_dt / rslt_rbl_dt (using rbindlist) / rslt_featcols_dt / loop_results_dt / rslt_iterstore_dt / etc.
subset_lefttoproc_fvec / reductionjoin_leftover_fac / ljoin_remaining_Jfac / red_lftover_fac / ss_lover_fac / etc. (red_lov_f).
allcond_fullset_lm (all conditions, all outliers) / fit_allcon_lm / allcon_wo_ol_lm / majcon_fs_lm (only major conditions) / etc.
Use Ctrl+Shift+A to reformat code or Ctrl+I to auto indent code.
Shorten page width to 80 characters (important).
By breaking instructions down into smaller pieces you make code more understandable and easier to remember (called chunking). No one understands a run-on sentence, so why should anyone understand a run-on instruction? Having concise and organized lines of code can make a world of difference when trying to debug.
With dplyr | With magrittr | With data.table |
---|---|---|
hourly_delay <- | hourly_delay <- | hourly_delay <- |
filter( | flights %>% | flights[ |
summarise( | filter(!is.na(dep_delay)) %>% | !is.na(dep_delay), |
group_by( | group_by(date, hour) %>% | .( |
filter( | summarise( | delay = mean(dep_delay), |
flights, | delay = mean(dep_delay), | n = .N), |
!is.na(dep_delay)), | n = n() ) %>% | by = .(date, hour)][ |
date, hour), | filter(n > 10) | n > 10] |
delay = mean(dep_delay), | ||
n = n()), | ||
n > 10) |
hourly_delay <-
filter(summarise(group_by(filter(flights, !is.na(dep_delay)), date, hour),
delay = mean(dep_delay), n = n()), n > 10)
hourly_delay <-
flights %>%
filter(!is.na(dep_delay)) %>%
group_by(date, hour) %>%
summarise(delay = mean(dep_delay), n = n()) %>%
filter(n > 10)
hourly_delay <-
flights[!is.na(dep_delay),
.(delay = mean(dep_delay), n = .N),
by = .(date, hour)][n > 10]
These options allow the user to set and examine a variety of global options which affect the way in which R computes and displays its results.
Most options are not set by default (NULL), but R has a list of standard options on the help website. Also, you can set your own custom options if you wish. This can be useful if you want your custom functions to change their behavior in a global sense.
A quick way to look at all the current option settings:
.Options
str(.Options)
To get a single option setting use getOption("<option>"). This is often used in the formals / prototype / arguments of a function (more on that later).
options() can receive either named arguments or a named list, therefore many options can be set at the same time.
For example:
options(verbose = TRUE)
options(warn = 1)
options("error" = NULL)
option_list <- list(warn = 1, error = NULL)
options(option_list)
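As a rough sketch of the custom-option idea above, a function can read its default from an option and fall back to a built-in value when the option is unset. The option name mypkg.greeting and the function greet() are made up for illustration; they are not part of base R or any package.

```r
# 'mypkg.greeting' is a hypothetical option name invented for this sketch.
greet <- function(name,
                  greeting = getOption("mypkg.greeting", default = "Hello")) {
  paste0(greeting, ", ", name, "!")
}

old_opts <- options(mypkg.greeting = NULL)  # make sure the option starts unset
unset_result <- greet("Ada")                # falls back to the default

options(mypkg.greeting = "Howdy")           # change behavior globally
set_result <- greet("Ada")                  # now picks up the option

options(old_opts)                           # restore whatever was there before
```

Because the option is only consulted in the argument default, a caller can still override it for a single call with greet("Ada", greeting = "Hi").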
For more detailed information, the guides R-Studio Debugging and Debugging, condition handling, and defensive programming are helpful.
options(showWarnCalls = TRUE) if you wish.
test_it <- function() {inner_fn <- function() {stop("test it")} ; inner_fn()}
test_it()
> Error in inner_fn() : test it
> Calls: test_it -> inner_fn # Added the call trace
test_it <- function() { # Line 1, relative to the start of the function
a <- "apple" # Line 2
inner_fn <- function() { # Line 3
b <- "bungalow" # Line 4
stop("test it") # Line 5, the line that (from #5) refers to
} # Line 6
inner_fn() # Line 7
} # Line 8
test_it()
> Error in inner_fn() (from #5) : test it # "(from #5)" is added.
> Calls: test_it -> inner_fn
dump.frames(dumpto = "last.dump", to.file = FALSE)
debugger(dump = last.dump)
which will call the dumped environment as if one was using recover()
.traceback()
(shows the function nesting structure) two levels up from the scope in which the error was generated, yet does not load the debugger.
Sometimes you might want to change the options within a function temporarily. Using on.exit() can be a convenient way to undo whatever changes you might have made within the function. This also works for global objects.
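test_fn <- function(x) {
  old_opts <- options(digits = 4) # options() returns the previous settings
  on.exit(options(old_opts)) # restore them however the function exits
  print(x)
}
> test_fn(pi) # 3.142
> print(pi) # 3.141593 (with the default digits = 7)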
Undoing temporary changes is really important if you want to avoid writing code that will surprise / sabotage you later.
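A minimal sketch of that idea, assuming a hypothetical helper named format_short(): options() returns the previous values when it sets new ones, so saving that return value and handing it back inside on.exit() restores the caller's settings even if the function stops with an error.

```r
# format_short() is a hypothetical helper used only to illustrate the pattern.
format_short <- function(x, fail = FALSE) {
  old_opts <- options(digits = 4)         # options() returns previous settings
  on.exit(options(old_opts), add = TRUE)  # registered before anything can fail
  if (fail) stop("something went wrong")
  format(x)
}

digits_before <- getOption("digits")
short_pi <- format_short(pi)
failed <- try(format_short(pi, fail = TRUE), silent = TRUE)
digits_after <- getOption("digits")       # unchanged, despite the error
```

Registering on.exit() immediately after the change, rather than at the end of the function, is what makes the cleanup hold up when an error occurs part-way through.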
When writing functions, the better sense you have of how and where it will be used, the better chance you have of writing a bug free or easily debugged function.
You also want to have a good sense of what data is entering the function, and what results are returned (if any) once completed.
R functions can be categorized into 4 general tiers or levels of application.
For the purposes of this discussion, context refers to any idiosyncratic analysis scenario in which the analysis methods used are assumed, common, and/or particular to the area applied. Some examples of different contexts include, but are not limited to, fields of science, governments, companies, projects, and individuals. Within a context, a process can be simplified and streamlined for in-context use.
These 4 function structures can be explained with a tool analogy.
cut, %in%, intersect, setdiff, Reduce, mean, sd, etc.
lm, summary, describe {psych}, print, plot, as.data.table {data.table}, train {caret}.
.bincode (from cut), .Internal(match( (from %in%), forceAndCall( (from Reduce, etc.), .Internal(mean( (from mean), .Call(C_cov (from sd), .Call(C_Cdqrls (from lm), +, if, etc.
context_fun <- function() {
  process_fun <- function() {
    simple_fun <- function() {
      .binary_fun <- function() {
      }
    }
  }
}
For the purposes of this presentation the term “squawking” will refer to any printing, messages, warnings, or errors rendered in R.
“Not all problems are unexpected. When writing a function, you can often anticipate potential problems (like a non-existent file or the wrong type of input). Communicating these problems to the user is the job of conditions: errors, warnings, and messages.”
- Hadley Wickham [Advanced R]
stop() halts function execution.
try() and tryCatch() to “test” whether an error will occur as a result.
\n (newline), \t (tab), \r (carriage return), etc. to format outputs.
print() can be somewhat cleaner than using messages because print results are usually easier to read.
flush.console() in order to display the output to the console window.
paste() and paste0() are used for joining strings together. Learning to use the arguments collapse and sep is particularly important.
sprintf() is a powerful string creation tool that allows you to insert and format multiple numbers and strings into one string using a special syntax. Use example("sprintf") for more information.
format() is a commonly used function designed to format particular objects that have alternative forms, such as the date and time classes.
prettyNum() and formatC() use C style format specifications.
Every so often you will come across a function that squawks for reasons that you have chosen to ignore. When this happens you might wish to suppress any future interruptions, if only to make other notices less cluttered on the screen. You can use capture.output, suppressMessages, and suppressWarnings to accomplish this; however, use them sparingly. Suppress and capture functions are not selective about what messages they suppress. If you can manage to suppress only the conditions you want to ignore, you will be better off.
adele <- function(x) {
options(warn = 1) ; on.exit({options(warn = 0)}, add = TRUE)
cat("Hello, how are you?\n")
message("It's so typical of me to talk about myself I'm sorry.")
warning("I hope that you're well.")
as.numeric("Nope")
if (!missing(x))
stop("Did you ever make it out of that town where nothing ever happened?")
invisible("Yep")
}
adele() # With no suppressed squawking
# With the console output suppressed and saved in lyric, and the function's
# output saved as answer.
lyric <- capture.output({answer <- adele()}) ; lyric ; answer
# With all messages suppressed
suppressMessages(adele())
# With all warnings suppressed
suppressWarnings(adele())
# Function used to suppress ONLY the "NAs introduced by coercion" warning,
# letting all other warnings and errors through.
suppressNAwarn <- function(expr) {
withCallingHandlers(expr,
warning = function(w) {
if (grepl("NAs intro", w$message))
invokeRestart("muffleWarning")})}
suppressNAwarn(adele())
It is very common practice for programmers to provide a means of suppressing unnecessary messages or warnings from being displayed. This is most often accomplished by adding a “verbose” argument at the end of the formals pairlist.
verbose is a boolean option in R that is set to FALSE by default, and is referenced using the getOption("verbose") command. It is used by placing an if(verbose) {} in front of any message you would like to keep quiet by default. When debugging a process it can be helpful to display the content passing through such functions even though the actual issue might arise before or after the function in question. Many functions in base R have this feature built in.
loquacious_function <- function(person1 = 0, person2 = 0, data = NULL,
envir = parent.frame(),
verbose = getOption("verbose")) {
if (verbose) {
message("Hi, my name is Mr. Loquacious, and your 'person1' and 'person2'",
" values are, ", person1, ", ", person2, ", respectively.")
}
if (verbose && person1 >= 4) {
warning("Be careful, your 'person1' input value is ", person1,
" popped-collars cool. This could lead to issues with ",
"later date results.")
browseURL("http://www.maniacworld.com/four-popped-collars-cool.jpg")
}
if (person2 <= person1 - 3) {
warning("The person1 variable is 3 or more popped-collars greater",
" than the person2 variable. Reliable date results",
" cannot be obtained.")
}
if (person2 > person1) {
stop("An impossible situation has arisen.",
" Rechecking input values is suggested.")
}
(date_prob <- 0.5 - (person1 - person2) / 4)
}
loquacious_function(4,0)
Shiny has its own analogous verbose option called shiny.trace, set with options(shiny.trace = FALSE), and it is FALSE by default. Moreover, sometimes shiny squawks behind the scenes. That is, issues can arise during a period when shiny is communicating with the hosted webpage. In order to debug such problems the command debug(httpuv::service) can be helpful.
As mentioned before, there are different types of functions and each function has its own requirements for what is taken in and returned. It is the programmer’s responsibility to make sure that a function either works as expected or communicates the nature of the issue as soon and effectively as possible.
context_fun <- function() {
process_fun <- function() {
simple_fun <- function() {
.binary_fun <- function() {
}
}
}
}
That is, it is better to identify issues higher and earlier in the nested function structure; however, this can require a lot of foresight and planning beforehand.
?<function> and help("<function>")
example("<function>") can be a slick convenience shortcut to figure out how to use a function quickly.
str(<object>) can save you a lot of headache when you are trying to figure out what and where the sought-after pieces of a complex object reside. example("str") is very illuminating.
<function>, methods(<function>), and getAnywhere("<function>.<method>") when you are having trouble finding code. For example, try ?methods, methods("mean"), and getAnywhere("mean.default") to look up R method code.
I find Ctrl+Shift+F to be critically important:
Because context functions are applied to very specific situations, it is usually useful to make sure that whatever qualities are unique to the context are intact within the input.
context_fun <- function(x, ...) {
  if (!inherits(x, what = 'custom_bio_object'))
    stop("\nOnly objects generated by 'get_some_data' from the ",
         "{custom_bio} package can be used in this function.")
  if (length(x) > 1000)
    warning("The custom_bio_object supplied is very large, and the results ",
            "might take a few hours to process.")
  # time_estimate is assumed to have been computed earlier in the function.
  message("Process time estimated to be ", time_estimate, " hours.")
  process_fun <- function(...) {...
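A self-contained sketch of the same kind of context check, so it can be run and tested on its own. The constructor new_bio_object(), the checker check_context_input(), and the max_length threshold are hypothetical stand-ins; only the class name is borrowed from the fragment above.

```r
# new_bio_object() and check_context_input() are hypothetical stand-ins.
new_bio_object <- function(values) {
  structure(list(values = values), class = "custom_bio_object")
}

check_context_input <- function(x, max_length = 1000) {
  # Refuse anything that does not carry the context-specific class.
  if (!inherits(x, what = "custom_bio_object"))
    stop("Only objects of class 'custom_bio_object' can be used here.",
         call. = FALSE)
  # Warn, but carry on, when the input is unusually large.
  if (length(x$values) > max_length)
    warning("The object supplied is very large; processing may be slow.",
            call. = FALSE)
  invisible(x)
}

accepted <- check_context_input(new_bio_object(1:10))
rejected <- tryCatch(check_context_input(1:10),
                     error = function(e) conditionMessage(e))
```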
Not all functions have to be named. If you believe a function will only be applied once in a particular context you do not need to name it anything memorable, and you can keep using the same name repeatedly.
# Anonymous function
function() NULL # meant to be ephemeral
# Example of lapply with an anonymous function applied to each item in the list.
lapply(some_list, function(x) as.vector(unlist(bootstrap(boundary_trim(x)))))
# This can get somewhat hard to read though. For this reason, I usually like to
# temporarily assign a function to 'f'.
# Quasi-anonymous function
f <- function(x) as.vector(unlist(bootstrap(boundary_trim(x))))
# or
f <- function(x) boundary_trim(x) %>% bootstrap %>% unlist %>% as.vector
lapply(some_list, f)
One benefit of this is that the code can be much clearer to the reader. Additionally, you are much less likely to make mistakes with parentheses or copy-pasting when your expression is simplified with f().
The temporary assignment method for quasi-anonymous functions has an unexpected benefit. Using the assignment operator <- to reassign a temporary / holder value can sometimes create problems. This problem often arises when loops (such as for, while, and repeat) are used with holder objects that are intended to be reassigned / replaced / overwritten by a subsequent loop iteration. These holder objects do not always receive the new object due to a malfunction within the loop. This is why it is common to see loops with object1<-object2<-object3<-NULL at the beginning or end of their loop closure, because re-initializing ensures subsequent iterations will not use outdated objects. The nice thing about f <- function(){} is that it will reliably re-assign f() or you will receive an error.
holder_object <- NULL
for(i in 1:3) {
object1 <- func1(holder_object)
object2 <- func2(object1)
object3 <- func3(object2)
holder_object <- funcy(object3)
object1 <- object2 <- object3 <- NULL
}
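To make the stale-holder problem concrete, here is a small sketch in which one iteration fails inside try(). The function parse_item() is hypothetical and exists only to fail on the second pass; the two loops differ only in whether the holder is re-initialized.

```r
# parse_item() is a hypothetical step that fails on the second iteration.
parse_item <- function(i) {
  if (i == 2) stop("bad input")
  paste0("item_", i)
}

# Without re-initializing, a failed iteration silently reuses the old value.
stale <- character(3)
holder <- NA_character_
for (i in 1:3) {
  try(holder <- parse_item(i), silent = TRUE)
  stale[i] <- holder
}

# Re-initializing the holder at the top of each iteration exposes the failure.
fresh <- character(3)
for (i in 1:3) {
  holder <- NA_character_
  try(holder <- parse_item(i), silent = TRUE)
  fresh[i] <- holder
}
```

In the first loop the second element quietly repeats "item_1"; in the second loop it is NA, which is much easier to spot downstream.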
Loops often get a bad rap in R because they are widely considered to be slow, difficult to manage, and error prone. Out of these three criticisms the only assertion I partly agree with is that loops can sometimes be harder to manage.
ask <- function (envir = parent.frame()) {
cat("\rAre you ready to debug yet (y or n)? [i =", i, ']')
ans <- readLines(con = stdin(), n = 1)
if (length(intersect(ans, c("y")))) with(envir, {browser()})
return(invisible(NULL))
}
for (i in iterseq <- seq(3L)) {
ask() ## It can ask you
if (i <= 1) print(Sys.time()) ; flush.console()
if (i == 2) message("Last run")
if (i >= max(iterseq)) browser() ## Conditional browsing
warning("For loop warning")
}
In such situations breakpoints are not convenient because you might not want the debugger to run until after some number of iterations have been completed.
The if (TRUE) browser() method also works for functions in situations where abnormalities are hard to duplicate in a contrived manner.
In addition to input validation, process functions typically go through many layers of functional checks and analyses whereby internal results can trigger decisions as to how to continue towards the function’s intended purpose. It can be helpful for these functions to (1) tell the user what decisions have been or are being made, (2) provide progress reports, (3) ask the user how to proceed, (4) warn the user about dubious results, and (5) explain why a process failed to achieve a result.
...
process_fun <- function(x1, ...) {
# Step 1: Decide how to proceed based on argument and data specifics.
# Step 2: Test to see if the input adheres to the required format.
# Step 3: Coerce data structure if necessary.
# Step 4: Test if resultant coerced data is viable.
# Step 5: Choose what simple function(s) are necessary to derive a result.
# ...
# glm() advanced example of a powerful process function
simple_fun <- function(x2, ...) {
# glm.fit() is a good example of a simple function that borders on
# process level complexity, and it is the function that glm() actually calls to
# start the calculation.
...
Their role is usually as simple as coercing and validating input information, while reporting anomalies or malfunctions. Where these functions can get complicated is when argument settings can affect how coercion and validation are performed, or even which .binary function(s) are used.
simple_fun <- function(x, na.rm = FALSE, unlists = FALSE, scaler = 1) {
# Step 1: Test, coerce, and validate inputs.
if (!na.rm && anyNA(x)) stop("NA detected within vector, and simple_fun cannot ",
"complete calculations unless na.rm = TRUE is set.")
# Step 2: Reformat results based on function arguments, and revalidate.
# Step 3: Select which optimized binary function(s) to use.
.binary_fun1 <- function(x2) {...}
.binary_fun2 <- function(x2) {...}
...
}
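The outline above leaves its steps elided, so here is one way the test, coerce, and validate sequence might look when written out in full. The function scaled_mean() is a hypothetical example, not something from base R or the original code; it borrows only the na.rm and scaler argument names.

```r
# scaled_mean() is a hypothetical example, not a function from any package.
scaled_mean <- function(x, na.rm = FALSE, scaler = 1) {
  # Step 1: Coerce, then test and validate inputs.
  x <- unlist(x, use.names = FALSE)
  if (!is.numeric(x)) stop("'x' must contain only numeric values.")
  if (!is.numeric(scaler) || length(scaler) != 1L)
    stop("'scaler' must be a single number.")
  if (!na.rm && anyNA(x))
    stop("NA detected within vector; set na.rm = TRUE to drop missing values.")
  # Step 2: Hand the validated input to the lower-level routine.
  mean(x, na.rm = na.rm) * scaler
}

from_list <- scaled_mean(list(1, 2, 3), scaler = 2)
with_na   <- scaled_mean(c(1, NA, 3), na.rm = TRUE)
na_error  <- tryCatch(scaled_mean(c(1, NA, 3)),
                      error = function(e) conditionMessage(e))
```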
Using descriptive argument labels can help make setting selections more intuitive; however, shorter argument names are more convenient when you are trying to keep your lines short. Loading function arguments by position can simplify your code, but may affect referential integrity.
I sometimes find it useful to set default function arguments to fictitious object names which I do not intend to define. Sometimes you can pick the fictitious object name to be an abbreviated version of the intended object input: up_src_ext_vec could be uploaded_sourced_external_vector.
I use these names as an extra reminder as to what a function argument does and how to debug it, before any formal documentation exists. Moreover, providing a number of options in your function’s arguments can be a helpful reminder of the acceptable options.
simple_fun <-
function(data, FUN = Vectorized_delimitation, # VD short description
recode_key = string_or_list_format,
# RK long drawn out and usually useless description
na.rm = c(TRUE, FALSE, "mean_replace",
"median_replace", "twod_interpolate"),
bin_method = c("inf_right", "inf_left", "inf_bounds",
"inclusive", "exclusive", "95ci", ".bincode"),
rtrn_objects = c("simple_result", "summary", "ci", "bin_vec"),
...) { # ... pass to .bincode
na.rm <- match.arg(na.rm)
bin_method <- match.arg(bin_method)
rtrn_objects <- match.arg(rtrn_objects, several.ok = TRUE)
# match.arg's 'several.ok' argument is useful for simplifying code by
# reducing the number of arguments necessary when a number of options are
# not mutually exclusive.
if (bin_method == ".bincode") bin_method <- .bincode(...)
...
output <- list()
if (pmatch("sim", rtrn_objects, 0L)) output$result <- final_rslt
if (pmatch("sum", rtrn_objects, 0L)) output$summary <- info_smry
if (pmatch("ci" , rtrn_objects, 0L)) output$conf_int <- confint_rslt
if (pmatch("bin", rtrn_objects, 0L)) output$bin_vec <- bc_rslt
return(output)
}
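Since the function above depends on fictitious objects by design, a smaller runnable sketch may help show how match.arg() behaves. The function describe_vec() and its argument values are invented for this illustration.

```r
# describe_vec() and its option names are hypothetical.
describe_vec <- function(x, centre = c("mean", "median"),
                         rtrn_objects = c("value", "n", "range")) {
  centre <- match.arg(centre)  # first choice is the default
  rtrn_objects <- match.arg(rtrn_objects, several.ok = TRUE)
  output <- list()
  if ("value" %in% rtrn_objects)
    output$value <- switch(centre, mean = mean(x), median = median(x))
  if ("n" %in% rtrn_objects) output$n <- length(x)
  if ("range" %in% rtrn_objects) output$range <- range(x)
  output
}

everything <- describe_vec(c(1, 2, 10))
partial <- describe_vec(c(1, 2, 10), centre = "med",
                        rtrn_objects = c("value", "n"))
rejected <- tryCatch(describe_vec(1, centre = "mode"),
                     error = function(e) "rejected")
```

Listing the choices in the formals keeps the acceptable values visible to anyone reading the signature, and match.arg() raises an error for anything outside that list.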
Referencing the actual default object name is sometimes easier (especially with functions inside of functions); however, I find myself prone to either (1) unwittingly changing the referenced object’s name, thereby breaking the function, or (2) forgetting an argument even exists while I am trying to use the function in another script.
I often find chained if-else statements difficult to read and debug. This is especially the case when those if-else closures have even more if-else statements nested within them. People often have trouble keeping such code organized, and it only takes a little mistake to cause a huge problem later on.
For this reason it can sometimes be helpful to use switch statements instead of if-else chains.
x <- FALSE
choice <-
if(is.na(x)) {
"bop_it"
} else if(x > 0) {
"twist_it"
} else if(is.numeric(x)) {
if(x < 0) {
"pull_it"
} else {
"flip_it"
}
} else if(is.name(x)) {
"shout_it"
} else if(is.character(x)) {
"throw_it"
} else "ifelse_dropout"
result <- switch(
choice,
"bop_it" = {1},
"twist_it" =, "pull_it" = {
choice <- NULL
(x + 1)
},
"shout_it" = {3},
"throw_it" = {4},
{0})
With .binary functions you want to limit the errors to situations where there is some mathematically necessary requirement, such as avoiding division by zero or unequal vector lengths. It should be noted that binary errors are the most difficult to debug because of the depth of their nesting. Therefore, adding errors into binary functions should be considered minimally sufficient error reporting: better than a stock C++ error at least. The example below uses tryCatch to help make the explanation for the error more clear.
simple_fun <- function() {
...
.binary_fun <- function() {
if(denominator == 0) {
stop("Division by zero produces empty set")
} # Often written in C++
}
result <- tryCatch(.binary_fun(), error = function(e) e)
if (inherits(result, what = "simpleError"))
stop("The function ", as.list(sys.call())[[1]],
"() has received data from which it is impossible to ",
"derive a finite answer.\n", conditionMessage(result))
...
}
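A runnable sketch of the same wrapping pattern, assuming two hypothetical functions: .safe_divide() plays the role of the low-level binary function and ratio() is the simple function that re-raises its error with more context.

```r
# .safe_divide() and ratio() are hypothetical names for this sketch.
.safe_divide <- function(numerator, denominator) {
  if (denominator == 0) stop("division by zero")
  numerator / denominator
}

ratio <- function(numerator, denominator) {
  tryCatch(
    .safe_divide(numerator, denominator),
    error = function(e) {
      # Re-raise with an explanation, keeping the original message.
      stop("ratio() has received data from which it is impossible to derive ",
           "a finite answer: ", conditionMessage(e), call. = FALSE)
    }
  )
}

good_ratio <- ratio(6, 3)
bad_ratio_msg <- tryCatch(ratio(1, 0), error = function(e) conditionMessage(e))
```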
Out of all the debugger functions R-Studio has to offer, the one I use 95% of the time is debugonce(). It is simple and easy, and you don’t have to turn it off as you do with the debug() and undebug() pair.
f1 <- function(x) paste0(x,"_f1")
f2 <- function(x) paste0(x,"_f2")
f3 <- function(x) paste0(x,"_f3")
debugonce("f3")
f1(f2(f3("Start")))
The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment.
deeper <- function(x) {movie <<- "Office Space" ; paste0(x,"_deeper")}
and_deeper <- function(x) paste0(x,"_and_deeper")
way_down <- function(x) paste0(x,"_way_down")
way_down(and_deeper(deeper("relax")))
movie
So <<- can be used to save one or more objects for temporary viewing much like last.dump does; however, it is important to make sure the name you choose is unique. Using <<- for this reason assigns the chosen object in the search path so all child / nested environments will be able to reference it. This can create non-obvious dependencies between functions, so <<- is usually only considered safe within functions.
From closures:
new_counter <- function() {
i <- 0
function() {
i <<- i + 1
i
}
}
new_counter()
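Calling new_counter() on its own only returns the inner function. A short usage sketch, with the counter names chosen here for illustration, shows that each closure keeps its own i and that <<- updates that enclosing copy rather than creating a global.

```r
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1  # modifies the i in the enclosing function environment
    i
  }
}

# counter_a and counter_b are illustrative names; each has a private i.
counter_a <- new_counter()
counter_b <- new_counter()

a_first  <- counter_a()
a_second <- counter_a()
b_first  <- counter_b()  # unaffected by the calls to counter_a()
```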
Sometimes it is easier / quicker to use index numbers when referencing items inside of an object; however, in the long run using labels is far easier to read because over time we forget what the index numbers are referencing. The following are some example dos and don’ts:
library("data.table")
ivec <- setNames(seq(26L), letters[])
l <- list(u_case = LETTERS[], l_case = letters[], int = seq(26L),
num = seq(0, 1, length.out = 26))
dt <- as.data.table(l) ; df <- as.data.frame(l) # mtx <- as.matrix(df)
# Integer vector
ivec[letters[1:4]] ; ivec[['c']] # NOT ivec[1:4] or ivec[[3]]
# Mixed list
l[c('int','num')] ; l[['l_case']] # NOT l[c(3, 4)] or l[[2]]
# data.table
setkey(dt, u_case)
# data.table by setkey and column name.
dt[c('A', 'Z'), c('l_case', 'num'), with = FALSE] # OR
dt[c('A', 'Z'), .SD, .SDcols = c('l_case', 'num')]
# NOT
dt[c(1, 26), c(2, 4), with = FALSE] # OR
dt[c('A', 'Z'), .SD, .SDcols = c(2, 4)]
# data.frame
row.names(df) <- LETTERS[]
df[c('A','Z'), c('l_case', 'num')]
# NOT
df[c(1, 26), c(2, 4)]
Sometimes it is fine to use indexes, especially when items inside of an object are not named and in an arbitrary order. However, you should at least label the index vector with something descriptive.
the_cool_kids <- c(1, 3, 5, 7, 9)
math_class_students <- c('Chad', 'Albert', 'Trent', 'Niels', 'Brad',
'Leonardo', 'Guy', 'Nikola', 'Brody')
math_class_students[the_cool_kids]
Coding in sections using Code Folding and Sections like #### #### can be really helpful for organization.
The commands Ctrl+Alt+B and Ctrl+Alt+T can make it really easy to test changes.
Also, comments should explain the ‘why’, not the ‘what’. It is not a good idea to flood your code with useless comments because you – or the person trying to read your code – will be saturated with obvious comments and might miss the comments that are really important.
Don’t be afraid of NA (NA_integer_, NA_real_, NA_complex_, NA_character_), NaN, missing, or Inf values.
Non-finite values can result from code malfunctions, but they are often just placeholders for values that never existed or couldn’t be calculated. When problems arise, the type and location of an object’s non-finite values can provide valuable clues as to what might be going wrong. It is better to write functions that handle unacceptable values appropriately instead of always omitting them. At a minimum you can use functions like na.omit(), which pass result attributes that could be captured and displayed as messages or warnings.
x <- c(1, 2, NA, 5)
na.omit(x)
# [1] 1 2 5
# attr(,"na.action")
# [1] 3
# attr(,"class")
# [1] "omit"
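One way the na.action attribute shown above might be captured and turned into a message is sketched here. The wrapper name drop_na_loudly() is hypothetical.

```r
# drop_na_loudly() is a hypothetical wrapper around na.omit().
drop_na_loudly <- function(x) {
  cleaned <- na.omit(x)
  dropped <- attr(cleaned, "na.action")  # positions removed, or NULL if none
  if (!is.null(dropped)) {
    message(length(dropped), " NA value(s) removed at position(s): ",
            paste(as.vector(dropped), collapse = ", "))
  }
  as.vector(cleaned)  # strip the attributes before returning
}

kept <- drop_na_loudly(c(1, 2, NA, 5))
```

Reporting the positions rather than silently dropping them keeps the clue about where the missing values sat in the original object.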
Many functions are even designed to target NA values for imputation such as na.approx(), na.spline(), and the excellent multiple imputation package {mi}.
R-Studio’s Official Debugging Guide
Hadley’s Advanced R “companion website” to the book
Advanced R Chapter: Exceptions and debugging
What Can We (R Programmers) Learn from Software Engineers?
Predictive Analytics World (You will have to register for a free account)
Google’s R Style Guide (I personally divert from this quite a bit, but it is a nice starting point.)
Debugging data.table?
Debugging parallel processes?
Advanced environment management?