RDocco.R |
|||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Introduction |
|
||||||||||
RDocco is an R-language port of Docco, the quick-and-dirty, literate-programming-style, documentation generator. It produces human-readable HTML that displays your comments alongside your code. |
|
||||||||||
RDocco supports Markdown, a lightweight markup language, for formatting of in-source comments. For example,
|
|
||||||||||
This script also automatically formats Roxygen2-style R comments, so that this function header |
|
||||||||||
|
|
||||||||||
renders as: |
|
||||||||||
Multiply two numbers togetherThis function calculates the product of two numbers.
|
|
||||||||||
|
|
||||||||||
This script was modeled after Jocco, a Docco port for the Julia language. Jocco uses external tools for syntax highlighting and Markdown rendering, but we opt for existing R-packages to handle these tasks. |
|
||||||||||
Presently, RDocco supports MathJax rendering of
\(\LaTeX\) math equations using the delimiters |
|
||||||||||
Actual Code CommentsLoad supporting packages. |
library("stringr") |
||||||||||
Initialize HTML templatesThe basic HTML structure of the output document is a 2–column table where
each row consists of a comment-code pairing. These three string constants
make an HTML sandwich. |
.HEADER <- [6081 chars quoted with '"'] |
||||||||||
Extract comments and code chunks from a .R file
|
.ParseSource <- function(src) { |
||||||||||
Read in the lines of the input script and initalize values. |
lines <- readLines(src) |
||||||||||
Regular expression for matching comments |
rx_COMMENT <- perl("^\\\\s*(?:(#)+(')?(\\\\s.*)?\\\\s*$)") |
||||||||||
Search for a comment match on each line of the input file. |
for (line in lines) { |
||||||||||
Extract lines that match the comment pattern. |
match <- str_extract_all(line, rx_COMMENT) |
||||||||||
If the line only matches an empty string, set it to an empty character vector. |
if (length(match) > 0 && match == "") { |
||||||||||
If there are no matches, then there is code-like text here. Store the line as code text. |
if (length(match) == 0) { |
||||||||||
Otherwise, if there is a match to the comment pattern, and the previous line contained code, update the code texts and reset initialization values. |
if (has_code) { |
||||||||||
Update the documentation with the current line. |
doc_line <- match |
||||||||||
Append the final code and documentation chunks to their text vectors. Trim off any whitespace from the documentation strings, but keep it in the code (for syntax indentation). |
code <- c(code, code_text) |
||||||||||
Replace pairs of empty lines in the code with a single empty line. |
rx_TWO_RETURNS <- perl("\\\\n\\\\s*\\\\n") |
||||||||||
Roxygen comments are a special kind of comment that start with |
rx_ROXYGEN <- perl("^(\\\\s*(#')+\\\\s*(.)*)+") |
||||||||||
Remove |
rx_INITIAL <- perl("^\\\\s*#\\\\s+") |
||||||||||
Highlight R codeWe can only highlight syntactically valid R code, and most of the individual chunks are not valid. If we put all the chunks back together, then we can highlight the R code (assuming the input file contained valid code). In order to break apart the highlighted code, we need to mark each place where we concatenated chunks together and then break up the code at those markers places. We also need to preserve newline characters in strings, so that multi-line strings are not collapsed onto super-long single lines.
|
.HighlightCode <- function(code_array) { |
||||||||||
Preserve escaped newline characters by marking them. |
sep_part_1 <- "#%BREAK%" |
||||||||||
Combine the chunks of R code together, marking where chunks are concatenated together. |
code_sep <- "# CUT HERE\\n" |
||||||||||
Write the code to a temporary file so it can be syntax-highlighted. |
con <- tempfile() |
||||||||||
Split the code at the newline-preserving markers and the chunk-concatenating markers. |
sep_in_comment <- str_c("<span class=\\"comment\\">", sep, "</span>") |
||||||||||
Parse a Roxygen commentThis function parses a Roxygen comment and extracts the first line of the comment, any introductory text, and the names and values of any @-tags in the comment. The parsing function below was adapted from the parsing function in Roxygen2. About RoxygenDocumentation for R packages takes place in special Roxygen-style comments seem like a pretty good practice for R coding, because they succinctly describe pertinent information about functions in a structured way. They are also supported by RStudio.
|
.ParseRoxygen <- function(lines) { |
||||||||||
Pattern that distinguishes a Roxygen comment from a normal comment. |
rx_LINE_DELIMITER <- '\\\\s*#+\\' ?' |
||||||||||
Does the string contain no matter beside spaces? |
.is.null.string <- function(string) { |
||||||||||
Grab lines starting with the Roxygen pattern and trim trailing (right-sided) whitespace |
delimited_lines <- lines[str_detect(lines, rx_LINE_DELIMITER)] |
||||||||||
Split at line breaks. |
lines <- unlist(str_split(delimited_lines, rx_LINE_DELIMITER)) |
||||||||||
Return an empty list if the trimmed lines are empty. Otherwise merge the lines together. |
if (length(trimmed_lines) == 0) return(list()) |
||||||||||
Split the Roxygen comment into @-labels chunks |
elements <- strsplit(joined_lines, '(?<!@)@(?!@)', perl = TRUE)[[1]] |
||||||||||
Parse the first line of a Roxygen comment block and any text that follows |
.ParseIntro <- function(expression) { |
||||||||||
Parse |
.ParseElement <- function(element) { |
||||||||||
Insert Roxygen-style comments into Markdown templatesRDocco applies Markdown formatting to the content of a Roxygen comment. Specifically, the first line of a Roxygen block is formatted as an Note: In order to write those @-signs in this Roxygen comment, I had to double up on them as an escape: TODO
|
.DressUpRoxy <- function(roxy_lines) { |
||||||||||
Support function for cleaning up Roxygen lines before formatting them |
.FetchLines <- function(line) { |
||||||||||
Remove line breaks |
line <- str_replace_all(line, "\\\\s*\\\\n\\\\s*", " ") |
||||||||||
Determine what kind of Roxygen comment each line is. Then start processing
lines by extracting and combining |
line_class <- unlist(attributes(roxy_lines), use.names = FALSE) |
||||||||||
Since we are combining elements into one |
roxy_lines[param_lines[-1]] <- NULL |
||||||||||
Extract |
line_class <- unlist(attributes(roxy_lines), use.names = FALSE) |
||||||||||
The first line of the Roxygen block gets formatted as an |
line_class <- unlist(attributes(roxy_lines), use.names = FALSE) |
||||||||||
Extract any |
line_class <- unlist(attributes(roxy_lines), use.names = FALSE) |
||||||||||
Since we are combining elements into one |
roxy_lines[TODO_lines[-1]] <- NULL |
||||||||||
All other Roxygen comments are considered paragraphs of an “introduction” text, so these are combined together. |
line_class <- unlist(attributes(roxy_lines), use.names = FALSE) |
||||||||||
Convert Markdown-formatted strings into HTML
|
.MarkItDown <- function(doc_array) { |
||||||||||
Generate literate documentation from Markdown-formatted comments in an R scriptThis is the main formatting function.
|
Doccofy <- function(src) { |
||||||||||
Extract title from filepath. Set the title in |
title <- unlist(str_split(src, "/")) |
||||||||||
Parse the input file into code chunks, plain-comment |
parsed <- .ParseSource(src) |
||||||||||
Format documentation text in each set of comments. Merge |
rdocs <- Map(.ParseRoxygen, rdocs) |
||||||||||
Replace escaped backslashes with HTML entities. This lets us do \(\LaTeX\), but we might (?) face problems rendering escape characters in our documentation. |
docs <- str_replace_all(docs, "\\\\\\\\", "\") |
||||||||||
Highlight code. Remove unnecessary line-breaks at the ends of code chunks. Preserve escape sequences in strings by doubling up on escape characters. |
code <- .HighlightCode(code) |
||||||||||
An alternative string replacement function that doesn't use regular expressions. |
str_replace_gsub <- function(string, pattern, replacement) { |
||||||||||
Plug the formatted documentation and highlighted code chunks together into
our |
formatted <- character(length(code)) |
||||||||||
Put our HTML sandwich together and output the final file. |
formatted <- str_c(formatted, collapse = "") |