Source code access is one of the great benefits of R. Source code is available for base R and over 5,000 open source packages. There are many reasons to view source code: to know what software does when documentation is vague or incomplete; to combine code objects in custom scripts or libraries; and to change source code as needed. The following post defines the different types of R source code available and how to access R sources.
Viewing Sources
The simplest way to view function source code is to type the function name (without parentheses) followed by enter:
    > array
    function (data = NA, dim = length(data), dimnames = NULL) 
    {
        if (is.atomic(data) && !is.object(data)) 
            return(.Internal(array(data, dim, dimnames)))
        data <- as.vector(data)
        if (is.object(data)) {
            dim <- as.integer(dim)
            if (!length(dim)) 
                stop("'dims' cannot be of length 0")
            vl <- prod(dim)
            if (length(data) != vl) {
                if (vl > .Machine$integer.max) 
                    stop("'dim' specifies too large an array")
                data <- rep_len(data, vl)
            }
            if (length(dim)) 
                dim(data) <- dim
            if (is.list(dimnames) && length(dimnames)) 
                dimnames(data) <- dimnames
            data
        }
        else .Internal(array(data, dim, dimnames))
    }
    <bytecode: 0x7f7fd1758240>
    <environment: namespace:base>
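The printed definition can also be captured programmatically, which helps when you want to search or save it rather than read it on screen. The following is a minimal sketch using only base R functions (formals(), body(), deparse()); the variable names are illustrative.

```r
# Inspect the pieces of a function without printing the whole definition.
arg_names <- names(formals(array))   # the argument names
fn_body   <- body(array)             # the body as an unevaluated call
src_lines <- deparse(array)          # the definition as a character vector

# Search the captured text, e.g. for hand-offs to internal code.
uses_internal <- any(grepl(".Internal", src_lines, fixed = TRUE))
```

Because deparse() returns plain text, the result can be passed to writeLines() or grepl() like any other character vector.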
This simple approach works well with R packages that have simple code structures. Unfortunately, printing many functions doesn’t return the actual source code but instead shows a call to UseMethod(); or a bit of source code is shown, but it’s obvious the real source lies elsewhere. In the latter case, the printed result typically contains .External, .Internal, .Primitive, .C, .Fortran, etc., marking the point where a call into compiled code is made. For example:
    # Source for function 1
    > mean
    function (x, ...) 
    UseMethod("mean")
    <bytecode: 0x7f7fe25abbc8>
    <environment: namespace:base>

    # Source for function 2
    > matrix
    function (data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL) 
    {
        if (is.object(data) || !is.atomic(data)) 
            data <- as.vector(data)
        .Internal(matrix(data, nrow, ncol, byrow, dimnames, missing(nrow), 
            missing(ncol)))
    }
    <bytecode: 0x7f7fdda3eee0>
    <environment: namespace:base>

    > .Internal
    function (call)  .Primitive(".Internal")

    > .Primitive
    function (name)  .Primitive(".Primitive")
Functions that hand off to additional sources can be followed to those sources, and the process is relatively simple.
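Before following a function, it helps to know which kind of hand-off it makes. The sketch below is a rough, text-based heuristic rather than an official API: describe_function() is a hypothetical helper that looks for the markers mentioned above in the deparsed body.

```r
# Hypothetical helper: report how a function hands off to other code.
# It only searches the deparsed body for well-known markers, so treat
# the result as a hint, not a guarantee.
describe_function <- function(f) {
  if (is.primitive(f)) return("primitive")
  src <- paste(deparse(body(f)), collapse = "\n")
  if (grepl("UseMethod(", src, fixed = TRUE)) return("S3 generic")
  if (grepl("standardGeneric(", src, fixed = TRUE)) return("S4 generic")
  if (grepl(".Internal(", src, fixed = TRUE)) return("calls .Internal")
  if (grepl("\\.(C|Call|External|Fortran)\\(", src)) return("calls compiled code")
  "plain R code"
}

describe_function(sum)     # a primitive
describe_function(mean)    # dispatches with UseMethod
describe_function(matrix)  # hands off to .Internal
```

The order of the checks matters: primitives have no body to deparse, so they are tested first.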
The S3 Method Dispatch System
Methods that belong to a generic in the older S3 system are listed using the methods() function. getAnywhere() will then display the source code for a named method, even when that method is not exported from its package. For example:
    > methods(mean)
     [1] mean.Date     mean.default  mean.difftime mean.IDate*   mean.POSIXct  mean.POSIXlt  mean.times*   mean.yearmon*
     [9] mean.yearqtr* mean.zoo*

       Non-visible functions are asterisked

    > getAnywhere(mean.times)
    A single object matching ‘mean.times’ was found
    It was found in the following places
      registered S3 method for mean from namespace chron
      namespace:chron
    with value

    function (x, trim = 0, weight = rep(1, length(x)), na.ok = TRUE, ...) 
    {
        if (!missing(weight) && length(weight) != length(x)) 
            stop(paste("weights must have same length as", deparse(substitute(x))))
        att <- attributes(x)[c("format", "origin", "class")]
        nas <- is.na(x)
        if (!na.ok && any(nas, is.na(weight))) 
            return(structure(NA, format = att$format, origin = att$origin, 
                class = att$class))
        if (na.ok) {
            x <- x[!nas]
            if (!missing(weight)) 
                weight <- weight[!nas]
        }
        if (trim > 0) {
            if (trim >= 0.5) 
                return(median(x))
            n <- length(x)
            i1 <- floor(trim * n) + 1
            i2 <- n - i1 + 1
            i <- sort.list(x, unique(c(i1, i2)))[i1:i2]
            weight <- weight[i]
            x <- x[i]
        }
        if (any(weight < 0)) 
            stop("weights must be non-negative")
        if (sm <- sum(weight)) 
            out <- sum(unclass(x) * (weight/sm))
        else out <- rep(0, length(x))
        structure(out, format = att$format, origin = att$origin, class = att$class)
    }
    <environment: namespace:chron>
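When the generic and class are already known, the method can also be fetched as an ordinary function object instead of being printed. A small sketch using getS3method() and methods() from the utils package, which is attached by default; the variable names are illustrative.

```r
# Retrieve one S3 method as a function object, exported or not.
mean_default <- getS3method("mean", "default")
head(deparse(mean_default), 3)   # first few lines of its source

# The reverse lookup: methods defined for a given class.
date_methods <- methods(class = "Date")
```

Since mean_default is a regular function, it can be inspected with formals() and body() or called directly.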
The S4 Method Dispatch System
The S4 system is a newer method dispatch system and is an alternative to the S3 system. Here is an example of an S4 generic function:
    > library(Matrix)
    Loading required package: lattice
    > chol2inv
    standardGeneric for "chol2inv" defined from package "base"

    function (x, ...) 
    standardGeneric("chol2inv")
    <bytecode: 0x000000000eafd790>
    <environment: 0x000000000eb06f10>
    Methods may be defined for arguments: x
    Use showMethods("chol2inv") for currently available ones.
The source code is not shown, but the output offers a lot of information. The phrase standardGeneric is an indicator of an S4 generic function. The command to list the S4 methods defined for it is showMethods():
    > showMethods(chol2inv)
    Function: chol2inv (package base)
    x="ANY"
    x="CHMfactor"
    x="denseMatrix"
    x="diagonalMatrix"
    x="dtrMatrix"
    x="sparseMatrix"
getMethod() can then be used to see the source code of one of the methods:
    > getMethod("chol2inv", "diagonalMatrix")
    Method Definition:

    function (x, ...) 
    {
        chk.s(...)
        tcrossprod(solve(x))
    }
    <bytecode: 0x000000000ea2cc70>
    <environment: namespace:Matrix>

    Signatures:
            x               
    target  "diagonalMatrix"
    defined "diagonalMatrix"
Some generics dispatch on more than one argument, so the call that retrieves a method must match that method’s longer signature. For example:
    > require(raster)
    > showMethods(extract)
    Function: extract (package raster)
    x="Raster", y="data.frame"
    x="Raster", y="Extent"
    x="Raster", y="matrix"
    x="Raster", y="SpatialLines"
    x="Raster", y="SpatialPoints"
    x="Raster", y="SpatialPolygons"
    x="Raster", y="vector"
To see the source code for one of these methods, the entire signature must be supplied.
    getMethod("extract", signature = c(x = "Raster", y = "SpatialPolygons"))
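The same lookup can be tried without installing Matrix or raster. The sketch below defines a toy S4 generic with a two-argument signature and then retrieves its method; combine2 and its method are hypothetical names invented for illustration, while setGeneric(), setMethod(), existsMethod(), and getMethod() come from the methods package.

```r
library(methods)

# Toy S4 generic that dispatches on two arguments (hypothetical name).
setGeneric("combine2", function(x, y) standardGeneric("combine2"))
setMethod("combine2", signature(x = "numeric", y = "character"),
          function(x, y) paste(x, y))

# Confirm a method exists for the full signature, then retrieve it.
has_method <- existsMethod("combine2",
                           signature(x = "numeric", y = "character"))
m <- getMethod("combine2", signature(x = "numeric", y = "character"))
m   # prints the method definition and its signature
```

Checking with existsMethod() first avoids the error that getMethod() raises when no method matches the signature.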
Compiled Package Code
If you want to view the source of compiled code in an R package, you will need to download and unpack the package source. A package’s source code is available from the same CRAN repository that the package was originally installed from. The download.packages() function can get the package source for you.
    download.packages(pkgs = "solaR",
                      destdir = ".",
                      type = "source")
This will download the source version of the solaR package and save the corresponding .tar.gz file in the current directory. Source code for compiled functions can be found in the src directory of the unpacked archive. It is possible to combine the download and unpack steps into a single call (note that only one package at a time can be downloaded and unpacked in this way):
    untar(download.packages(pkgs = "solaR",
                            destdir = ".",
                            type = "source")[, 2])
Alternatively, if the package development is hosted in a public archive (e.g. via GitHub, R-Forge, or RForge.net), you can browse the source code online.
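Once a package source has been unpacked locally, the compiled sources can be listed from R. This is a minimal sketch: list_compiled_sources() is a hypothetical helper, and the file extensions it matches are an assumption about common C, C++, and Fortran naming rather than an exhaustive list.

```r
# Hypothetical helper: list compiled-language source files in an
# unpacked package directory. Returns character(0) when the package
# has no src directory, i.e. it contains no compiled code.
list_compiled_sources <- function(pkg_dir) {
  src_dir <- file.path(pkg_dir, "src")
  if (!dir.exists(src_dir)) return(character(0))
  list.files(src_dir,
             pattern = "\\.(c|cc|cpp|h|f|f90)$",
             recursive = TRUE, ignore.case = TRUE)
}
```

The argument is the directory created by untar(), which is named after the package.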
Compiled Code in R Base
Certain packages are considered “base” packages. These packages ship with R and their version is locked to the version of R. Examples include base, compiler, stats, and utils. As such, they are not available as separate downloadable packages on CRAN as described above. Rather, they are part of the R source directory tree and the individual package directories can be found under /src/library/.
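To check which packages in a given installation fall into this category, the priority field can be queried. A short sketch using installed.packages(), packageVersion(), and getRversion(); the variable names are illustrative.

```r
# Packages with priority "base" ship with R itself.
base_pkgs <- rownames(installed.packages(priority = "base"))
base_pkgs

# Their version is locked to the version of the running R.
same_version <- packageVersion("stats") == getRversion()
```

Any package that appears in base_pkgs has to be looked up in the R source tree rather than on CRAN.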
Compiled Code in the R Interpreter
If you want to view the code built into the R interpreter, you will need to download and unpack the R sources; or you can view the sources online via the R Subversion repository.
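With the R sources unpacked, the usual next step is to find which C function sits behind an .Internal or .Primitive name. The sketch below assumes that the dispatch table lives in src/main/names.c and that each entry sits on its own line, starting with the quoted R name followed by the C function name; verify both assumptions against the sources you downloaded. find_c_function() is a hypothetical helper.

```r
# Hypothetical helper: given the lines of names.c (assumed layout: one
# table entry per line, beginning {"r_name", c_function, ...), return
# the C function name registered for an R-level name.
find_c_function <- function(names_c_lines, r_name) {
  key <- paste0("{\"", r_name, "\",")
  hits <- names_c_lines[startsWith(trimws(names_c_lines), key)]
  fields <- strsplit(hits, ",", fixed = TRUE)
  # The second comma-separated field holds the C function name.
  trimws(vapply(fields, function(f) f[2], character(1)))
}
```

A typical call would be find_c_function(readLines("src/main/names.c"), "matrix"), run from the top of the unpacked R source tree; an empty result means no entry matched the assumed layout.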