R Arrays

Arrays are among the most important data structures in R for managing large data objects. An array generalizes a matrix by letting the dim attribute describe more than two dimensions. For example, Anderson's iris data can be converted from a rectangular data frame into a 3-dimensional array with 50 rows, 4 columns and 3 levels, one level per species.
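
The original listing is not reproduced here. A minimal sketch of the conversion, assuming the built-in iris data frame and using split() and array() (the names pieces and iris_arr are illustrative), might look like this; it prints the first 2 of the 50 rows for each species:

```r
# Split the four numeric columns by species: three 50 x 4 data frames.
pieces <- split(iris[, 1:4], iris$Species)

# Flatten each piece column by column and stack the three blocks
# along a third dimension.
iris_arr <- array(
  unlist(lapply(pieces, as.matrix), use.names = FALSE),
  dim = c(50, 4, 3),
  dimnames = list(NULL, names(iris)[1:4], levels(iris$Species))
)

dim(iris_arr)       # 50 4 3
iris_arr[1:2, , ]   # first 2 of 50 rows, for every species
```

R also ships a ready-made version of this array as iris3 in the datasets package.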

Creating Arrays in R

To create an array in R, use the array() function, which takes the data and a dim argument giving the size of each dimension. If no data is supplied, the array is filled with NA values. When passing values to array(), supply them as a vector or a matrix.
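
A minimal sketch, assuming the integers 1 to 24 as stand-in data (the original input values are not known) and the dimensions 2 x 4 x 3:

```r
# 24 values folded into 2 rows, 4 columns and 3 levels.
arr <- array(1:24, dim = c(2, 4, 3))
arr

# With no data supplied, every cell is NA.
empty <- array(dim = c(2, 4, 3))
all(is.na(empty))   # TRUE
```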

Values fill the array with the first dimension (rows) varying fastest, so each level is filled column by column. The second dimension (columns) varies next, then the third, and so on until every dimension has been filled.

The dim() function works for arrays as it does for matrices: assigning to dim() sets the dim attribute and folds a vector into an array. For example, if 24 input values are stored in the vector vec, assigning c(2, 4, 3) to dim(vec) produces an array with 2 rows, 4 columns and 3 levels.
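
A short sketch of this, again assuming the values 1 to 24 as placeholder data:

```r
vec <- 1:24

# Setting the dim attribute folds the vector into a 2 x 4 x 3 array.
dim(vec) <- c(2, 4, 3)

is.array(vec)   # TRUE
dim(vec)        # 2 4 3
```

The result holds the same values in the same positions as array(1:24, dim = c(2, 4, 3)).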

To name each level of each dimension, use the dimnames argument to array(). As with matrices, it takes a list holding one vector of names per dimension.
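
A sketch with made-up names (r1, c1, lev1 and so on are purely illustrative) for a 2 x 4 x 3 array of placeholder values:

```r
arr <- array(
  1:24,
  dim = c(2, 4, 3),
  dimnames = list(
    c("r1", "r2"),               # names for dimension 1 (rows)
    c("c1", "c2", "c3", "c4"),   # names for dimension 2 (columns)
    c("lev1", "lev2", "lev3")    # names for dimension 3 (levels)
  )
)

# Names can then be used in place of numeric indices.
arr["r1", "c2", "lev3"]   # 19
```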

Array Subscripting

Extracting and replacing array elements by subscripting works as it does for matrices and relies on the [ ] operator, with one index per dimension. Using the 2 x 4 x 3 example above, the first row of the level 1 data is extracted by fixing the first and third indices and leaving the second empty.
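
A minimal sketch, assuming a 2 x 4 x 3 array filled with the placeholder values 1 to 24:

```r
arr <- array(1:24, dim = c(2, 4, 3))

# Row 1, all columns, level 1; the result drops to a plain vector.
first_row <- arr[1, , 1]
first_row                   # 1 3 5 7

# drop = FALSE keeps the result as a 1 x 4 x 1 array.
dim(arr[1, , 1, drop = FALSE])

# The same subscript on the left-hand side replaces those elements.
arr[1, , 1] <- 0L
```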

More complex subscripting is also possible. An index matrix with one column per dimension can extract, and then replace, several elements at scattered locations of a two-dimensional array; each row of the index matrix addresses one element.
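
A sketch of matrix indexing on a small two-dimensional array (the values and the chosen positions are illustrative):

```r
x <- array(1:20, dim = c(4, 5))

# Each row of idx is one (row, column) position: (1,3), (2,2), (3,1).
idx <- cbind(c(1, 2, 3), c(3, 2, 1))

picked <- x[idx]   # extract the three elements
picked             # 9 6 3

x[idx] <- 0L       # replace the same three elements with zero
```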

Arrays as Contingency Tables

Contingency tables are arrays with one dimension per classifying factor supplied; each cell holds the number of observations that share that combination of factor levels. The table() function creates a contingency table directly, avoiding significant coding.
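
A small sketch with invented observations (the vectors sex, smoker and region are hypothetical data, not from the original):

```r
sex    <- c("F", "M", "F", "M", "F", "M")
smoker <- c("no", "no", "yes", "no", "no", "yes")
region <- c("a", "a", "b", "b", "a", "b")

# Two classifying vectors give a two-dimensional table of counts.
tab2 <- table(sex, smoker)
tab2["F", "no"]   # 2

# Three classifying vectors give a three-dimensional array.
tab3 <- table(sex, smoker, region)
dim(tab3)         # 2 2 2
```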

Outer Product and Infix Flexibility

An important operation involving arrays in R is the outer product. If a and b are two arrays, their outer product is an array whose dimension vector is obtained by concatenating their two dimension vectors (order is important), and whose data vector is obtained by forming all possible products of elements of the data vector of a with those of b (again, order is important).
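
A minimal sketch using the %o% operator, with small illustrative inputs:

```r
a <- array(1:4, dim = c(2, 2))
b <- 1:3

ab <- a %o% b
dim(ab)          # 2 2 3: dimensions of a followed by the length of b
ab[2, 1, 3]      # a[2, 1] * b[3] = 6

# Swapping the operands changes the layout of the result.
dim(b %o% a)     # 3 2 2
```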

The %o% infix operator is shorthand for the outer() function with its default settings. Calling outer() directly is more flexible, because the function applied to each pair of elements can be chosen freely. The full argument structure is outer(X, Y, FUN = "*", ...): the third argument, FUN, is the function to apply, and multiplication is the default.
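
A short sketch showing the argument list and the effect of changing the third argument (the vectors are illustrative):

```r
args(outer)   # function (X, Y, FUN = "*", ...)

a <- 1:3
b <- 1:4

# With the default FUN, outer() and %o% give the same result.
identical(outer(a, b), a %o% b)   # TRUE

# Any other function of two arguments can be named instead.
sums <- outer(a, b, "+")
sums[3, 4]    # 3 + 4 = 7
```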

The multiplication function can be replaced by any arbitrary function of two variables (or its name as a character string).  For example, if we want to evaluate the function:

$$ f(a, b) = \sin(a)\, b^{2} $$

over a regular grid of values with x- and y-coordinates defined by the vectors a and b, then we could use:
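
A minimal sketch, reading the formula as sin(a) multiplied by b squared (an assumption about the original notation) and using hypothetical grid vectors:

```r
# f must be vectorized, because outer() calls it once on the full
# vectors of paired values rather than once per pair.
f <- function(a, b) sin(a) * b^2

a <- seq(0, pi, length.out = 5)   # x-coordinates of the grid
b <- seq(-1, 1, length.out = 3)   # y-coordinates of the grid

z <- outer(a, b, f)
dim(z)   # 5 3: one row per value of a, one column per value of b
```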
