Arrays are among the most important data structures in R for managing large data objects. An array generalizes a matrix by allowing the dim attribute to hold more than two dimensions. The script below converts Anderson’s Iris data from a rectangular data frame to a 3-dimensional array and prints the first 2 of the 50 rows for each species:
# Initialize dimension names and array
# (syntactic R names cannot begin with a digit)
species.names <- c("setosa", "versicolor", "virginica")
measure.names <- c("Sepal.L", "Sepal.W", "Petal.L", "Petal.W")
my.iris <- array(0, dim=c(50, 4, 3), dimnames=list(1:50, measure.names, species.names))

# Define matrices by species
my.setosa <- as.matrix(iris[which(iris[, "Species"] == "setosa"), ][, 1:4])
my.versicolor <- as.matrix(iris[which(iris[, "Species"] == "versicolor"), ][, 1:4])
my.virginica <- as.matrix(iris[which(iris[, "Species"] == "virginica"), ][, 1:4])

# Populate array
my.iris[,,"setosa"] <- my.setosa
my.iris[,,"versicolor"] <- my.versicolor
my.iris[,,"virginica"] <- my.virginica

# Print data
my.iris[1:2, , ]
, , setosa

  Sepal.L Sepal.W Petal.L Petal.W
1     5.1     3.5     1.4     0.2
2     4.9     3.0     1.4     0.2

, , versicolor

  Sepal.L Sepal.W Petal.L Petal.W
1     7.0     3.2     4.7     1.4
2     6.4     3.2     4.5     1.5

, , virginica

  Sepal.L Sepal.W Petal.L Petal.W
1     6.3     3.3     6.0     2.5
2     5.8     2.7     5.1     1.9
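The three per-species assignments can also be written as a loop over the species levels. The following is a minimal sketch of that idea rather than the script's own method; the name iris.arr is illustrative, and the full column names of the built-in iris data frame are kept instead of the shortened labels.

```r
# Build a 50 x 4 x 3 array from the built-in iris data frame.
species <- levels(iris$Species)
iris.arr <- array(NA_real_, dim = c(50, 4, 3),
                  dimnames = list(NULL, names(iris)[1:4], species))

# Each species contributes one 50 x 4 slice along the third dimension.
for (s in species) {
  iris.arr[, , s] <- as.matrix(iris[iris$Species == s, 1:4])
}

dim(iris.arr)                 # 50 4 3
iris.arr[1:2, , "setosa"]     # first two setosa rows
```

The loop relies on each species having exactly 50 rows, which holds for iris but would need checking for other data.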
Creating Arrays in R
To create an array in R, use the array() function, which takes the data and a dim argument as inputs. If no data is supplied, the array is filled with NAs. Pass the values to array() as a vector or a matrix:
> array(c(1:8,11:18,111:118),dim=c(2,4,3))
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    3    5    7
[2,]    2    4    6    8

, , 2

     [,1] [,2] [,3] [,4]
[1,]   11   13   15   17
[2,]   12   14   16   18

, , 3

     [,1] [,2] [,3] [,4]
[1,]  111  113  115  117
[2,]  112  114  116  118
The first dimension (rows) is incremented first, so the values are placed column by column. The second dimension (columns) is incremented second. The third dimension is incremented next, and so on until every dimension has been filled.
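The fill order can be checked by computing where a given element sits in the underlying data vector. The sketch below reuses the 2 x 4 x 3 array from the listing above; the names arr, i, j, k and pos are illustrative.

```r
arr <- array(c(1:8, 11:18, 111:118), dim = c(2, 4, 3))

# Position of element [i, j, k] in the data vector of an array
# with dim = c(2, 4, 3): i + (j - 1) * 2 + (k - 1) * 2 * 4
i <- 2; j <- 3; k <- 2
pos <- i + (j - 1) * 2 + (k - 1) * 2 * 4   # 14

arr[i, j, k] == arr[pos]     # TRUE: both are 16
arrayInd(pos, dim(arr))      # recovers the subscripts 2, 3, 2
```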
The dim() function works for arrays the same way it does for matrices: assigning to dim(x) sets the dim attribute and folds a vector into an array. For example, if the input data above were stored in the vector vec, the same array is created by setting its dim attribute to c(2, 4, 3):
> vec
 [1]   1   2   3   4   5   6   7   8  11  12  13
[12]  14  15  16  17  18 111 112 113 114 115 116
[23] 117 118
> dim(vec) <- c(2, 4, 3)
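A short check, offered as a sketch, that folding a vector with dim() gives the same values in the same positions as array(), and that the product of the dimensions has to match the vector length. The names arr, v and res are illustrative.

```r
vec <- c(1:8, 11:18, 111:118)
arr <- array(vec, dim = c(2, 4, 3))

dim(vec) <- c(2, 4, 3)       # fold the vector in place
all(vec == arr)              # TRUE: same values in the same positions

# The product of the dimensions must equal the number of elements.
res <- tryCatch({ v <- 1:24; dim(v) <- c(2, 4, 4); "ok" },
                error = function(e) "error")
res                          # "error": 2 * 4 * 4 is 32, not 24
```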
To name each level of each dimension, use the dimnames argument to array() or assign to dimnames() on an existing array. Either way, the names are passed as a list with one element per dimension, in the same way as is done for matrices:
# Define array input data
> a <- 1:24
> dim(a) <- c(3,4,2)
> dimnames(a) <- list(1:3, letters[1:4], c("Level 1","Level 2"))
> a
, , Level 1

  a b c  d
1 1 4 7 10
2 2 5 8 11
3 3 6 9 12

, , Level 2

   a  b  c  d
1 13 16 19 22
2 14 17 20 23
3 15 18 21 24
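The same names can be supplied in a single call through the dimnames argument of array(). This is a small sketch of that form; the name a2 is illustrative.

```r
# Data, dimensions and names in one call
a2 <- array(1:24, dim = c(3, 4, 2),
            dimnames = list(1:3, letters[1:4], c("Level 1", "Level 2")))

dimnames(a2)[[3]]          # "Level 1" "Level 2"
a2[2, "c", "Level 2"]      # 20
```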
Array Subscripting
Extraction and replacement of array elements via subscripting work as they do for matrices and rely on the [ ] subscript operator. Using the example above, the first row of the Level 1 data is extracted as follows:
> a[1, , 1]
 a  b  c  d 
 1  4  7 10 
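Two details of subscripting are worth seeing side by side: names can be used in place of positions, and dimensions of extent 1 are dropped unless drop = FALSE is given. The sketch below rebuilds the array a from the previous section; by.name and by.pos are illustrative names.

```r
a <- array(1:24, dim = c(3, 4, 2),
           dimnames = list(1:3, letters[1:4], c("Level 1", "Level 2")))

# Subscripting by name selects the same elements as subscripting by position.
by.name <- a[1, , "Level 1"]
by.pos  <- a[1, , 1]
identical(by.name, by.pos)        # TRUE

# By default the result is simplified to a plain vector.
dim(a[1, , 1])                    # NULL
# drop = FALSE keeps all three dimensions.
dim(a[1, , 1, drop = FALSE])      # 1 4 1
```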
More complex subscripting is also possible. The following example uses an index matrix, in which each row holds the (row, column) position of one element, to extract and then replace elements at several locations of a two-dimensional array:
# Define and print the array
> x <- array(1:20, dim=c(4,5))
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

# Generate and print the index matrix
> y <- array(c(1:3,3:1), dim=c(3,2))
> y
     [,1] [,2]
[1,]    1    3
[2,]    2    2
[3,]    3    1

# Extract array elements using the index matrix
> x[y]
[1] 9 6 3

# Replace array elements using the index matrix
> x[y] <- 0
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    0   13   17
[2,]    2    0   10   14   18
[3,]    0    7   11   15   19
[4,]    4    8   12   16   20
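The index-matrix technique carries over to arrays with more dimensions: the index matrix needs one column per dimension. A minimal sketch for a three-dimensional array follows; z, idx and picked are illustrative names.

```r
z <- array(1:24, dim = c(3, 4, 2))

# Each row of idx is one (row, column, layer) triple.
idx <- rbind(c(1, 1, 1),
             c(3, 4, 1),
             c(2, 2, 2))

picked <- z[idx]
picked               # 1 12 17

z[idx] <- 0L         # replace the same three elements
sum(z == 0)          # 3
```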
Arrays as Contingency Tables
Contingency tables are arrays of counts with one dimension per classifying variable supplied; each cell holds the number of observations with the corresponding combination of levels. The table() function creates a contingency table directly, avoiding a significant amount of coding:
# Define observed data
> pet <- c("cat", "dog", "cat", "dog", "cat", "?")
> food <- c("dry", "dry", "dry", "wet", "wet", "wet")
> eat.time <- c("morn", "morn", "noon", "noon", "night", "night")

# Generate contingency table and print array summary
> table(pet, food, eat.time)
, , morn

    dry wet
?     0   0
cat   1   0
dog   1   0

, , night

    dry wet
?     0   1
cat   0   1
dog   0   0

, , noon

    dry wet
?     0   0
cat   1   0
dog   0   1
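Because the result of table() is an array, the usual array tools apply to it. The sketch below, using the same observed data, subscripts a single cell by level name and collapses the table over eat.time with apply(); tab and pet.food are illustrative names.

```r
pet <- c("cat", "dog", "cat", "dog", "cat", "?")
food <- c("dry", "dry", "dry", "wet", "wet", "wet")
eat.time <- c("morn", "morn", "noon", "noon", "night", "night")

tab <- table(pet, food, eat.time)
dim(tab)                        # 3 2 3
tab["cat", "dry", "morn"]       # 1

# Sum over the third dimension to get a pet-by-food table.
pet.food <- apply(tab, c(1, 2), sum)
pet.food["cat", "dry"]          # 2
```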
Outer Product and Infix Flexibility
An important operation on arrays in R is the outer product. If a and b are two numeric arrays, their outer product is an array whose dimension vector is obtained by concatenating their two dimension vectors (order is important), and whose data vector is obtained by forming all possible products of elements of the data vector of a with those of b (again, order is important). For example:
# Define input matrices
> a <- matrix(1:4, ncol = 2)
> b <- matrix(rep(10, 4), ncol = 2)

# Generate outer product and print
> ba <- b %o% a
> ba
, , 1, 1

     [,1] [,2]
[1,]   10   10
[2,]   10   10

, , 2, 1

     [,1] [,2]
[1,]   20   20
[2,]   20   20

, , 1, 2

     [,1] [,2]
[1,]   30   30
[2,]   30   30

, , 2, 2

     [,1] [,2]
[1,]   40   40
[2,]   40   40

# Interrogate array dimension
> dim(ba)
[1] 2 2 2 2
The same result can be obtained with the outer() function, which generalizes the %o% operator to functions other than multiplication. The full argument structure of outer() is displayed below. The third argument is the function to apply; multiplication is the default:
> outer(b, a, "*")
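A quick check, given as a sketch, that the operator and the function forms agree and that swapping the arguments changes the result. It reuses the matrices a and b defined above.

```r
a <- matrix(1:4, ncol = 2)
b <- matrix(rep(10, 4), ncol = 2)

identical(b %o% a, outer(b, a, "*"))   # TRUE
identical(b %o% a, a %o% b)            # FALSE: argument order matters

# With vectors of different lengths the dimension vectors differ too.
dim((1:2) %o% (1:3))                   # 2 3
dim((1:3) %o% (1:2))                   # 3 2
```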
The multiplication function can be replaced by any arbitrary function of two variables (or its name as a character string). For example, if we want to evaluate the function:
$$f(a, b) = \sin(a)\, b^{2}$$
for every pairing of an element of a with an element of b (when a and b are vectors, this is a regular grid of values with x- and y-coordinates defined by a and b), then we could use:
# Define function
> my.fun <- function(b, a) {sin(a) * b^2}

# Apply the function through outer(), rounding for display
> round(outer(b, a, my.fun))
, , 1, 1

     [,1] [,2]
[1,]   84   84
[2,]   84   84

, , 2, 1

     [,1] [,2]
[1,]   91   91
[2,]   91   91

, , 1, 2

     [,1] [,2]
[1,]   14   14
[2,]   14   14

, , 2, 2

     [,1] [,2]
[1,]  -76  -76
[2,]  -76  -76
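With vector inputs the result is an ordinary two-dimensional grid, which is easier to inspect. The sketch below applies the same formula to two short vectors; x, y, grid.fun and g are illustrative names, and the function is assumed to be vectorized because outer() calls it once on the expanded inputs.

```r
x <- c(0, pi / 2)      # values passed as the first argument
y <- 1:3               # values passed as the second argument

# Vectorized function of two variables
grid.fun <- function(x, y) sin(x) * y^2
g <- outer(x, y, grid.fun)

dim(g)        # 2 3
g[2, 3]       # sin(pi / 2) * 3^2 = 9
```

Row i of g corresponds to x[i] and column j to y[j], so g[i, j] equals grid.fun(x[i], y[j]).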