Vectors are widely used in mathematics, science and engineering. Mathematically, a vector is an element of a vector space. The most common interpretation of a vector is a location in a space of real numbers. Vectors also represent physical quantities that have both magnitude and direction, such as force or wind velocity.

#### Initializing R Vectors

To “initialize” a vector is to declare its existence. A vector is initialized with the vector() function. The function takes two arguments: the first specifies the mode and the second specifies the length:

```r
> x <- vector("numeric", 3)
> x
[1] 0 0 0
```
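As a brief illustration of the mode argument, the sketch below initializes vectors of several modes and prints the default value each one is filled with. The variable names are illustrative only.

```r
# Each mode has its own default fill value
v_num <- vector("numeric", 3)     # filled with 0
v_lgl <- vector("logical", 2)     # filled with FALSE
v_chr <- vector("character", 2)   # filled with empty strings ""
v_lst <- vector("list", 2)        # a list of two NULL elements

print(v_num)
print(v_lgl)
print(v_chr)
print(length(v_lst))
```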

*Creating R Vectors*

Arbitrary values are combined to create a vector with the c() function.

```r
> vec <- c(1, 10, 100, 1000)
> vec
[1]    1   10  100 1000
```

Vector elements can also be entered interactively from the keyboard with the scan() function.

```r
> scan(file = "", what = integer(), n = 5)
1: 1
2: 2
3: 3
4: 4
5: 5
Read 5 items
[1] 1 2 3 4 5
```

Multiple values can be entered on each line. Input stops when an empty line is confirmed with *Enter* or, as in the example above, once n values have been read. Additionally, vectors can be created using the rep() function, or the seq() function, as seen previously.
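For reference, here is a minimal sketch of rep(), seq() and the `:` operator. The variable names are illustrative, and the comments show the values each call produces.

```r
r1 <- rep(c(1, 2), times = 3)     # 1 2 1 2 1 2  (whole vector repeated)
r2 <- rep(c(1, 2), each = 3)      # 1 1 1 2 2 2  (each element repeated)
s1 <- seq(1, 9, by = 2)           # 1 3 5 7 9
s2 <- seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
s3 <- 3:-3                        # 3 2 1 0 -1 -2 -3

print(r1)
print(r2)
print(s1)
print(s2)
print(s3)
```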

Vector creation tips are summarized below:

Function | Description | Example |
---|---|---|
scan() | Reads values, any mode | scan(); scan("my.datafile") |
c() | Combines arbitrary values, any mode | c(1, 3, 2, 6); c("yes", "no") |
rep() | Repeats values, any mode | rep(NA, 5); rep(c(1, 2), 3) |
: | Numeric sequence operator | 1:5; 3:-3 |
seq() | Numeric sequence | seq(10); seq(1, 9, by = 2); seq(0, 1, length.out = 11) |
vector() | Initializes vectors of any mode | vector("complex", 5) |
logical() | Initializes logical vectors | logical(3) |
integer() | Initializes integer vectors | integer(4) |
numeric() | Initializes numeric vectors | numeric(5) |
complex() | Initializes complex vectors | complex(6) |
character() | Initializes character vectors | character(7) |

*Naming Vector Elements*

You can assign names to vector elements to associate specific information, such as case labels or value identifiers, with each value of the vector. To create a vector with named values, you assign the names with the names() function:

```r
> x <- vector("numeric", 5)
> names(x) <- c("a", "b", "c", "d", "e")
> x
a b c d e 
0 0 0 0 0 
```

Alternatively, more complex value arguments can be given:

```r
> num.letters <- letters
> names(num.letters) <- paste("obs", 1:26, sep="")
> num.letters
 obs1  obs2  obs3  obs4  obs5  obs6  obs7  obs8  obs9 obs10 obs11 obs12 obs13
  "a"   "b"   "c"   "d"   "e"   "f"   "g"   "h"   "i"   "j"   "k"   "l"   "m"
obs14 obs15 obs16 obs17 obs18 obs19 obs20 obs21 obs22 obs23 obs24 obs25 obs26
  "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"
```

In the above example, the first 26 integers are converted to character strings by the paste() function, joined to the prefix "obs", and attached to the values as names. The names are printed without quotes. R gives an error if more names than values are supplied; if fewer are supplied, the remaining names are set to NA. Finally, names do not change the class of a vector: a double precision vector has class “numeric”, and logical and character vectors have classes “logical” and “character”, whether or not they are named.
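The behavior of names() with a mismatched number of names can be checked directly. The following is a minimal sketch; the variable names `x` and `res` are illustrative, and tryCatch() is used only so that the script keeps running after the error.

```r
x <- numeric(3)

# Fewer names than values: the remaining names are filled with NA
names(x) <- c("a", "b")
print(names(x))

# More names than values: R signals an error, caught here with tryCatch()
res <- tryCatch({
  names(x) <- c("a", "b", "c", "d")
  "no error"
}, error = function(e) "error")
print(res)

# Names do not change the class of the vector
print(class(x))
```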

**Removing Vector Names**

The names of a data object may be removed by assigning NULL to them with the names() function.

```r
> names(num.letters) <- NULL
```
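As a small sketch of the same idea on a toy vector (the names `x` and `y` are illustrative), base R's unname() returns a copy without names, while assigning NULL removes them from the object itself:

```r
x <- c(a = 1, b = 2, c = 3)

y <- unname(x)       # copy of x without names; x itself is unchanged
print(names(x))      # still "a" "b" "c"

names(x) <- NULL     # removes the names from x itself
print(names(x))      # NULL
print(identical(x, y))
```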

*Vectors With Missing Values*

Vector components are not always known. When an element or value is “*Not Available*,” its place in the vector is held by the special value NA. In general, any operation on an NA yields an NA. The function is.na(x) gives a logical vector of the same size as x with value TRUE for every NA element. The expression sum(is.na(x)) counts the number of missing values, and which(is.na(x)) returns the index positions in the vector where NAs are located:

```r
> z <- c(1:3, NA, 5:6, NA)
> sum(is.na(z))
[1] 2
> which(is.na(z))
[1] 4 7
```
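To illustrate how NA propagates through operations, the sketch below reuses the vector `z` from above. The `na.rm = TRUE` argument of sum() and mean() is the usual way to ignore missing values in a summary. The result variable names are illustrative.

```r
z <- c(1:3, NA, 5:6, NA)

shifted <- z + 1                      # NA elements stay NA
total_all <- sum(z)                   # NA, because z contains NAs
total_known <- sum(z, na.rm = TRUE)   # 17
mean_known <- mean(z, na.rm = TRUE)   # 3.4
na_pos <- which(is.na(z))             # 4 7

print(shifted)
print(total_all)
print(total_known)
print(mean_known)
print(na_pos)
```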

The function na.exclude() removes the NA values from a vector and records the positions of the dropped elements in an "na.action" attribute:

```r
> z1 <- na.exclude(z)
> z1
[1] 1 2 3 5 6
attr(,"na.action")
[1] 4 7
attr(,"class")
[1] "exclude"
```

There is a second kind of “missing” value, produced by numerical computation: the so-called “Not a Number,” or NaN, value. Examples include 0/0 and Inf - Inf (a nonzero number divided by zero gives Inf or -Inf, not NaN). is.na(x) is TRUE for both NA and NaN values; to differentiate, is.nan(x) is TRUE only for NaNs. Missing values are printed as <NA> when character vectors are printed without quotes.
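The difference between NA, NaN and infinite values can be seen in a short sketch (the vector `v` and the result names are illustrative):

```r
v <- c(1, NA, 0/0, 1/0, Inf - Inf)   # 1 NA NaN Inf NaN

na_flags  <- is.na(v)       # FALSE  TRUE  TRUE FALSE  TRUE
nan_flags <- is.nan(v)      # FALSE FALSE  TRUE FALSE  TRUE
fin_flags <- is.finite(v)   #  TRUE FALSE FALSE FALSE FALSE

print(v)
print(na_flags)
print(nan_flags)
print(fin_flags)
```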

*Missing Values in R and Logical Subscripting*

Subscripting with logical operators is used below to eliminate missing values. The simple example is equally valid for application to large data objects:

```r
> # Create y with the non-missing values of x
> y <- x[!is.na(x)]
> # Replace the missing values in x with 0
> x[is.na(x)] <- 0
```
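A self-contained version of the same idea, with made-up values for `x`, shows the result of each step:

```r
x <- c(4, NA, 7, NA, 10)   # illustrative data

y <- x[!is.na(x)]          # keep only the non-missing values: 4 7 10
x[is.na(x)] <- 0           # overwrite the missing values: 4 0 7 0 10

print(y)
print(x)
```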

The following functions provide additional logical tests that are useful when manipulating data structures:

Function | Description |
---|---|
is.finite(); is.infinite(); is.na(); is.nan() | Check for finite, infinite, NA or NaN values and return a logical vector |
is.numeric() | Checks whether an object is numeric and returns a single logical value |
which(); which(is.infinite(x)); which(is.na(x)); which(is.nan(x)) | Return location indices for TRUE values, for infinite values, for missing values and for not-a-number values |
sign() | Elementwise sign of each value (-1, 0 or 1), that is, a comparison with 0 |
compare() | Elementwise comparison with another vector; not a base R function, where sign(x - y) is an equivalent for numeric vectors |
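A few of these tests can be tried on small vectors using base R functions only. This is a minimal sketch: the vectors `a` and `b` are made up, and sign(a - b) is used here as one way to compare two numeric vectors elementwise.

```r
a <- c(-2, 0, 5, Inf, NA)
b <- c( 1, 0, 3,   2,  4)

inf_pos <- which(is.infinite(a))   # 4
na_pos  <- which(is.na(a))         # 5
sgn     <- sign(a)                 # -1 0 1 1 NA
cmp     <- sign(a - b)             # -1 0 1 1 NA: a < b, a == b, a > b
is_num  <- is.numeric(a)           # TRUE (one value for the whole object)

print(inf_pos)
print(na_pos)
print(sgn)
print(cmp)
print(is_num)
```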

*Character Vectors*

Character strings are written as a sequence of characters delimited by quote characters (e.g., "x-values"). They are entered using either double (") or single (') quotes, but are printed using double quotes.

The paste() and paste0() functions are another way to create character vectors. For example:

```r
> labs1 <- paste(c("X", "Y"), 1:10, sep="")
> labs1
 [1] "X1"  "Y2"  "X3"  "Y4"  "X5"  "Y6"  "X7"  "Y8"  "X9"  "Y10"
> labs2 <- paste0(c("X", "Y"), 1:10)
> labs2
 [1] "X1"  "Y2"  "X3"  "Y4"  "X5"  "Y6"  "X7"  "Y8"  "X9"  "Y10"
```
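Two further arguments of paste() are worth a short sketch: `sep` sets the separator placed between the pasted pieces, and `collapse` joins the resulting elements into a single string. The variable names are illustrative.

```r
p1 <- paste("obs", 1:3, sep = "_")            # "obs_1" "obs_2" "obs_3"
p2 <- paste0("X", 1:3)                        # "X1" "X2" "X3" (no separator)
p3 <- paste(c("a", "b", "c"), collapse = "+") # "a+b+c" (one string)

print(p1)
print(p2)
print(p3)
```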

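It is important to stress that a missing value in character data is not the same as an empty string. In R, a character vector holds a missing value as NA (NA_character_), which is distinct both from the empty string "" and from the string "NA".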
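In current versions of R, a character vector can hold NA alongside empty strings, and the two can be told apart with is.na(). A minimal check, using a made-up vector `s`:

```r
s <- c("yes", NA, "", "NA")

missing_flags <- is.na(s)                # FALSE  TRUE FALSE FALSE
empty_flags   <- !is.na(s) & s == ""     # FALSE FALSE  TRUE FALSE

print(missing_flags)
print(empty_flags)
```

Only the second element is missing: the empty string and the literal text "NA" are ordinary character values.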

A list of functions commonly used for manipulating character objects appears below:

Function | Description |
---|---|
abbreviate() | Abbreviate vector elements |
character(); as.character() | Create a character vector; coerce data to character strings |
casefold() | Change characters to all lower or upper case |
charmatch(); match(); pmatch() | Match or partially match a character string |
grep() | Search for patterns in character vectors and return the indices of matching elements. Very useful |
regexpr() | Same as above, but returns the position of the first match within each string and its length |
nchar() | Count the number of characters in a string |
paste(); paste0(); strsplit() | Combine or separate character strings |
substring() | Extract part of a character string |
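Several of these functions can be tried on a small example. The sketch below uses base R functions only, and the vector `s` is made up for illustration.

```r
s <- c("apple pie", "banana", "cherry tart")

n_chars <- nchar(s)                          # 9 6 11
upper   <- casefold(s, upper = TRUE)         # all upper case
first3  <- substring(s, 1, 3)                # "app" "ban" "che"
hits    <- grep("an", s)                     # 2: only "banana" matches
start   <- as.integer(regexpr("an", s))      # -1 2 -1 (-1 means no match)
partial <- pmatch("ban", s)                  # 2: unique partial match
words   <- strsplit(s[1], " ")[[1]]          # "apple" "pie"

print(n_chars)
print(upper)
print(first3)
print(hits)
print(start)
print(partial)
print(words)
```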