R Data Subscripting

Intro to R Data Subscripting

Data subscripting in R is a key “motor skill” for extracting data by row, column or element.  Subscripts can be numeric positions, character names, logical conditions or the results of pattern matching.  Subscripting is also used to assign values to the elements of a data object.

The syntax for data subscripting can take several forms depending on data structure and data object type. Examples are provided below.

Positive Index Values

Positive index values correspond to element positions in a data object.  For example, the built-in letters object is a character vector of the 26 lowercase letters.  The third letter is selected with the positive index 3, written as letters[3].
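As a minimal sketch of positive indexing, the snippet below uses the built-in letters vector mentioned above. The variable names (third, first_three, picked) are chosen only for illustration.

```r
# letters is a built-in character vector: "a", "b", ..., "z"
third <- letters[3]             # a single position
first_three <- letters[1:3]     # a range of positions
picked <- letters[c(1, 5, 9)]   # several arbitrary positions

print(third)
print(first_three)
print(picked)
```

A vector of positions returns the elements in the order the positions are given.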

Negative Index Values

Negative index values correspond to the positions in the data object to be excluded.
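A short sketch of negative indexing, again on the built-in letters vector. The variable names are illustrative only.

```r
# Drop the third letter; the other 25 are kept in order
no_third <- letters[-3]

# Drop a range of positions; the parentheses matter, since -1:5 means -1, 0, ..., 5
no_first_five <- letters[-(1:5)]

print(length(no_third))
print(no_first_five[1])
```

Positive and negative index values cannot be mixed in the same subscript.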

Logical Index Values in R

TRUE values correspond to the elements to keep, and FALSE values to the elements to exclude.
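The following sketch shows logical indexing on a small numeric vector. The vector x and the threshold of 3 are made up for illustration.

```r
x <- c(4, 9, 1, 7, 3)   # hypothetical example data

keep <- x > 3           # logical vector, one TRUE/FALSE per element
big <- x[keep]          # only the elements where keep is TRUE

print(keep)
print(big)
```

The logical vector is usually built from a condition on the data itself, which is why this form is common for filtering.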

Random Index Values in R

Randomly drawn positions within the data object are used to flag which values are to be included (or excluded).  Wrapping the sort() function around the sample() function returns the selected elements in their original order.
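A possible sketch of random indexing with sort() and sample(). The sample size of 5 and the seed are arbitrary assumptions, and the seed is set only so the sketch is repeatable.

```r
set.seed(42)   # arbitrary seed, for repeatability only

# Draw 5 distinct positions from 1..26 and sort them ascending
idx <- sort(sample(length(letters), 5))

picked <- letters[idx]    # the randomly included letters, in alphabetical order
rest <- letters[-idx]     # everything that was not drawn

print(idx)
print(picked)
```

Without sort(), the selected letters would come back in the order they were drawn.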

Empty Indexing in R

Empty indexing, where the subscript is left blank, selects all elements of the data object.
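As an illustration, an empty subscript on a vector returns the whole vector, and an empty position in a matrix subscript selects every row or every column. The matrix m below is a hypothetical example.

```r
all_letters <- letters[]      # same content as letters

m <- matrix(1:6, nrow = 2)    # hypothetical 2 x 3 matrix, filled column by column
row1 <- m[1, ]                # row 1, all columns
col2 <- m[, 2]                # all rows, column 2

print(row1)
print(col2)
```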

Replacement Indexing in R

An additional use of subscripting is to replace data in an existing data object:
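A minimal sketch of replacement by subscript, using a made-up numeric vector. Any of the index types described earlier can appear on the left-hand side of the assignment.

```r
x <- c(10, 20, 30, 40, 50)   # hypothetical example data

x[2] <- 99                   # replace by position
x[x < 15] <- 0               # replace by logical condition

print(x)
```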

Equivalently, the replace() function can be used; it returns a modified copy and leaves the original object unchanged.
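The same kind of replacement can be sketched with replace(), which takes the object, the index and the new values. The vector x is again hypothetical.

```r
x <- c(10, 20, 30, 40, 50)   # hypothetical example data

y <- replace(x, 2, 99)       # replace by position
z <- replace(x, x > 35, 0)   # replace by logical condition

print(y)
print(z)
print(x)                     # x itself is not modified
```

To keep the change, the result has to be assigned back, for example x <- replace(x, 2, 99).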

Appending Data in R

Appending data (e.g. to the end of a vector or via insertion) can also be done with subscripting or using the append() function:
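A short sketch of both approaches on a made-up vector. The after argument of append() gives the position after which the new values are inserted.

```r
x <- c(1, 2, 3)                 # hypothetical example data

x[length(x) + 1] <- 4           # append by subscripting one past the end

y <- append(x, 5)               # append at the end
z <- append(x, 99, after = 2)   # insert after the second element

print(x)
print(y)
print(z)
```

Like replace(), append() returns a new vector, so the result must be assigned to keep it.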

Using Row and Column Names

Good coding practice relies on element names, rather than index numbers, to extract or assign data object elements.  Index numbers can achieve the same task, but subscripting by element name is easier to understand, especially when working with large data objects (e.g. many columns, rows or dimensions).

The following example shows two methods to extract by name:
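The original example is not reproduced here, so the sketch below is an assumption about which two methods are meant: bracket subscripting with a character vector of names, and the select argument of subset(). The data frame df and its columns x, y and z are hypothetical.

```r
# Hypothetical example data frame
df <- data.frame(x = 1:5,
                 y = letters[1:5],
                 z = c(2.5, 3.5, 4.5, 5.5, 6.5))

# Method 1: character vector of column names inside brackets
by_bracket <- df[, c("x", "z")]

# Method 2: subset() with unquoted column names in select
by_subset <- subset(df, select = c(x, z))

print(by_bracket)
print(by_subset)
```

A single column can also be pulled out by name with df$z or df[["z"]], both of which return a vector rather than a data frame.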

Dropping columns by name is achieved as follows:
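One way this might look, assuming the same hypothetical data frame df with columns x, y and z. Both forms below are common base R idioms and are not necessarily the ones used in the original example.

```r
# Hypothetical example data frame
df <- data.frame(x = 1:5,
                 y = letters[1:5],
                 z = c(2.5, 3.5, 4.5, 5.5, 6.5))

# Keep every column whose name is not in the drop list;
# drop = FALSE keeps the result a data frame even if one column remains
dropped_a <- df[, !(names(df) %in% c("y")), drop = FALSE]

# subset() accepts a minus sign in front of unquoted column names
dropped_b <- subset(df, select = -y)

print(names(dropped_a))
print(names(dropped_b))
```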

The subset() function also accepts logical conditions, for example to extract only the rows with x > 3 and only two of the columns.
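A sketch of that call on a hypothetical data frame. The choice of x and y as the two columns is an assumption, as is the data itself.

```r
# Hypothetical example data frame
df <- data.frame(x = 1:5,
                 y = letters[1:5],
                 z = c(2.5, 3.5, 4.5, 5.5, 6.5))

# Rows where x > 3, columns x and y only
res <- subset(df, x > 3, select = c(x, y))

# The same selection written with bracket subscripting
same <- df[df$x > 3, c("x", "y")]

print(res)
```

The second argument of subset() is the row condition and select names the columns, so the two kinds of subscripting are combined in one call.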
