Category Archives: R Data Syntax

Data Expressions in R

Data Expressions

The following table lists R functions used in data expressions to compute basic numerical results for scalars, vectors, and rectangular data objects:

Function | Description | Comment
abs() | Absolute value | n/a
approx() | Linear interpolation of points | n/a
asin(); acos(); atan() | Inverse trigonometric functions | n/a
asinh(); acosh(); atanh() | Inverse hyperbolic functions | n/a
ceiling() | Round up to nearest integer | Impacts stored precision
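
A few of the functions listed above can be tried on a small vector. This is only a minimal sketch; the input values and variable names are made up for illustration.

```r
# Illustrative input values (assumed, not from the original post)
x <- c(-2.5, 0.3, 1.7)

abs_x  <- abs(x)        # absolute values: 2.5 0.3 1.7
ceil_x <- ceiling(x)    # round up: -2 1 2

# Inverse trigonometric and inverse hyperbolic functions work in radians
asin_half  <- asin(0.5)   # equals pi / 6
atanh_half <- atanh(0.5)

# approx() interpolates linearly between the supplied points;
# here we ask for the value halfway between (1, 10) and (3, 30)
interp <- approx(x = c(1, 3), y = c(10, 30), xout = 2)
interp$y                # 20
```
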
Posted in R Basics, R Data Syntax

Data Concatenation and Coercion in R

Data concatenation and coercion are common operations in R.

Data Concatenation

The concatenate c() function is used to combine elements into a vector.

When elements are combined from different classes, the c() function coerces to a common type, which is the type of the returned value:
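
The coercion behavior of c() can be seen by checking the class of the result. The sketch below uses arbitrary example values; the general rule it illustrates is that the result takes the most flexible type present (logical, then integer, then double, then character).

```r
# Logical combined with double: TRUE becomes 1
num_lgl <- c(1.5, TRUE)
class(num_lgl)    # "numeric"

# Integer combined with double: result is double
int_dbl <- c(1L, 2.5)
class(int_dbl)    # "numeric"

# Any character element forces everything to character
mixed <- c(1, "a", TRUE)
mixed             # "1" "a" "TRUE"
class(mixed)      # "character"
```
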

Posted in R Basics, R Data Syntax

Data Formatting in R

There are a number of ways to accomplish data formatting in R.

Data Options in R

R supports a range of data formats and controls.  The options() function accesses the default settings R establishes at start-up.  Session options that can be changed from the command line include:

Each of these options can be changed to modify R's behavior.  For more details on each element, see the HTML help for the options() function.  A practical example is given below.
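
As one possible illustration (not necessarily the example from the original post), the digits option controls how many significant digits R prints. options() returns the previous values, so they can be restored afterwards.

```r
# Change the number of significant digits used for printing;
# options() returns the old settings as a list
old <- options(digits = 4)

short_pi <- format(pi)    # formatted with 4 significant digits
short_pi                  # "3.142"

# Restore the previous setting so the rest of the session is unaffected
options(old)
restored <- getOption("digits")
```
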

Posted in R Basics, R Data Objects, R Data Syntax

Data Infix Operators in R

Intro to Infix Operators in R

Infix operators in R are unique functions and methods that facilitate basic data expressions or transformations.  

Infix refers to the placement of the operator between its operands.  For example, an infix operation is written (a+b), whereas the prefix and postfix forms are (+ab) and (ab+), respectively.  

The types of infix operators used in R include functions for data extraction, arithmetic, sequences, comparison, logical testing, variable assignments, and custom data functions. 
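
The operator families named above can be sketched briefly. The values of a and b and the custom operator name %+% are hypothetical and chosen only for illustration; R lets users define their own infix operators by wrapping a name in percent signs.

```r
a <- 7
b <- 2

total <- a + b          # arithmetic infix operator: 9
same  <- `+`(a, b)      # the same operator called in prefix (function) form
quot  <- a %/% b        # integer division: 3
rem   <- a %% b         # modulo: 1
s     <- 1:5            # sequence operator
gt    <- a > b          # comparison: TRUE
both  <- (a > b) & (b > 5)   # logical test: FALSE

v <- c(x = 1, y = 2)
v["y"]                  # extraction by name

# Custom infix operator (hypothetical name): pastes two strings together
`%+%` <- function(lhs, rhs) paste(lhs, rhs)
joined <- "infix" %+% "operator"
```
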

Posted in R Basics, R Data Syntax

Factors in R

Categorical (i.e. qualitative) data are represented as factors in R.  Factors display as character strings (the level labels) but are stored as integer codes that index the levels.

Creating Factors in R

Factors may be created by using the factor() or as.factor() function:

Note that it is not possible to assign labels to the factor levels within the function as.factor().
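
The difference between the two functions can be sketched as follows. The sizes vector and the S/M/L labels are made-up example data.

```r
sizes <- c("small", "large", "medium", "small")

# as.factor() takes no levels or labels arguments;
# levels are the sorted unique values
f1 <- as.factor(sizes)
levels(f1)          # "large" "medium" "small"

# factor() lets us set the level order and relabel in one call
f2 <- factor(sizes,
             levels = c("small", "medium", "large"),
             labels = c("S", "M", "L"))
levels(f2)          # "S" "M" "L"
as.integer(f2)      # underlying integer codes: 1 3 2 1
```
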

Another way to create factors in R is to split a data object into category groups and then call the factor() function:
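
The original example is not shown here, so the following is a sketch under the assumption that the split is done with base R's cut(). The ages, break points, and labels are invented. Calling factor() on the result drops any category that ended up empty.

```r
ages <- c(5, 17, 30, 64)

# cut() bins a numeric vector into intervals and returns a factor
groups <- cut(ages,
              breaks = c(0, 18, 65, Inf),
              labels = c("child", "adult", "senior"))
levels(groups)      # "child" "adult" "senior" (senior is unused)

# Re-applying factor() keeps only the levels that actually occur
used <- factor(groups)
levels(used)        # "child" "adult"
```
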

Posted in R Data Objects, R Data Syntax

Tidy Data Transformations

Package Dependencies

The core packages for tidy data transformations are listed below:

The dplyr package is by far the most important of the packages in the “tidyverse” for data transformation and manipulation.1  Verb-based functions are one of the advantages of the package.  The syntax is much easier to use when compared to the cryptic syntax of base R.
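
To give a feel for the verb-based style, here is a minimal sketch that assumes the dplyr package is installed and uses R's built-in mtcars data set. The chosen columns and the kpl conversion are illustrative only.

```r
library(dplyr)

cars_summary <- mtcars %>%
  filter(cyl %in% c(4, 6)) %>%          # keep 4- and 6-cylinder cars
  mutate(kpl = mpg * 0.4251) %>%        # miles per gallon to km per litre
  group_by(cyl) %>%                     # one group per cylinder count
  summarise(mean_kpl = mean(kpl),       # average economy per group
            n = n()) %>%                # number of cars per group
  arrange(desc(mean_kpl))               # most economical group first

cars_summary
```

Each verb takes a data frame and returns a data frame, which is what allows the steps to be chained with the pipe.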

Posted in Data Science, R Basics, R Data Syntax, R Programming

R Data Syntax

The following pages introduce the fundamentals of R data syntax for program scripting and quantitative data analysis.  


Posted in R Data Syntax

Regular Expressions (RegEx) in R

In computing, a regular expression (abbreviated regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings.  The patterns are often a combination of text abbreviations, metacharacters, and wild cards.  Regular expressions are used for searching for objects, doing extractions, or find/replace operations.  The use of regular expressions offers convenience and can have a powerful impact on data or object management.

regexp Functions in R

Functions in R for regular expressions include:
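
The list from the original post is not reproduced here, but a few base R functions that accept regular expressions can be sketched on a made-up vector of file names: grepl() for searching, regmatches() with regexpr() for extraction, and sub() and gsub() for find/replace.

```r
files <- c("report_2021.csv", "notes.txt", "report_2022.csv")

# Search: which names end in ".csv"?
is_csv <- grepl("\\.csv$", files)                 # TRUE FALSE TRUE

# Extract: first run of four digits in each name that has one
years <- regmatches(files, regexpr("[0-9]{4}", files))   # "2021" "2022"

# Replace the first match in each string
renamed <- sub("^report_", "summary_", files)

# Replace every match in each string
no_digits <- gsub("[0-9]", "", files)
```
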

Posted in R Data Syntax

Tidy Data Preparation

Package Dependencies

The core packages for tidy data preparation are listed below:

Of these, the tibble and tidyr packages are core to data consistency and preparation.1

Creating tibble Data

The tibble package provides a new data class for storing tabular data, the tibble.  Tibbles inherit from the data.frame class but improve three behaviors:

  • Subsetting – Always returns a new tibble, maintaining data consistency
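
The subsetting difference can be sketched by comparing a data frame with its tibble equivalent. This assumes the tibble package is installed; the id and score columns are invented. Note that the guarantee applies to single-bracket subsetting, while [[ and $ still extract a plain vector.

```r
library(tibble)

df <- data.frame(id = 1:3, score = c(90, 85, 70))
tb <- as_tibble(df)

# Selecting one column with [ , ] silently drops a data frame to a vector
df_col <- df[, "score"]
is.numeric(df_col)      # TRUE

# The same call on a tibble still returns a tibble
tb_col <- tb[, "score"]
is_tibble(tb_col)       # TRUE

# Extracting a vector from a tibble is done explicitly
tb_vec <- tb[["score"]]
```
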
Posted in Data Science, R Basics, R Data Objects, R Data Syntax

Principles of Tidy Data

Introduction to Tidy Data

Despite the enormous amount of data available, there is surprisingly little agreement or guidance on how to create clean, consistent, and easy-to-use data.

Human interaction with data and code can benefit from a few simple principles that facilitate repeatable research and results. The “tidy” approach to data requires that:

  • Data is structured consistently and is reusable;
  • Code flow relies on simple function calls using the pipe;
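
The first two principles can be sketched together. This assumes the tidyr and dplyr packages are installed; the wide table of yearly values is invented. The data is reshaped so that each row is one observation, and every step is a simple function call chained with the pipe.

```r
library(tidyr)
library(dplyr)

# Hypothetical "wide" data: one column per year
wide <- tibble::tibble(country = c("A", "B"),
                       `2020`  = c(10, 20),
                       `2021`  = c(12, 25))

long <- wide %>%
  pivot_longer(cols = c(`2020`, `2021`),   # gather the year columns
               names_to = "year",
               values_to = "value") %>%
  mutate(year = as.integer(year)) %>%      # store year as a number
  arrange(country, year)

long    # one row per country-year pair
```
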
Posted in Data, R Basics, R Data Objects, R Data Syntax, Scientific Computing

Geospatial Data and Mapping in R

I share slides presented at a recent meeting of Doha R users on geospatial data and mapping in R.


Posted in Data Science, R Data Objects, R Data Syntax, R Programming, Spatial Analysis