R Dates and Times

Preprocessing work with R dates and times requires synchronizing data and formats across data sources, so dates and times in R deserve care and attention.

Current Date/Time in R

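The functions date(), Sys.Date() and Sys.time() all report the current system date and time, but in different forms. date() returns a character string, Sys.Date() returns a Date object (date only), and Sys.time() returns a POSIXct date/time: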
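As a minimal sketch (the variable names are only illustrative), the three calls can be compared by inspecting the class of what each returns:

```r
# Capture the current moment with each of the three functions.
now_chr  <- date()       # character string, e.g. "Mon Jan  1 12:00:00 2024"
today    <- Sys.Date()   # Date object (no time of day)
now_time <- Sys.time()   # POSIXct date/time

class(now_chr)    # "character"
class(today)      # "Date"
class(now_time)   # "POSIXct" "POSIXt"
```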

Each of these functions returns a slightly different result, which raises the obvious question: how best to manage and format dates in large data objects?

Classes: R Dates and Times

R provides several options for dealing with date and time data.

  • The as.Date() function handles dates (without times);
  • the package chron handles dates and times, but does not control for time zones; and
  • the POSIXct and POSIXlt classes are ISO compliant data objects that support date/times with time zones and assorted calendar adjustments.

The general rule for date/time data in R is to use the simplest technique possible:

  • For date-only data, as.Date() will usually be the best choice.
  • If you need to handle dates and times without time zone information, the chron package is a good choice.
  • The POSIX classes are especially useful when time zone manipulation is important, and they are a common format for SCADA data and atmospheric science applications.
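The base R choices above can be sketched as follows. The date strings and variable names are illustrative assumptions, and chron is left out because it is an add-on package that may not be installed:

```r
# Date only: as.Date() parses ISO "YYYY-MM-DD" strings by default.
d1 <- as.Date("2005-10-16")

# Other layouts need an explicit format string.
d2 <- as.Date("10/16/2005", format = "%m/%d/%Y")

# Date plus time with an explicit time zone: POSIXct.
dt <- as.POSIXct("2005-10-16 07:51:00", tz = "UTC")

d1 == d2                          # TRUE
format(dt, "%Y-%m-%d %H:%M:%S")   # "2005-10-16 07:51:00"
```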

Creating Date/Time Objects in R

POSIXct vs. POSIXlt in R

Both POSIX classes represent a point in time, measured from an origin of January 1, 1970 00:00 UTC. The primary difference between them is that a POSIXct value is just a number (seconds since the origin), whereas a POSIXlt value is a named list of vectors representing:

  • sec: seconds (0–61)
  • min: minutes (0–59)
  • hour: hours (0–23)
  • mday: day of the month (1–31)
  • mon: months after the first of the year (0–11)
  • year: years since 1900, even though the origin is defined to be 1970!
  • wday: day of the week (0–6), starting on Sunday
  • yday: day of the year (0–365)
  • isdst: Daylight Saving Time flag. Positive if in force, zero if not, negative if unknown.
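The contrast between the two classes can be seen by stripping the class attribute with unclass(). This is a small sketch with an arbitrary example timestamp; the component names checked below are sec, min, hour, mday, mon, year, wday, yday and isdst, and recent versions of R may store additional fields:

```r
ct <- as.POSIXct("2005-10-16 07:51:00", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct is a single number: seconds since 1970-01-01 00:00 UTC.
as.numeric(ct)          # 1129449060

# POSIXlt is a list of broken-down components.
is.list(unclass(lt))    # TRUE
names(unclass(lt))      # includes "sec", "min", "hour", "mday", "mon", "year", ...

lt$year                 # 105, i.e. 2005 - 1900
lt$mon                  # 9, i.e. October counted from 0
```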

A convenient way to exploit this vector data to create POSIX compliant dates is the ISOdate() function:
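A short sketch of ISOdate(), which takes year, month and day (plus optional hour, minute and second) and returns a POSIXct value. The hour defaults to 12 and the time zone to "GMT"; the dates used here are illustrative:

```r
# Year, month, day only: the time defaults to 12:00:00 GMT.
d_noon <- ISOdate(2005, 10, 16)

# Explicit hour, minute and second.
d_full <- ISOdate(2005, 10, 16, 7, 51, 0)

class(d_noon)                         # "POSIXct" "POSIXt"
format(d_noon, "%Y-%m-%d %H:%M:%S")   # "2005-10-16 12:00:00"
format(d_full, "%Y-%m-%d %H:%M:%S")   # "2005-10-16 07:51:00"
```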

Formatting Date/Times in R

R's capabilities for formatting date/times are typical of R: lots of flexibility! In practice, date/time objects in R are manipulated in the same way they would be in a C program. The two most important functions in this regard are:

  • strptime() for parsing input dates, and
  • strftime() for formatting output dates.

Both of these functions use a variety of formatting codes, as listed in the table below. For example, dates in many logfiles are printed in a format like “16/Oct/2005:07:51:00”. strptime() returns a POSIXlt object, which can be converted to POSIXct with as.POSIXct() if needed. To read a date in this format, the following call to strptime() could be used:
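A sketch of parsing the logfile timestamp quoted above. It assumes an English-language locale, since %b matches month abbreviations in the current locale, and it assumes the timestamp is in UTC, which the logfile format itself does not say:

```r
log_stamp <- "16/Oct/2005:07:51:00"

# Day / abbreviated month / year : hour : minute : second.
parsed <- strptime(log_stamp, format = "%d/%b/%Y:%H:%M:%S", tz = "UTC")
class(parsed)        # "POSIXlt" "POSIXt"

# Convert to POSIXct when a compact numeric representation is preferred.
parsed_ct <- as.POSIXct(parsed)
format(parsed_ct, "%Y-%m-%d %H:%M:%S")   # "2005-10-16 07:51:00"
```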

For pretty printing, the format() function will recognize the class of your input date, and perform any necessary conversions before calling strftime(), so strftime() rarely needs to be called directly. For example:
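A brief sketch of format() applied to both a POSIXct and a Date object. The timestamp is illustrative, and only numeric format codes are used so that the output does not depend on the locale:

```r
when <- as.POSIXct("2005-10-16 07:51:00", tz = "UTC")

format(when, "%Y/%m/%d")           # "2005/10/16"
format(when, "%H:%M on day %j")    # "07:51 on day 289"

# format() dispatches on the class, so Date objects work the same way.
format(as.Date("2005-10-16"), "%d.%m.%Y")   # "16.10.2005"
```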

All the available format codes are listed below:

%a  Abbreviated weekday name in the current locale. Also matches the full name on input.
%A  Full weekday name in the current locale. Also matches the abbreviated name on input.
%b  Abbreviated month name in the current locale. Also matches the full name on input.
%B  Full month name in the current locale. Also matches the abbreviated name on input.
%c  Date and time. Locale-specific on output, "%a %b %e %H:%M:%S %Y" on input.
%d  Day of the month as a decimal number (01-31).
%H  Hours as a decimal number (00-23).
%I  Hours as a decimal number (01-12).
%j  Day of the year as a decimal number (001-366).
%m  Month as a decimal number (01-12).
%M  Minute as a decimal number (00-59).
%p  AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales.
%S  Second as a decimal number (00-61), allowing for up to two leap seconds (POSIX-compliant implementations ignore leap seconds).
%U  Week of the year as a decimal number (00-53) using Sunday as the first day of the week.
%w  Weekday as a decimal number (0-6, Sunday is 0).
%W  Week of the year as a decimal number (00-53) using Monday as the first day of the week.
%x  Date. Locale-specific on output, "%y/%m/%d" on input.
%X  Time. Locale-specific on output, "%H:%M:%S" on input.
%y  Year without century (00-99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 (the behaviour specified by the 2008 POSIX standard; this could change).
%Y  Year with century. Note: the original Gregorian calendar has no year zero, but ISO 8601:2004 defines it as valid (interpreted as 1 BC).
%z  Signed offset in hours and minutes from UTC, so +0300 is 3 hours ahead of UTC.
%Z  Output only. Time zone as a character string (empty if not available).
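Two of these codes are easy to misread, so a small sketch may help: the century rule for %y and the sign convention for %z. The input strings are made up for illustration:

```r
# %y: two-digit years 00-68 map to 20xx, 69-99 map to 19xx.
y68 <- as.Date("01/02/68", format = "%m/%d/%y")
y69 <- as.Date("01/02/69", format = "%m/%d/%y")
format(y68, "%Y")   # "2068"
format(y69, "%Y")   # "1969"

# %z: +0300 means local time is 3 hours ahead of UTC,
# so 07:51 at +0300 is 04:51 in UTC.
off <- as.POSIXct("2005-10-16 07:51:00 +0300",
                  format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC")
format(off, "%H:%M")   # "04:51"
```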

Some Common Date/Time Manipulations

The individual components of a POSIX date/time object can be extracted by first converting to POSIXlt if necessary, and then accessing the components directly:
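A sketch of component extraction, using an illustrative timestamp. Note the offsets: mon counts from 0 and year counts from 1900:

```r
stamp <- as.POSIXct("2005-10-16 07:51:00", tz = "UTC")
parts <- as.POSIXlt(stamp)   # convert so the components are accessible

parts$hour          # 7
parts$min           # 51
parts$mday          # 16
parts$mon + 1       # 10 (October)
parts$year + 1900   # 2005
parts$wday          # 0 (Sunday)
```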

Many of the statistical summary functions, like mean(), min(), max(), etc., are able to handle date objects. For example:
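A minimal sketch with a made-up vector of three dates, showing that the results stay in the Date class:

```r
dates <- as.Date(c("2005-01-01", "2005-01-03", "2005-01-08"))

min(dates)     # "2005-01-01"
max(dates)     # "2005-01-08"
range(dates)   # "2005-01-01" "2005-01-08"
mean(dates)    # "2005-01-04" (average offset is 3 days from the first date)
```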

Once the dates are properly read into R, a variety of calculations can be performed:

If two times are subtracted, R returns the result as a difftime object representing the time difference. For example, New York City experienced a major blackout on July 13, 1977, and another on August 14, 2003. To calculate the time interval between the two blackouts, we can simply subtract the two dates, using any of the classes that have been introduced:
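A sketch of the subtraction, using both Date and POSIXct. The first blackout is taken here as July 13, 1977; that date is an assumption of this example:

```r
b1 <- as.Date("1977-07-13")   # first blackout (assumed date, see above)
b2 <- as.Date("2003-08-14")   # second blackout

gap <- b2 - b1
gap          # Time difference of 9528 days
class(gap)   # "difftime"

# The same subtraction with POSIXct values in a fixed time zone.
gap_ct <- as.POSIXct("2003-08-14", tz = "UTC") -
          as.POSIXct("1977-07-13", tz = "UTC")
gap_ct       # Time difference of 9528 days
```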

If an alternative unit of time is desired, the difftime() function can be called with its optional units= argument, which accepts any of the following values: “auto”, “secs”, “mins”, “hours”, “days”, or “weeks”. So to see the difference between blackouts in terms of weeks, we can use:
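A sketch of the difftime() call with units = "weeks". As an assumption of this example, the first blackout is dated July 13, 1977:

```r
gap_weeks <- difftime(as.Date("2003-08-14"), as.Date("1977-07-13"),
                      units = "weeks")

gap_weeks          # about 1361.14 weeks (9528 days / 7)
units(gap_weeks)   # "weeks"
```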

difftime() values can be manipulated like ordinary numeric variables; arithmetic performed with these values will retain the original units.
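A small sketch of arithmetic on a difftime value, with illustrative dates. Scaling keeps the units, and as.numeric() with a units= argument converts explicitly when a plain number is needed:

```r
gap <- as.Date("2003-08-14") - as.Date("1977-07-13")   # 9528 days

half <- gap / 2
half          # Time difference of 4764 days
units(half)   # "days"

twice <- gap * 2
units(twice)  # "days"

# Explicit conversion to a plain number in another unit.
as.numeric(gap, units = "weeks")   # 9528 / 7
```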

Date/Time Sequences

The by= argument to the seq() function can be specified either as a difftime value, or as a character string naming a unit of time such as “day”, “week”, “month” or “year”, making it very easy to generate sequences of dates. For example, to generate a vector of ten dates, starting on July 4, 1976, we could use:
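A sketch of the sequence described above, followed by two variations (weekly steps and a difftime step) that are added only for illustration:

```r
start <- as.Date("1976-07-04")

# Ten consecutive days starting on July 4, 1976.
ten_days <- seq(start, by = "day", length.out = 10)
ten_days[10]    # "1976-07-13"

# Ten dates one week apart.
ten_weeks <- seq(start, by = "week", length.out = 10)

# A difftime step: every second day.
every_other <- seq(start, by = as.difftime(2, units = "days"), length.out = 3)
every_other     # "1976-07-04" "1976-07-06" "1976-07-08"
```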
