This tutorial makes use of the following R package(s): lubridate

Date values can be represented in tables as numbers or characters. But to be properly interpreted by R as dates, date values should be converted to an R date object class or a POSIXct/POSIXt object class. R provides many facilities to convert and manipulate dates and times, but a package called lubridate makes working with dates/times much easier.

Creating date/time objects

From complete date strings

You can convert many representations of date and time to date objects. For example, let’s create a vector of dates represented as month/day/year character strings:

x <- c("06/23/2013", "06/30/2013", "07/12/2014")
class(x)
[1] "character"

At this point, R treats the vector x as characters. To force R to interpret these as dates, use lubridate’s mdy function. mdy will convert date strings where the date elements are ordered as month, day and year.

library(lubridate)
x.date <- mdy(x)
class(x.date)
[1] "Date"

Note that using the mode or typeof functions will not help us determine if the object is an R date object. This is because a date is stored as a numeric (double) internally. Use the class function instead as shown in the above code chunk.
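The distinction can be checked directly. The short sketch below uses base R's as.Date (rather than mdy) only to keep it free of package dependencies; the variable name d is illustrative.

```r
# A Date is a double under the hood: the number of days since 1970-01-01.
d <- as.Date("2013-06-23")

typeof(d)   # "double"  (storage type, says nothing about dates)
mode(d)     # "numeric"
class(d)    # "Date"    (the attribute R uses to treat the value as a date)
unclass(d)  # 15879, i.e. days elapsed since 1970-01-01
```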

If you need to specify the time zone, add the parameter tz=. For example, to specify Eastern Standard Time, type:

x.date <- mdy(x, tz="EST")
x.date
[1] "2013-06-23 EST" "2013-06-30 EST" "2014-07-12 EST"

R does not maintain its own list of time zone names; instead, it relies on the operating system’s naming convention. To list the supported time zone names for your particular R environment, type:

OlsonNames()

For example, to select a time zone that observes Daylight Saving Time, type tz="EST5EDT".

Adding the time zone converts the R Date object into a date-time object that carries two classes, POSIXct and POSIXt.

class(x.date)
[1] "POSIXct" "POSIXt" 
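To see what the choice of zone name changes in practice, the following sketch parses the same wall-clock time under a fixed-offset zone and under a zone that observes daylight saving. It assumes both "EST" and "EST5EDT" appear in the output of OlsonNames() on your system; the object names fixed and shifts are made up for the illustration.

```r
library(lubridate)

# Same wall-clock time in June, parsed under two zone names.
fixed  <- mdy_hm("06/23/2013 03:15", tz = "EST")      # fixed UTC-5 offset
shifts <- mdy_hm("06/23/2013 03:15", tz = "EST5EDT")  # observes daylight saving

format(fixed,  "%Z %z")   # "EST -0500"
format(shifts, "%Z %z")   # "EDT -0400"

# The two values are one hour apart in absolute time.
difftime(fixed, shifts, units = "hours")
```

Because "EST" never switches to daylight time, summer timestamps recorded on a local clock are usually better served by a zone such as "EST5EDT".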

Turning our attention back to the function mdy, note that it can read in date formats that use different delimiters so that mdy("06/23/2013"), mdy("06-23-2013") and mdy("06.23.2013") are parsed exactly the same so long as the order remains month/day/year.
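A quick way to convince yourself of this is to compare the parsed results with identical. The object names below are illustrative.

```r
library(lubridate)

slash <- mdy("06/23/2013")
dash  <- mdy("06-23-2013")
dot   <- mdy("06.23.2013")

identical(slash, dash)  # TRUE
identical(slash, dot)   # TRUE
slash                   # "2013-06-23"
```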

For different month/day/year arrangements, other lubridate functions need to be used:

| Function | Date format |
| --- | --- |
| dmy() | day/month/year |
| ymd() | year/month/day |
| ydm() | year/day/month |
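As a small sketch of how these functions relate, the same calendar date is written below in four element orders and parsed with the matching function. The last call shows what to expect when the function does not match the string; the object names are illustrative.

```r
library(lubridate)

# One calendar date written in four different element orders.
d1 <- mdy("06/23/2013")  # month/day/year
d2 <- dmy("23/06/2013")  # day/month/year
d3 <- ymd("2013/06/23")  # year/month/day
d4 <- ydm("2013/23/06")  # year/day/month
all(d1 == c(d2, d3, d4)) # TRUE

# Wrong function for the string: "23" cannot be a month, so parsing
# fails and the result is NA (with a warning) rather than a guess.
bad <- suppressWarnings(mdy("23/06/2013"))
is.na(bad)               # TRUE
```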

If your data contains both date and time in a “month/day/year hour:minutes:seconds” format use the mdy_hms function.

x <- c("06/23/2013 03:45:23", "07/30/2013 14:23:00", "08/12/2014 18:01:59")
x.date <- mdy_hms(x, tz="EST")
x.date
[1] "2013-06-23 03:45:23 EST" "2013-07-30 14:23:00 EST" "2014-08-12 18:01:59 EST"

The characters _h, _hm or _hms can be appended to any of the four date function names described earlier to accommodate time formats. A few examples follow:

mdy_h("6/23/2013 3", tz="EST") 
[1] "2013-06-23 03:00:00 EST"
dmy_hm("23/6/2013 3:15", tz="EST5EDT")
[1] "2013-06-23 03:15:00 EDT"
ymd_hms("2013/06/23 3:15:7", tz="UTC")
[1] "2013-06-23 03:15:07 UTC"

Note that adding a time element to the date object will also create POSIXct and POSIXt object classes.

class(x.date)
[1] "POSIXct" "POSIXt" 

From separate date elements

If your data table splits the date elements into separate vector objects or columns, use the paste function to combine the elements into a single date string before passing it to one of the lubridate functions. Let’s look at an example:

dat1 <- read.csv("http://mgimond.github.io/ES218/Data/CO2.csv")
head(dat1)
  Year Month Average Interpolated  Trend Daily_mean
1 1959     1  315.62       315.62 315.70         -1
2 1959     2  316.38       316.38 315.88         -1
3 1959     3  316.71       316.71 315.62         -1
4 1959     4  317.72       317.72 315.56         -1
5 1959     5  318.29       318.29 315.50         -1
6 1959     6  318.15       318.15 315.92         -1

The CO2 dataset has the date split across two columns: Year and Month (both loaded as integers). You can combine the columns into a character string using the paste function. For example, if we want to create a “Year-Month” string as in 1959-10, we could type:

paste(dat1$Year,dat1$Month, sep="-")

The above example uses three arguments: the two objects that are pasted together (i.e. Year and Month) and the sep="-" parameter which tells R to fill the gap between both objects with a dash - (by default, paste would have added spaces thus creating strings in the form of 1959 10).
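The effect of the sep argument is easy to see on two short, made-up vectors (yr and mo are hypothetical stand-ins for the Year and Month columns):

```r
yr <- c(1959L, 1959L, 1960L)
mo <- c(9L, 10L, 1L)

paste(yr, mo)              # "1959 9"  "1959 10" "1960 1"  (default sep is a space)
paste(yr, mo, sep = "-")   # "1959-9"  "1959-10" "1960-1"
paste0(yr, "-", mo)        # same result, using paste0 (no separator)
```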

Older versions of lubridate do not have a function along the lines of ym to convert year-month strings (recent releases add ym(), which assigns the first day of the month), so we add an artificial day of the month to the string. We’ll choose the 15th day of the month, as in:

paste(dat1$Year, dat1$Month, "15", sep="-")

And finally, we’ll add a new column called Date to the dat1 object, and fill that column with the newly created date string wrapped with the ymd function:

dat1$Date <- ymd( paste(dat1$Year, dat1$Month, "15", sep="-") )
head(dat1)
  Year Month Average Interpolated  Trend Daily_mean       Date
1 1959     1  315.62       315.62 315.70         -1 1959-01-15
2 1959     2  316.38       316.38 315.88         -1 1959-02-15
3 1959     3  316.71       316.71 315.62         -1 1959-03-15
4 1959     4  317.72       317.72 315.56         -1 1959-04-15
5 1959     5  318.29       318.29 315.50         -1 1959-05-15
6 1959     6  318.15       318.15 315.92         -1 1959-06-15
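If the CSV file is not reachable, the same steps can be tried on a small hand-built data frame. The object toy below is a hypothetical stand-in that copies the first three rows shown above; it is not part of the original dataset workflow.

```r
library(lubridate)

# Hypothetical stand-in for the first rows of the CO2 table.
toy <- data.frame(Year    = c(1959L, 1959L, 1959L),
                  Month   = c(1L, 2L, 3L),
                  Average = c(315.62, 316.38, 316.71))

# Build "year-month-15" strings, then parse them as year/month/day.
toy$Date <- ymd(paste(toy$Year, toy$Month, "15", sep = "-"))

toy$Date          # "1959-01-15" "1959-02-15" "1959-03-15"
class(toy$Date)   # "Date"
```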

To confirm that the Date column is indeed stored as a date object, type:

str(dat1)
'data.frame':   671 obs. of  7 variables:
 $ Year        : int  1959 1959 1959 1959 1959 1959 1959 1959 1959 1959 ...
 $ Month       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Average     : num  316 316 317 318 318 ...
 $ Interpolated: num  316 316 317 318 318 ...
 $ Trend       : num  316 316 316 316 316 ...
 $ Daily_mean  : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ Date        : Date, format: "1959-01-15" "1959-02-15" "1959-03-15" "1959-04-15" ...

Or you could type:

class(dat1$Date)
[1] "Date"

Note the Date class designation (as opposed to the POSIX... class) since we did not add a timezone or time component to the date object.

Extracting date information

If you want to get the day of the week from a date vector, use the wday function.

wday(x.date)
[1] 1 3 3

If you want the day of the week displayed as its abbreviated name, add the label=TRUE parameter.

wday(x.date, label=TRUE)
[1] Sun  Tues Tues
Levels: Sun < Mon < Tues < Wed < Thurs < Fri < Sat

You’ll note that this last call returns not only the day of the week for each element (Sun, Tues, Tues) but also the levels and their order: the result is an ordered factor.
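The ordering is what allows weekday values to be compared with one another. The sketch below assumes lubridate's default setting in which the week starts on Sunday; the object names d and w are illustrative.

```r
library(lubridate)

d <- ymd(c("2013-06-23", "2013-07-30", "2014-08-12"))
w <- wday(d, label = TRUE)

is.ordered(w)   # TRUE: the labels form an ordered factor
as.integer(w)   # 1 3 3, the same codes that wday(d) returns
w[2] > w[1]     # TRUE: Tuesday sorts after Sunday in the level order
```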

The following table lists functions that extract different elements of a date object.

| Function | Extracted element |
| --- | --- |
| hour() | Hour of the day |
| minute() | Minute of the hour |
| day() | Day of the month |
| yday() | Day of the year |
| month() | Month of the year |
| year() | Year |
| tz() | Time zone |
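Applied to a single date-time value, the functions in the table behave as sketched below. The object name dt is illustrative, and the "EST" zone name is assumed to be available on your system.

```r
library(lubridate)

dt <- mdy_hms("06/23/2013 03:45:23", tz = "EST")

hour(dt)    # 3
minute(dt)  # 45
day(dt)     # 23
yday(dt)    # 174
month(dt)   # 6
year(dt)    # 2013
tz(dt)      # "EST"

# month() also accepts label = TRUE and then returns the month name
# as an ordered factor (the label text depends on the session locale).
month(dt, label = TRUE)
```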

Operating on dates

You can apply certain operations to dates as you would to numbers. For example, to get the number of days between the first and third elements of the vector x.date, type the following:

(x.date[3] - x.date[1]) / ddays()
[1] 415.5949

To get the number of weeks between both dates:

(x.date[3] - x.date[1]) / dweeks()
[1] 59.37069

Likewise, you can get the number of minutes between dates by dividing by dminutes() and the number of years by dividing by dyears().
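A minimal sketch of those other units follows, using the same two timestamps as the first and third elements of x.date. The names t1, t2 and span are illustrative, and the last line is a base R cross-check that does not rely on lubridate durations.

```r
library(lubridate)

t1   <- mdy_hms("06/23/2013 03:45:23", tz = "EST")
t2   <- mdy_hms("08/12/2014 18:01:59", tz = "EST")
span <- t2 - t1                # a difftime object

span / dminutes()              # elapsed minutes (about 598456.6)
span / dyears()                # elapsed years, measured in fixed-length duration years

# Base R cross-check of the day count.
as.numeric(difftime(t2, t1, units = "days"))   # about 415.5949
```

Durations have a fixed length in seconds, so the year count is an approximation that ignores the calendar's varying month and year lengths.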

You can also apply Boolean operations on dates. For example, to find which elements of x.date fall strictly between the 11th and 24th day of any month, type:

(mday(x.date) > 11) & (mday(x.date) < 24)
[1]  TRUE FALSE  TRUE

If you want the command to return just the dates that satisfy this query, pass the Boolean operation as an index to the x.date vector:

x.date[ (mday(x.date) > 11) & (mday(x.date) < 24) ]
[1] "2013-06-23 03:45:23 EST" "2014-08-12 18:01:59 EST"
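The same indexing pattern works with other conditions, such as a particular year or a date range. The sketch below uses a plain Date vector; the names d, lo and hi are hypothetical.

```r
library(lubridate)

d <- mdy(c("06/23/2013", "06/30/2013", "07/12/2014"))

# Keep only the 2013 dates.
d[year(d) == 2013]        # "2013-06-23" "2013-06-30"

# Keep dates inside a closed date range.
lo <- ymd("2013-06-25")
hi <- ymd("2014-12-31")
d[d >= lo & d <= hi]      # "2013-06-30" "2014-07-12"
```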

Formatting date objects

You can create a character vector from a date object. This is useful if you want to annotate plots with dates or include date values in reports. For example, to convert the date object x.date to a “Month_name Year” character format, use the format function (recent R versions, 4.3.0 and later, no longer apply a format argument passed to as.character):

format(x.date, format="%B %Y")
[1] "June 2013"   "July 2013"   "August 2014"

The format= parameter accepts many different date/time format codes listed in the following table (note the case!).

| Format code | Description | Example |
| --- | --- | --- |
| %a | Abbreviated weekday name | Sun, Tue, Tue |
| %A | Full weekday name | Sunday, Tuesday, Tuesday |
| %m | Month as decimal number | 06, 07, 08 |
| %b | Abbreviated month name | Jun, Jul, Aug |
| %B | Full month name | June, July, August |
| %c | Date and time, locale-specific | Sun Jun 23 03:45:23 2013, Tue Jul 30 14:23:00 2013, Tue Aug 12 18:01:59 2014 |
| %d | Day of the month as decimal number | 23, 30, 12 |
| %H | Hours as decimal number (00 to 23) | 03, 14, 18 |
| %I | Hours as decimal number (01 to 12) | 03, 02, 06 |
| %p | AM/PM indicator in the locale | AM, PM, PM |
| %j | Day of year as decimal number | 174, 211, 224 |
| %M | Minute as decimal number (00 to 59) | 45, 23, 01 |
| %S | Second as decimal number | 23, 00, 59 |
| %U | Week of the year, with the first Sunday starting week 1 | 25, 30, 32 |
| %W | Week of the year, with the first Monday starting week 1 | 24, 30, 32 |
| %w | Weekday as decimal number (Sunday = 0) | 0, 2, 2 |
| %x | Date (locale-specific) | 6/23/2013, 7/30/2013, 8/12/2014 |
| %X | Time (locale-specific) | 3:45:23 AM, 2:23:00 PM, 6:01:59 PM |
| %Y | 4-digit year | 2013, 2013, 2014 |
| %y | 2-digit year | 13, 13, 14 |
| %Z | Abbreviated time zone | EST, EST, EST |
| %z | Signed offset from UTC (hours and minutes) | -0500, -0500, -0500 |
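Format codes can be combined freely in one format string. The sketch below uses base R's as.POSIXct and format on a single timestamp; the object name dt is illustrative and the "EST" zone name is assumed to be available. Only the locale-independent results are annotated, since names and AM/PM markers follow the session locale.

```r
dt <- as.POSIXct("2013-06-23 03:45:23", tz = "EST")

format(dt, "%Y-%m-%d")   # "2013-06-23"
format(dt, "%H:%M:%S")   # "03:45:23"
format(dt, "%j")         # "174"
format(dt, "%U")         # "25"
format(dt, "%z")         # "-0500"

# Weekday names, month names and AM/PM markers depend on the locale,
# so the exact text of this result may differ between systems.
format(dt, "%A %d %B %Y, %I:%M %p")
```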