lubridate ymd() in R: Parse Dates From Strings
The ymd() function in lubridate parses date strings of the form "year-month-day" into proper Date objects. Sister functions mdy(), dmy(), ymd_hms() handle other orderings and date-times.
ymd("2024-01-15") # year-month-day
mdy("01/15/2024") # month-day-year
dmy("15-01-2024") # day-month-year
ymd_hms("2024-01-15 14:30:00") # date + time
ymd_hm("2024-01-15 14:30") # date + time (no seconds)
ymd_hms(x, tz = "America/New_York") # with timezone
parse_date_time(x, orders = c("ymd","mdy")) # multi-format
Need explanation? Read on for examples and pitfalls.
What ymd() does in one sentence
ymd("2024-01-15") parses the input string and returns a Date object. lubridate auto-detects common separators (-, /, ., or a space) and is flexible about month and day padding (1 or 01 both work).
This is one of the simplest and most commonly used date parsing functions in R. Use it whenever you have a string that represents a year-first date.
Syntax
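ymd(x, tz = NULL, locale = ...). x can be a single string, a character vector, or a factor. With the default tz = NULL the result is a Date; supplying tz returns a POSIXct instead.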
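A minimal sketch of how the arguments behave, assuming lubridate is installed. The variable names are illustrative only.

```r
library(lubridate)

# A character vector is parsed element by element.
dates <- ymd(c("2024-01-15", "2024-02-29"))
class(dates)    # "Date"

# Supplying tz changes the return type from Date to POSIXct.
stamped <- ymd("2024-01-15", tz = "UTC")
class(stamped)  # "POSIXct" "POSIXt"
```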
Compared with as.Date(): as.Date("01/15/2024") errors without a format string, while mdy("01/15/2024") just works. Use lubridate for parsing; reserve base R for cases where you absolutely need zero dependencies.
Five common patterns
1. Year-first date
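A short illustration of the year-first case, assuming lubridate is installed. The alternative spellings are the ones this article lists as equivalent.

```r
library(lubridate)

d <- ymd("2024-01-15")
d         # "2024-01-15"
class(d)  # "Date"

# Different separators and padding parse to the same date.
same <- ymd(c("2024.01.15", "2024-1-15", "20240115"))
all(same == d)
```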
2. Different orderings
mdy() for US-style; dmy() for European-style. Both accept text or numeric months.
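As a sketch, the same calendar day written in both conventions. The month-name calls assume an English locale, since month names are matched against the session locale.

```r
library(lubridate)

us <- mdy("01/15/2024")   # month-day-year
eu <- dmy("15-01-2024")   # day-month-year
us == eu                  # TRUE: both are 15 January 2024

# Text months (assumes an English locale).
mdy("January 15, 2024")
dmy("15 Jan 2024")
```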
3. Date + time
ymd_hms() (with the variants ymd_hm() and ymd_h() for partial times) returns a POSIXct object instead of a Date.
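A minimal example of the date-time parsers, assuming lubridate is installed. hour() and minute() are lubridate accessors used here only to inspect the result.

```r
library(lubridate)

ts <- ymd_hms("2024-01-15 14:30:00")
class(ts)   # "POSIXct" "POSIXt"
tz(ts)      # "UTC"

# Partial times use the shorter variants.
no_seconds <- ymd_hm("2024-01-15 14:30")
hour_only  <- ymd_h("2024-01-15 14")

hour(ts)    # 14
minute(ts)  # 30
```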
4. With timezone
tz = "America/New_York" parses the time as Eastern time. Default is UTC. Use OlsonNames() to list available timezones.
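To see what the tz argument changes, the sketch below parses the same clock reading twice, once with the default and once as New York time. It assumes the system time zone database includes "America/New_York".

```r
library(lubridate)

utc <- ymd_hms("2024-01-15 14:30:00")                            # default: UTC
ny  <- ymd_hms("2024-01-15 14:30:00", tz = "America/New_York")   # Eastern time

# Same clock reading, different instants: New York is 5 hours behind UTC in January.
difftime(ny, utc, units = "hours")   # 5 hours

# Check that a zone name is valid before using it.
"America/New_York" %in% OlsonNames()
```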
5. Multiple possible formats
parse_date_time(x, orders = c(...)) tries each order until one succeeds. Useful for messy data with mixed formats.
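A small sketch with a hypothetical vector that mixes year-first and US-style strings. Note that parse_date_time() returns POSIXct, so as_date() is used here to drop the time part.

```r
library(lubridate)

# Hypothetical messy input: two year-first strings and one month-first string.
mixed <- c("2024-01-15", "01/20/2024", "2024/03/05")

parsed <- parse_date_time(mixed, orders = c("ymd", "mdy"))
class(parsed)     # "POSIXct" "POSIXt"

as_date(parsed)   # convert to Date if no time component is needed
```

Because ambiguous strings such as "01/02/2024" can match more than one order, it is worth spot-checking the result on real data.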
ymd("2024.01.15"), ymd("2024-1-15"), and ymd("20240115") all return the same result. This robustness is intentional: real-world dates come in countless variations and you should not have to guess separators.
ymd() vs as.Date() vs strptime() vs anytime
Four parsing functions with different trade-offs between convenience and control.
| Function | Package | Format detection | Best for |
|---|---|---|---|
| ymd() / mdy() / dmy() | lubridate | Auto | Most cases; quick parse |
| as.Date() | base R | Manual format= | Strict / known format |
| strptime() | base R | Manual format | POSIXlt with timezone |
| anytime::anytime() | anytime | Auto, very loose | Truly chaotic mixed inputs |
For everyday work, lubridate's parsers are the default. Reach for base R when you want a strict format and zero dependencies.
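A side-by-side sketch of the first three rows of the table on the same input. anytime is left out so that the example only needs lubridate; the variable names are illustrative.

```r
library(lubridate)

x <- "01/15/2024"

from_lubridate <- mdy(x)                                      # order comes from the function name
from_as_date   <- as.Date(x, format = "%m/%d/%Y")             # explicit format string
from_strptime  <- strptime(x, format = "%m/%d/%Y", tz = "UTC") # explicit format, POSIXlt result

from_lubridate == from_as_date   # TRUE
class(from_strptime)             # "POSIXlt" "POSIXt"
```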
Common pitfalls
Pitfall 1: failed parses become NA instead of stopping the script. ymd("not a date") returns NA with a warning that is easy to miss in a long pipeline. Always check sum(is.na(parsed)) after parsing to spot bad inputs.
Pitfall 2: ambiguous formats. dmy("01/02/2024") returns Feb 1; mdy("01/02/2024") returns Jan 2. Pick the parser that matches the data's convention.
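Both pitfalls can be reproduced in a few lines. The input vector is a made-up example, and quiet = TRUE (a documented lubridate argument) is used only to keep the output free of warnings.

```r
library(lubridate)

# Pitfall 1: bad inputs turn into NA.
raw    <- c("2024-01-15", "not a date", "2024-02-30")
parsed <- ymd(raw, quiet = TRUE)
sum(is.na(parsed))    # 2
raw[is.na(parsed)]    # the offending strings

# Pitfall 2: the same string means different days under different conventions.
dmy("01/02/2024")     # "2024-02-01"
mdy("01/02/2024")     # "2024-01-02"
```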
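Pitfall 3: Excel serial numbers. ymd("44927") does not give the intended date, because the value is a day count rather than a year-month-day string. For Excel serial dates, use as.Date(44927, origin = "1899-12-30").
A practical lubridate workflow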
Most real-world date pipelines follow a similar shape: read raw strings, parse, validate, extract components, compute durations.
The five steps:
- Read raw data as character vectors (CSV, JSON, web scrape).
- Parse to Date / POSIXct with ymd(), mdy(), or parse_date_time() for mixed inputs.
- Validate that none became NA: stopifnot(!any(is.na(parsed))).
- Extract components with year(), month(), day(), wday() for grouping or filtering.
- Compute durations with as.period(), difftime(), or interval arithmetic.
Following this pattern keeps date logic explicit and catches malformed inputs early. The single biggest source of date bugs in R is silent NA propagation from a failed parse. Always check for NAs immediately after parsing.
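The five steps can be sketched end to end as follows. The raw vector is a hypothetical stand-in for a column read from a file, and the variable names are illustrative.

```r
library(lubridate)

# 1. Read: raw strings (stand-in for a CSV column).
raw <- c("2024-01-15", "2024-03-20", "2024-07-04")

# 2. Parse to Date.
parsed <- ymd(raw)

# 3. Validate: stop immediately if anything failed to parse.
stopifnot(!any(is.na(parsed)))

# 4. Extract components for grouping or filtering.
parts <- data.frame(
  year    = year(parsed),
  month   = month(parsed),
  day     = day(parsed),
  weekday = wday(parsed)   # 1 = Sunday with lubridate's default week start
)

# 5. Compute gaps between consecutive dates.
gaps <- difftime(parsed[-1], parsed[-length(parsed)], units = "days")
as.numeric(gaps)   # 65 106
```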
When working in dplyr pipelines, lubridate parsers chain naturally:
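A minimal sketch of such a pipeline, assuming dplyr is installed. The orders data frame and its column names are hypothetical.

```r
library(dplyr)
library(lubridate)

# Hypothetical data with dates stored as strings.
orders <- data.frame(
  order_id   = 1:3,
  order_date = c("2024-01-15", "2024-01-28", "2024-02-03")
)

by_month <- orders %>%
  mutate(
    order_date  = ymd(order_date),     # parse in place
    order_month = month(order_date)    # extract a grouping key
  ) %>%
  count(order_month)

by_month   # two orders in month 1, one in month 2
```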
Try it yourself
Try it: Parse the dates "Jan 15 2024", "March 20 2024", "July 4 2024" using mdy. Save to ex_dates.
Solution:
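One possible way to write it, assuming an English locale so that the month names are recognized:

```r
library(lubridate)

ex_dates <- mdy(c("Jan 15 2024", "March 20 2024", "July 4 2024"))
ex_dates          # "2024-01-15" "2024-03-20" "2024-07-04"
class(ex_dates)   # "Date"
```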
Explanation: mdy() handles month-name input as well as numeric. lubridate auto-detects "Jan" vs "January" and parses both. The result is a Date vector.
Related lubridate functions
After mastering ymd, look at:
- mdy(), dmy(): other ordering variants
- ymd_hms(), ymd_hm(), ymd_h(): with time components
- parse_date_time(): most flexible, accepts multiple orders
- year(), month(), day(): extract components
- floor_date(), ceiling_date(): round dates
- now(), today(): current datetime / date
For date arithmetic, lubridate's days(7), months(1), years(1) are friendlier than base R's seq.Date.
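A short comparison, assuming lubridate is installed. The start date is arbitrary, and the base R line is included only for contrast.

```r
library(lubridate)

start <- ymd("2024-01-15")

start + days(7)     # "2024-01-22"
start + months(1)   # "2024-02-15"
start + years(1)    # "2025-01-15"

# Base R equivalent for a monthly sequence.
seq(start, by = "month", length.out = 3)
```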
FAQ
How do I convert a string to a Date in R?
Use lubridate::ymd("2024-01-15") for year-first strings. Use mdy() or dmy() for other orderings. lubridate handles common separators automatically.
What is the difference between ymd and as.Date?
Both convert strings to Dates. ymd() (lubridate) is forgiving about separators and padding, and its siblings mdy() and dmy() cover other orderings. as.Date() (base R) requires a format argument unless the input is in "YYYY-MM-DD" or "YYYY/MM/DD" form. For most use cases, lubridate is easier.
How do I parse a date with time in R?
Use ymd_hms() for "year month day hour minute second": ymd_hms("2024-01-15 14:30:00"). Variants ymd_hm() and ymd_h() handle partial times.
How do I handle timezones with lubridate?
Pass tz = "America/New_York" (or another) to ymd_hms(). Use with_tz() to convert to a different timezone after parsing. List zones with OlsonNames().
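A brief sketch of parsing in one zone and viewing the same instant in another. The zone names are assumed to be present in the system time zone database.

```r
library(lubridate)

ny     <- ymd_hms("2024-01-15 14:30:00", tz = "America/New_York")
london <- with_tz(ny, "Europe/London")

london         # "2024-01-15 19:30:00 GMT"
ny == london   # TRUE: same instant, different display
```

with_tz() changes only how the instant is displayed; the underlying moment in time stays the same.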
How do I parse Excel dates in R?
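Excel stores dates as serial numbers. Convert with as.Date(44927, origin = "1899-12-30"). The origin depends on the workbook's date system: the default 1900 date system corresponds to an origin of 1899-12-30, while workbooks that use the 1904 date system need origin = "1904-01-01".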
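A minimal base R sketch of the conversion. The serial values are examples, and the second call assumes a workbook saved with Excel's 1904 date system, whose serials are 1462 days smaller for the same calendar day.

```r
# Serial numbers from a workbook using the default 1900 date system.
serial_1900 <- c(44927, 44941)
dates_1900  <- as.Date(serial_1900, origin = "1899-12-30")
dates_1900   # "2023-01-01" "2023-01-15"

# The same days as stored in a workbook that uses the 1904 date system.
serial_1904 <- serial_1900 - 1462
dates_1904  <- as.Date(serial_1904, origin = "1904-01-01")
dates_1904   # "2023-01-01" "2023-01-15"
```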