lubridate mdy() in R: Parse Month-First Date Strings
The mdy() function in lubridate parses month-first date strings (like "01/15/2024" or "January 15, 2024") into R Date objects. It auto-detects common separators and accepts both numeric months and month names.
library(lubridate)
mdy("01/15/2024") # US slash format
mdy("January 15, 2024") # full month name
mdy("Jan 15 2024") # abbreviated month
mdy("01-15-2024") # dash separator
mdy(c("01/15/2024", "03/20/2024")) # vector input
mdy_hms("01/15/2024 14:30:00") # date plus time
mdy("01/15/2024", tz = "America/Chicago") # with time zone (returns POSIXct)
Need explanation? Read on for examples and pitfalls.
What mdy() does in one sentence
mdy("01/15/2024") reads a month-first string and returns a Date object. lubridate detects separators (/, -, ., space), accepts numeric months or month names, and is forgiving about padding (1 and 01 both work).
This parser exists because most American data sources write dates month-first. Spreadsheets, US government feeds, healthcare records, and many CRM exports default to this convention. Reach for mdy() whenever you trust the data follows that ordering.
Syntax
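mdy(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"), truncated = 0). The leading ... takes a character or numeric vector of suspected dates. quiet = TRUE suppresses parse messages, and a non-NULL tz makes the function return a POSIXct instead of a Date.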
Once the strings are Date objects, every downstream step (sorting, filtering, plotting, joining) works correctly. Leaving raw strings around forces every consumer to re-parse and risks inconsistent results across scripts.
Five common patterns
1. US slash dates
The most common case. Output is always ISO-ordered ("2024-01-15") regardless of the input ordering.
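A minimal sketch of the slash case, using made-up sample values rather than data from any real source:

```r
library(lubridate)

# Hypothetical sample values in US slash format
us_dates <- c("01/15/2024", "12/31/2023", "7/4/2024")
parsed <- mdy(us_dates)

class(parsed)   # "Date"
format(parsed)  # "2024-01-15" "2023-12-31" "2024-07-04"
```

Note that the unpadded "7/4/2024" parses the same way as a zero-padded value would.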
2. Text month names
lubridate recognizes both full ("January") and three-letter ("Jan") month names. The comma after the day is optional.
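As an illustration, the three spellings below should all resolve to the same date. The sketch assumes an English locale, since month names are matched against the locale's names:

```r
library(lubridate)

# Assumes an English LC_TIME locale; month names are locale-dependent
full_name  <- mdy("January 15, 2024")  # full month name, with comma
short_name <- mdy("Jan 15 2024")       # three-letter abbreviation
no_comma   <- mdy("January 15 2024")   # comma omitted

# All three should be the same Date
full_name == short_name
full_name == no_comma
```

For non-English month names, the locale argument is the place to look.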
3. Date plus time with mdy_hms
mdy_hms() returns a POSIXct (date plus time) instead of a Date. Use mdy_hm() for hour and minute only, mdy_h() for hour only.
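A short sketch of the datetime variants, with illustrative input strings:

```r
library(lubridate)

dt_full <- mdy_hms("01/15/2024 14:30:00")  # hours, minutes, seconds
dt_hm   <- mdy_hm("01/15/2024 14:30")      # hours and minutes
dt_h    <- mdy_h("01/15/2024 14")          # hour only

class(dt_full)  # "POSIXct" "POSIXt"
tz(dt_full)     # "UTC" unless tz is supplied
hour(dt_hm)     # 14
minute(dt_hm)   # 30
```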
4. Vector inputs with mixed separators
mdy() is vectorized. Mixed separators within the vector parse correctly, so messy CSV columns rarely need pre-cleaning.
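To make the vectorized behavior concrete, here is a sketch with a hypothetical messy column that uses a different separator in each element:

```r
library(lubridate)

# Hypothetical column with four different separators
messy <- c("01/15/2024", "03-20-2024", "04.05.2024", "06 07 2024")
clean <- mdy(messy)

format(clean)
length(clean) == length(messy)  # one Date per input string
```

It is still worth checking the result for NA values, since separators are only one of the ways a column can be messy.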
5. Specify a US timezone
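Without tz, mdy() returns a plain Date, which carries no time zone; the datetime variants such as mdy_hms() default to UTC. Pass an Olson zone name to tz to get a POSIXct anchored in that zone. List all available zones with OlsonNames().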
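The sketch below contrasts the two return types. The zone name is just an example; any entry from OlsonNames() should work the same way:

```r
library(lubridate)

plain   <- mdy("01/15/2024")                           # no tz supplied
chicago <- mdy("01/15/2024", tz = "America/Chicago")   # tz supplied

class(plain)    # "Date"
class(chicago)  # "POSIXct" "POSIXt"
tz(chicago)     # "America/Chicago"

# Confirm a zone name is valid before using it
"America/Chicago" %in% OlsonNames()
```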
mdy() accepts slashes, dashes, dots, and spaces. The function name is purely a hint about which token is the month. This keeps the API small: three short verbs (mdy, ymd, dmy) cover the orderings you will meet most often.
mdy() vs ymd() vs dmy() vs parse_date_time()
Pick the parser whose name matches the ordering in your source data.
| Parser | Ordering | Example input |
|---|---|---|
| mdy() | Month, day, year | "01/15/2024", "Jan 15 2024" |
| ymd() | Year, month, day | "2024-01-15", "20240115" |
| dmy() | Day, month, year | "15/01/2024", "15-Jan-2024" |
| parse_date_time() | Multiple orderings | Mixed: tries each orders= option |
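When you do not know the ordering or the data is genuinely mixed, fall back to parse_date_time(x, orders = c("mdy","dmy","ymd")). It tries the listed orders against the data and returns the date from an order that fits each string. Strings that are valid under more than one order, such as "01/02/2024", remain ambiguous, so the order list does not replace checking the source convention.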
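A small sketch of the fallback, using hypothetical inputs chosen so that each string is valid under only one of the listed orders:

```r
library(lubridate)

# Each value fits exactly one order: ymd, mdy, dmy respectively
mixed <- c("2024-01-15", "01/15/2024", "15/01/2024")
resolved <- parse_date_time(mixed, orders = c("ymd", "mdy", "dmy"))

class(resolved)    # "POSIXct" "POSIXt"
as_date(resolved)  # convert to Date if no time component is needed
```

parse_date_time() returns a POSIXct, so as_date() is a convenient final step when only the calendar date matters.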
Common pitfalls
Pitfall 1: US-versus-EU ambiguity. mdy("01/02/2024") returns Jan 2; dmy("01/02/2024") returns Feb 1. The same string has two valid meanings. Confirm the source convention before picking the parser.
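The ambiguity is easy to reproduce. The second half of this sketch uses a hypothetical sample column to show one way of checking the convention: any day value above 12 can only parse under one ordering.

```r
library(lubridate)

ambiguous <- "01/02/2024"
mdy(ambiguous)  # Date "2024-01-02"
dmy(ambiguous)  # Date "2024-02-01"

# Hypothetical sample rows; values with a day above 12 reveal the convention
sample_rows <- c("01/02/2024", "15/01/2024", "28/02/2024")
mdy_ok <- !is.na(mdy(sample_rows, quiet = TRUE))
dmy_ok <- !is.na(dmy(sample_rows, quiet = TRUE))

c(mdy = sum(mdy_ok), dmy = sum(dmy_ok))  # mdy = 1, dmy = 3
```

Here only dmy() parses every row, which suggests the sample is day-first.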
Pitfall 2: two-digit years. mdy("01/15/24") returns 2024, not 1924. lubridate applies a 68/69 cutoff: years "00" to "68" map to the 2000s, "69" to "99" map to the 1900s. Verify after parsing if your data crosses that boundary.
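A quick sketch that probes both sides of the boundary with illustrative values:

```r
library(lubridate)

# Two-digit years on either side of the cutoff
two_digit <- mdy(c("01/15/24", "01/15/68", "01/15/69", "01/15/99"))

year(two_digit)  # 2024 2068 1969 1999
```

A birth-date column is a typical place where this matters, since "01/15/45" would land in 2045.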
Pitfall 3: silent NA on parse failure. mdy("not a date") returns NA with a single warning. Always check sum(is.na(parsed)) to catch malformed rows before they propagate.
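A minimal sketch of that check, on a hypothetical raw vector that contains one non-date string and one impossible calendar date:

```r
library(lubridate)

# Hypothetical raw values; two of them cannot be parsed
raw <- c("01/15/2024", "not a date", "02/30/2024", "03/20/2024")
parsed <- mdy(raw, quiet = TRUE)  # quiet = TRUE silences the warning

sum(is.na(parsed))  # 2
raw[is.na(parsed)]  # "not a date" "02/30/2024"
```

Note that "02/30/2024" looks like a date but fails because February 30 does not exist.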
Guard the parse step with stopifnot(!any(is.na(parsed))), or filter and report bad rows. Date bugs almost always trace back to a silent NA at the parse step.
A practical mdy() workflow
Most date pipelines that reach for mdy() follow the same five steps.
- Read the raw column as character (CSV imports often do this by default).
- Inspect a sample of 5 to 10 rows to confirm month-first ordering.
- Parse with mdy() (or mdy_hms() for datetimes).
- Validate with sum(is.na(parsed)) and inspect the failed rows.
- Extract components with year(), month(), day() for grouping or filtering.
In a dplyr pipeline the same logic compresses to a single mutate block:
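A minimal sketch of such a block, assuming a hypothetical tibble named orders with a character column order_date (neither name comes from a real dataset):

```r
library(dplyr)
library(lubridate)

# Hypothetical input: one row has an unparseable value
orders <- tibble(
  order_id   = 1:3,
  order_date = c("01/15/2024", "03/20/2024", "bad value")
)

orders_parsed <- orders %>%
  mutate(
    order_dt    = mdy(order_date, quiet = TRUE),  # parse, keep raw column
    order_year  = year(order_dt),                 # extract components
    order_month = month(order_dt)
  )

# Rows that failed to parse, with the raw string still visible
bad_rows <- filter(orders_parsed, is.na(order_dt))
bad_rows
```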
Parsing inside mutate() keeps the raw string column around so you can spot-check any rows that failed to parse after the fact.
Try it yourself
Try it: Parse "May 1 2024", "06/15/2024", "Jul 4, 2024" with mdy(). Save the result to ex_mdy.
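One possible solution, sketched under the assumption of an English locale for the month names:

```r
library(lubridate)

# Parse all three variants in a single vectorized call
ex_mdy <- mdy(c("May 1 2024", "06/15/2024", "Jul 4, 2024"))

ex_mdy
class(ex_mdy)  # "Date"
```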
Explanation: mdy() handles each variant in one call. Text months, numeric months, and mixed separators all parse correctly. The result is a Date vector.
Related lubridate functions
After mastering mdy(), the most useful neighbors are:
- ymd(), dmy(): other order parsers
- mdy_hms(), mdy_hm(), mdy_h(): month-first datetimes
- parse_date_time(): multi-order fallback for messy inputs
- year(), month(), day(): extract date components
- floor_date(), ceiling_date(): round to day, week, or month
- today(), now(): current date or datetime
For date arithmetic on the parsed result, see days(), months(), and years() for friendlier offsets than base R's seq.Date(). The full reference is at lubridate.tidyverse.org.
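A brief sketch of those helpers applied to a parsed date; the starting value is illustrative:

```r
library(lubridate)

start <- mdy("01/15/2024")

start + days(10)             # "2024-01-25"
start + months(1)            # "2024-02-15"
start + years(1)             # "2025-01-15"
floor_date(start, "month")   # "2024-01-01"
```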
FAQ
How do I convert "01/15/2024" to a Date in R?
Use mdy("01/15/2024") from the lubridate package. It returns a proper Date object that you can sort, filter, plot, and join on. No format = argument is needed; the parser handles common separators automatically.
What is the difference between mdy and ymd in R?
The two parsers differ only in the expected token ordering. mdy() reads month-first input ("01/15/2024"); ymd() reads year-first input ("2024-01-15"). Both return the same Date output. Pick the one whose name matches your source data.
Why does mdy("01/02/2024") give January 2, not February 1?
mdy() assumes the first token is the month, so it reads "01" as January and "02" as the day. If your data is European day-first, use dmy() instead. When the convention is unclear, inspect 5 to 10 rows of source data before choosing a parser.
How do I parse "01/15/24" with a two-digit year?
mdy("01/15/24") works. lubridate applies a 68/69 cutoff: years "00" to "68" become 2000-2068, "69" to "99" become 1969-1999. If your data uses a different convention, prepend the century manually before parsing.
How do I handle parse failures from mdy()?
Bad inputs return NA with a warning. Run sum(is.na(mdy(x))) to count failures, then x[is.na(mdy(x))] to see the offending strings. Common causes: empty strings, day-first inputs misrouted to mdy(), or non-date placeholders like "N/A".
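When the input already contains missing values, it can help to separate those from genuine parse failures. The sketch below parses once and flags only strings that were present but unparseable; the vector x is hypothetical:

```r
library(lubridate)

# Hypothetical input covering the common failure causes plus a true NA
x <- c("01/15/2024", "", "N/A", "15/01/2024", NA)
parsed <- mdy(x, quiet = TRUE)

# NA in, NA out is not a parse failure
failed <- is.na(parsed) & !is.na(x)

sum(failed)  # 3
x[failed]    # "" "N/A" "15/01/2024"
```

Storing the parsed vector once also avoids calling mdy() twice on a large column.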