lubridate fast_strptime() in R: Fast Format-Specific Parser
lubridate fast_strptime() parses date-time strings using a fast C parser when every row follows the same explicit format. It accepts strptime()-style format codes and runs several times faster than ymd_hms() on long vectors.
    fast_strptime(x, format = "%Y-%m-%d")                   # date only
    fast_strptime(x, format = "%Y-%m-%d %H:%M:%S")          # date and time
    fast_strptime(x, format = "%d/%m/%Y")                   # European order
    fast_strptime(x, format = "%Y%m%d")                     # compact, no separators
    fast_strptime(x, format = "%Y-%m-%d", tz = "UTC")       # set timezone
    fast_strptime(x, format = "%Y-%m-%d", lt = FALSE)       # return POSIXct
    fast_strptime(x, format = c("%Y-%m-%d","%d/%m/%Y"))     # try multiple formats
Need explanation? Read on for examples and pitfalls.
What fast_strptime() does in one sentence
fast_strptime(x, format) parses character input through a C-level numeric parser when the format is fixed and known. It is the lubridate function for performance-critical pipelines where the input layout never changes.
Unlike parse_date_time() it does no format guessing. Unlike ymd() and friends it does not assume a token order. You give it the exact strptime() format string and it walks the input once.
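A minimal sketch of that single-pass, fixed-format parse. The vector name `stamps` and its values are made up for illustration.

```r
library(lubridate)

# Hypothetical input: every element follows "YYYY-MM-DD HH:MM:SS"
stamps <- c("2026-01-15 14:30:45", "2026-01-16 09:05:00")

# The format spells out every field and every separator
parsed <- fast_strptime(stamps, format = "%Y-%m-%d %H:%M:%S", lt = FALSE)

class(parsed)  # a POSIXct object, because lt = FALSE
parsed
```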
Syntax and arguments
fast_strptime() takes five arguments and the only required ones are x and format. Every other argument has a sensible default.
| Argument | Default | Purpose |
|---|---|---|
| x | (required) | Character or numeric vector of date-time strings |
| format | (required) | One or more strptime() format strings |
| tz | "UTC" | Timezone for the returned object |
| lt | TRUE | If TRUE returns POSIXlt, if FALSE returns POSIXct |
| cutoff_2000 | 68L | Two-digit years at or below cutoff become 20xx, otherwise 19xx |
The format string mirrors base::strptime(): every literal character must appear in the input and every datetime piece must use a %-prefixed code (%Y, %m, %d, %H, %M, %S). Treat whitespace as a literal too and write the same spacing the input uses.
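The cutoff_2000 rule from the argument table is easiest to see on two-digit years. The sketch below assumes hypothetical input in `short_years` and compares the default cutoff with a lower one.

```r
library(lubridate)

# Hypothetical two-digit-year input
short_years <- c("26-01-15", "50-01-15", "69-01-15")

# Default cutoff_2000 = 68L: years at or below 68 land in the 2000s
default_cut <- fast_strptime(short_years, format = "%y-%m-%d", lt = FALSE)
year(default_cut)

# A lower cutoff moves 50 back into the 1900s
low_cut <- fast_strptime(short_years, format = "%y-%m-%d", lt = FALSE,
                         cutoff_2000 = 30L)
year(low_cut)
```

If the source system has its own century rule, set cutoff_2000 explicitly rather than relying on the default.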
A row that deviates from the format by even one character returns NA. That strictness is the source of its speed and the source of its sharp edges.
Format codes you will use most often
Ten codes cover the vast majority of real input. The full list mirrors ?strptime, but these are the workhorses.
| Code | Meaning | Example match |
|---|---|---|
| %Y | Four-digit year | 2026 |
| %m | Two-digit month (01-12) | 01 |
| %d | Two-digit day (01-31) | 15 |
| %H | Hour, 24-hour (00-23) | 14 |
| %M | Minute (00-59) | 30 |
| %S | Second (00-59) | 45 |
| %y | Two-digit year | 26 |
| %I | Hour, 12-hour (01-12) | 02 |
| %p | AM or PM marker | PM |
| %z | UTC offset | -0500 |
Compact formats with no separators work as long as each field has the expected width. "20260115" parses cleanly because %Y consumes 4 digits and %m, %d consume 2 each.
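A short sketch of the compact case, using made-up values. Each code consumes its fixed number of digits, so no separators are needed in either the input or the format.

```r
library(lubridate)

# Hypothetical compact stamps with fixed-width fields and no separators
compact_dates <- c("20260115", "20261231")
d <- fast_strptime(compact_dates, format = "%Y%m%d", lt = FALSE)
format(d, "%Y-%m-%d")

# The same idea extends to a compact date-time
compact_dt <- "20260115143045"
dt <- fast_strptime(compact_dt, format = "%Y%m%d%H%M%S", lt = FALSE)
format(dt, "%Y-%m-%d %H:%M:%S")
```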
Setting a timezone at parse time
The tz argument controls how the parser interprets the input, not how it converts it. A string like "2026-06-01 09:00:00" with tz = "America/New_York" is read as 9:00 AM Eastern, not as UTC then converted.
Default tz = "UTC" is the safe choice for storage and arithmetic. Switch only when the source explicitly produces local time. To convert an already-parsed value to a different zone use with_tz().
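The difference between interpreting and converting can be sketched as follows. The timestamp is the one from the paragraph above; the variable names are illustrative.

```r
library(lubridate)

local_stamp <- "2026-06-01 09:00:00"

# tz tells the parser which zone the clock time belongs to
ny <- fast_strptime(local_stamp, format = "%Y-%m-%d %H:%M:%S",
                    tz = "America/New_York", lt = FALSE)
format(ny, "%H:%M %Z")   # the clock time is unchanged, now labelled as New York

# with_tz() keeps the instant and changes only the displayed zone
utc <- with_tz(ny, "UTC")
format(utc, "%H:%M %Z")

# Same instant underneath
as.numeric(ny) == as.numeric(utc)
```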
POSIXlt vs POSIXct: choosing the return type
lt = TRUE (the default) returns POSIXlt; lt = FALSE returns POSIXct. POSIXct is the friendlier type for data frames and most downstream code.
POSIXlt stores each component (year, month, day, hour, ...) as a list and breaks when assigned to a tibble or data.table column. POSIXct stores seconds since epoch as a numeric and round-trips cleanly. Set lt = FALSE whenever the result enters a column.
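To see the two return types side by side, here is a small sketch with hypothetical input. It only inspects classes and builds a base data.frame, so behaviour in tibble or data.table should be checked separately.

```r
library(lubridate)

stamps <- c("2026-01-15 14:30:45", "2026-01-16 09:05:00")
fmt <- "%Y-%m-%d %H:%M:%S"

as_lt <- fast_strptime(stamps, format = fmt)              # default lt = TRUE
as_ct <- fast_strptime(stamps, format = fmt, lt = FALSE)

class(as_lt)   # a POSIXlt object (list of components)
class(as_ct)   # a POSIXct object (seconds since epoch)

# POSIXct is a plain numeric vector underneath, one value per timestamp
is.numeric(unclass(as_ct))
length(unclass(as_ct))

# It can be used directly as a data frame column
events <- data.frame(id = 1:2, when = as_ct)
class(events$when)
```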
Pass lt = FALSE in scripts. The POSIXlt default exists for compatibility with base::strptime(), but most modern pipelines (dplyr, data.table, ggplot2) work with POSIXct. Make lt = FALSE your team standard.
Trying multiple formats
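Pass a character vector to format and the parser tries each in order. This is fast_strptime's only concession to mixed input. Each row stops at the first format that succeeds. Confirm on your installed lubridate version that format accepts more than one string before relying on this; parse_date_time(x, orders = c(...), exact = TRUE) is the documented route for applying several exact formats.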
Order matters: the leftmost format wins on ambiguous strings, so list the strict format before the loose one to avoid silent misreads. For more than five candidates or fully unknown formats, switch back to parse_date_time(orders = ...).
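For the parse_date_time() fallback, one option is exact = TRUE, which treats each entry in orders as a literal strptime() format rather than a token order to guess from. This is a sketch with made-up input, not a benchmark of either approach.

```r
library(lubridate)

# Hypothetical mixed input: ISO dates and day-first dates with slashes
mixed <- c("2026-01-15", "15/01/2026", "2026-02-28")

# exact = TRUE: each entry of `orders` is used as an exact strptime() format
parsed <- parse_date_time(mixed,
                          orders = c("%Y-%m-%d", "%d/%m/%Y"),
                          exact = TRUE)
format(parsed, "%Y-%m-%d")

# Rows that match none of the formats come back as NA
anyNA(parsed)
```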
When fast_strptime is worth reaching for
The speed advantage shows up at scale, not on a handful of rows. On a 10-row vector ymd_hms() and fast_strptime() finish in microseconds either way. The gap matters when the input is large or the parsing happens inside a tight loop.
Three workloads where the switch pays off:
- Log ingestion. A 1 million-row log file with one fixed timestamp format parses in under a second versus several seconds with ymd_hms().
- Streaming pipelines. When a parser sits inside a function called per chunk, fast_strptime amortizes a single C call across the whole vector.
- Tick data prep. High-frequency time series often parse millions of timestamps before any modeling begins.
For one-off interactive work or vectors under a few thousand rows, prefer ymd_hms() for its forgiveness.
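To measure the gap on your own machine, a rough timing sketch like the one below is enough. The input is synthetic, and the timings depend on hardware and lubridate version, so no numbers are quoted here.

```r
library(lubridate)

# Synthetic input: n consecutive seconds rendered in one fixed layout
n <- 1e5
base_time <- as.POSIXct("2026-01-01 00:00:00", tz = "UTC")
stamps <- format(base_time + seq_len(n), "%Y-%m-%d %H:%M:%S")

# Time each parser on the same vector
t_guess <- system.time(a <- ymd_hms(stamps))
t_fast  <- system.time(
  b <- fast_strptime(stamps, format = "%Y-%m-%d %H:%M:%S", lt = FALSE)
)

# Elapsed seconds for each parser
c(ymd_hms = t_guess[["elapsed"]], fast_strptime = t_fast[["elapsed"]])

# Both parsers should agree on every value
all.equal(as.numeric(a), as.numeric(b))
```

For steadier numbers, repeat the timing several times or use a benchmarking package.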
fast_strptime vs strptime vs parse_date_time
Three parsers, three sweet spots. The decision is about format certainty and vector length.
| Function | When it wins | When it loses |
|---|---|---|
| fast_strptime() | Fixed format, long vectors (10k+ rows) | Format varies row to row |
| base::strptime() | Short vectors, no extra dependency | 5-10x slower on long vectors |
| parse_date_time() | Mixed or unknown formats | Slowest of the three on uniform input |
For the canonical reference on every parameter, see lubridate's parse_date_time documentation, which also covers fast_strptime() and parse_date_time2().
Common pitfalls
Pitfall 1: silent NA on format mismatch. A single deviating character turns that row into NA, which is easy to miss inside a pipeline. Always check sum(is.na(result)) after parsing or wrap the call in stopifnot(!anyNA(result)) to fail loudly.
Pitfall 2: forgetting separators in the format. The format must include every literal character: dashes, slashes, spaces, colons. "%Y%m%d %H:%M:%S" will not parse "2026-01-15 14:30:45" because the dashes are missing from the format.
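The first two pitfalls can be reproduced and guarded against in a few lines. The input values and the helper `parse_strict()` are hypothetical and not part of lubridate.

```r
library(lubridate)

# Hypothetical input: the third element uses slashes instead of dashes
raw <- c("2026-01-15 14:30:45", "2026-01-16 09:05:00", "2026/01/17 10:00:00")
fmt <- "%Y-%m-%d %H:%M:%S"

parsed <- fast_strptime(raw, format = fmt, lt = FALSE)

# Pitfall 1: count and locate the rows that failed to parse
sum(is.na(parsed))
raw[is.na(parsed)]

# Pitfall 2: a format without the dashes matches none of the dashed rows
no_dashes <- fast_strptime(raw[1:2], format = "%Y%m%d %H:%M:%S", lt = FALSE)
all(is.na(no_dashes))

# Hypothetical helper: stop the pipeline instead of passing NA downstream
parse_strict <- function(x, format, ...) {
  out <- fast_strptime(x, format = format, lt = FALSE, ...)
  bad <- is.na(out) & !is.na(x)   # failed parses, ignoring inputs that were already NA
  if (any(bad)) {
    stop(sum(bad), " value(s) did not match format '", format, "'")
  }
  out
}
```

Calling `parse_strict(raw, fmt)` on the vector above stops with an error, while `parse_strict(raw[1:2], fmt)` returns a clean POSIXct vector.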
Pitfall 3: POSIXlt sneaking into a data frame. Default lt = TRUE returns a list-like object. Assigning it to a column yields a malformed structure or an error depending on the data frame variant. Always set lt = FALSE when the result enters tabular storage.
Fractional seconds need testing. "2026-01-15 14:30:45.123" parsed with "%Y-%m-%d %H:%M:%OS" may drop the fraction depending on the lubridate version. Use parse_date_time2() or test with your version if milliseconds matter.
Try it yourself
Try it: Parse the three timestamps below into a POSIXct vector using one fast_strptime call. The format is uniform. Save the result to ex_times.
Explanation: The format string matches every literal character including the space and colons. lt = FALSE returns POSIXct so the result is safe to drop into a data frame.
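One possible shape for the solution is sketched below. The three input values are stand-ins invented for illustration, since the exercise's own timestamps are not reproduced here; only the pattern of the call carries over.

```r
library(lubridate)

# Stand-in input: these three values are hypothetical, not the exercise's originals
ex_input <- c("2026-03-01 08:15:00", "2026-03-01 12:45:30", "2026-03-02 23:59:59")

# One call, one format, POSIXct out
ex_times <- fast_strptime(ex_input, format = "%Y-%m-%d %H:%M:%S", lt = FALSE)

# Fail loudly if any row did not match the format
stopifnot(!anyNA(ex_times))
ex_times
```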
Related lubridate functions
After mastering fast_strptime, look at:
- parse_date_time(): multi-format parser when input layout varies
- parse_date_time2(): even faster path for ISO 8601-like input
- ymd(), mdy(), dmy(): convenience wrappers for one-off parsing
- ymd_hms(), ymd_hm(): date plus time wrappers
- with_tz(), force_tz(): convert or override a timezone after parsing
- as_datetime(): lightweight POSIXct constructor
- format(): round-trip a parsed value back to a string
FAQ
What is the difference between fast_strptime and ymd_hms in R?
ymd_hms() infers the format on each call and accepts several layouts ("2026-01-15 14:30:45", "2026/01/15 14:30:45"). fast_strptime() skips inference and uses one explicit strptime()-style format you supply. On a fixed-format vector of 100,000 rows fast_strptime is roughly 5 to 10x faster, but it returns NA on any row that does not match the format.
Why does fast_strptime return POSIXlt by default?
The lt = TRUE default mirrors base::strptime() for backward compatibility. POSIXlt is a list of components and breaks when stored in tibbles or data.table columns. Most users should pass lt = FALSE to get POSIXct, which is the standard tidyverse and base R type for date-time columns.
How do I parse fractional seconds with fast_strptime?
Use %OS in the format string instead of %S. The result depends on the lubridate version and the options(digits.secs = N) setting. If subsecond precision is critical and the format is ISO 8601, prefer parse_date_time2() or the fasttime package, both of which preserve milliseconds reliably.
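Because the outcome can vary by version, a quick check on your own installation is the safest route. The sketch below parses one made-up timestamp and inspects the stored fraction directly, without relying on print settings.

```r
library(lubridate)

frac <- "2026-01-15 14:30:45.123"

# %OS asks the parser to read fractional seconds
x <- fast_strptime(frac, format = "%Y-%m-%d %H:%M:%OS", lt = FALSE)

# Printing hides fractions unless digits.secs is set, so inspect the numeric value
as.numeric(x) %% 1

# Display with three decimal places (base R format code)
format(x, "%Y-%m-%d %H:%M:%OS3")
```

A result of 0 from the modulo line means the fraction was dropped. A displayed value that is off by a millisecond is usually floating-point truncation rather than a parsing failure.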
Can fast_strptime handle multiple input formats at once?
Yes, pass a character vector to format: fast_strptime(x, format = c("%Y-%m-%d", "%d/%m/%Y")). The parser tries each format in order for every element. For two or three known formats this is faster than parse_date_time(). For more candidates or unknown formats, use parse_date_time(x, orders = c(...)) instead.
Is fast_strptime faster than base R strptime?
Yes, by 5 to 10x on long vectors. base::strptime() is also implemented in C, but as a general-purpose parser it does more work per element, so it is fine on short vectors and falls behind past a few thousand rows. fast_strptime() calls a tighter C parser and processes the entire vector in one pass, so the gap widens as input grows.