dplyr last() in R: Get the Last Value of a Vector
The last() function in dplyr returns the LAST element of a vector, with optional default for empty input and order_by for sorting. It is the cleaner alternative to x[length(x)] inside dplyr pipelines.
    last(c(10, 20, 30))      # 30
    last(numeric(0))         # NA (missing value of x's type)
    last(x, default = 0)     # specify fallback for empty x
    last(x, order_by = ts)   # latest by ts
    df |> summarise(last_val = last(value))
    df |> group_by(g) |> summarise(last_val = last(value, order_by = ts))
Need explanation? Read on for examples and pitfalls.
What last() does in one sentence
last(x, order_by = NULL, default = NULL, na_rm = FALSE) returns the last element of x; on empty input it returns default instead of erroring. With order_by, x is reordered by that vector before extracting the last position. The na_rm argument is available in dplyr 1.1.0 and later.
The dplyr-friendly version of x[length(x)], with safer empty-input handling.
Syntax
last(x, order_by = NULL, default = NULL, na_rm = FALSE). When default is left as NULL, the fallback is a missing value (NA) of x's type.
Use last(x, order_by = ts) for "latest by timestamp" semantics. Without order_by, "last" is whatever happens to be physically last in the input.
Five common patterns
1. Last element
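A minimal sketch of the basic call, assuming dplyr is installed; the vector x is made up for illustration.

```r
library(dplyr)

x <- c(10, 20, 30)

# last() returns the final element of the vector
last_val <- last(x)

# Base R equivalent for a non-empty vector
base_val <- x[length(x)]

last_val  # 30
```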
2. With a default
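The sketch below shows how default behaves on empty and non-empty input. The variable names are illustrative only.

```r
library(dplyr)

empty <- numeric(0)

# No error on empty input: a missing value of the vector's type comes back
no_default <- last(empty)

# An explicit fallback is returned when there is no last position
with_default <- last(empty, default = 0)

# The default is ignored when the vector has elements
non_empty <- last(c(1, 2), default = 0)

# For comparison, base R indexing yields a zero-length vector here
base_empty <- empty[length(empty)]
```

The difference matters inside summarise(), which is simplest to reason about when every group yields exactly one value.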
3. Per-group last
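As an illustration of the per-group pattern, here is a sketch on a small hypothetical sales table (the table and its column names are assumptions, not part of the original article).

```r
library(dplyr)

# Hypothetical data: rows are already in the order we care about
sales <- tibble(
  region = c("east", "east", "west", "west", "west"),
  amount = c(100, 150, 80, 90, 120)
)

# One row per region, holding the physically last amount in that region
per_region <- sales |>
  group_by(region) |>
  summarise(last_amount = last(amount))

per_region
```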
4. Last by another column's order
order_by = ts sorts by ts ascending; last() takes the position with the LARGEST ts.
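A small sketch of the difference, using two made-up parallel vectors (value and ts are illustrative names):

```r
library(dplyr)

value <- c("b", "c", "a")
ts    <- c(2, 3, 1)

# Physical order: the element that happens to sit at the end
physical_last <- last(value)               # "a"

# Ordered by ts: the element paired with the largest ts
latest <- last(value, order_by = ts)       # "c"
```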
5. Last non-NA
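One way to sketch this pattern is shown below. The na_rm argument assumes dplyr 1.1.0 or later; filtering with is.na() works on any version. The vector x is made up.

```r
library(dplyr)

x <- c(5, 8, NA)

# Plain last() returns the trailing NA
raw_last <- last(x)

# Drop missing values first, then take the last element
filtered_last <- last(x[!is.na(x)])

# Assumes dplyr >= 1.1.0: na_rm skips missing values directly
na_rm_last <- last(x, na_rm = TRUE)
```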
last(x, order_by = ts) is the dplyr idiom for "latest record". Combined with group_by, it answers "what was each user's most recent X?" in one summarise step. Without order_by, "last" means physical row order, which is rarely what you want.
last() vs tail() vs slice_tail() vs nth()
Four "last element" functions in R, with different return shapes.
| Function | Input | Output | Best for |
|---|---|---|---|
| last(x) | Vector | Scalar | dplyr summarise / mutate |
| tail(x, 1) | Vector | Length-1 vector | Quick base R |
| tail(df, 1) | Data frame | 1-row df | Last row of a frame |
| slice_tail(df, n = 1) | Data frame | 1-row tibble | dplyr; group-aware |
| nth(x, -1) | Vector | Scalar | Pick by negative index |
When to use which:
- last(x) for scalar output inside summarise.
- slice_tail() for row-level extraction.
- nth(x, -1) is equivalent to last(x) (a negative index counts from the end).
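The return shapes can be compared side by side. This is a sketch on made-up objects (x, lst, and df are illustrative); the list case is where last() and tail() differ most visibly.

```r
library(dplyr)

x   <- c(1, 2, 3)
lst <- list(1:2, 3:4)
df  <- tibble(id = 1:3, score = c(0.2, 0.9, 0.5))

# Vector input: a single value either way
v_last <- last(x)
v_nth  <- nth(x, -1)
v_tail <- tail(x, 1)

# List input: last() extracts the element, tail() keeps a length-1 list
l_last <- last(lst)      # the integer vector 3:4
l_tail <- tail(lst, 1)   # a list containing 3:4

# Data frame input: row-level helpers return a one-row frame
row_tail  <- tail(df, 1)
row_slice <- slice_tail(df, n = 1)
```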
A practical workflow
The "latest per group" pattern is last's killer use case.
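A sketch of that pattern on a hypothetical event log follows. The visits table and its columns (user, timestamp, action) are assumptions made for illustration, and the rows are deliberately not sorted by time.

```r
library(dplyr)

# Hypothetical event log, intentionally out of chronological order
visits <- tibble(
  user      = c("ann", "bob", "ann", "bob", "ann"),
  timestamp = as.POSIXct(c(
    "2024-01-03 09:00:00", "2024-01-01 12:00:00", "2024-01-05 18:30:00",
    "2024-01-04 08:15:00", "2024-01-02 10:45:00"
  ), tz = "UTC"),
  action    = c("view", "login", "purchase", "logout", "search")
)

# For each user, pick the values attached to the largest timestamp
latest <- visits |>
  group_by(user) |>
  summarise(
    last_visit  = last(timestamp, order_by = timestamp),
    last_action = last(action, order_by = timestamp)
  )

latest
```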
Per user, the chronologically latest visit and action. Equivalent to slice_max(timestamp, n = 1, by = user) for getting the WHOLE row; last() is for scalar values inside summarise.
For multi-stat per-group with first AND last:
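One possible shape for such a summary, sketched on a hypothetical orders table (all names here are illustrative):

```r
library(dplyr)

# Hypothetical orders, not sorted by day
orders <- tibble(
  customer = c("c1", "c1", "c1", "c2", "c2"),
  day      = as.Date(c("2024-03-05", "2024-03-01", "2024-03-09",
                       "2024-03-02", "2024-03-07")),
  total    = c(40, 25, 60, 15, 30)
)

# Earliest and latest order total per customer, plus a row count
span <- orders |>
  group_by(customer) |>
  summarise(
    first_total = first(total, order_by = day),
    last_total  = last(total, order_by = day),
    n_orders    = n()
  )

span
```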
Common pitfalls
Pitfall 1: omitting order_by fails silently. Without order_by, last() uses physical row order and gives no warning, so the value it returns might NOT be the latest by timestamp.
Pitfall 2: confusing last() with slice_tail(). last returns a scalar; slice_tail returns a tibble. Pick by what shape you need downstream.
last() is sensitive to input order in ways that are easy to overlook. Always pass order_by = sort_col for time-series queries, even if you "just sorted" upstream: being explicit prevents bugs from later refactors.
Why "latest per group" needs order_by
last() without order_by returns whatever happens to be physically last in the input. This is fine if your data is already sorted by time, but it is fragile: any upstream change that re-orders the rows silently changes the result. last(val, order_by = ts) is robust: dplyr explicitly sorts by ts before picking the last value. The cost is small (one sort per group); the benefit is that the result depends only on the data's semantics, not its loading order. In production pipelines, prefer the explicit form.
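The fragility can be demonstrated by reversing the rows of a small hypothetical table (readings, sensor, ts, and val are made-up names). The explicit form gives the same answer either way; the implicit form does not.

```r
library(dplyr)

# Hypothetical sensor readings, sorted by ts within each sensor
readings <- tibble(
  sensor = c("s1", "s1", "s1", "s2", "s2"),
  ts     = c(1, 2, 3, 1, 2),
  val    = c(10, 11, 12, 20, 21)
)

# Simulate an upstream change that re-orders the rows
reversed <- readings[rev(seq_len(nrow(readings))), ]

implicit <- function(data) {
  data |> group_by(sensor) |> summarise(latest = last(val))
}
explicit <- function(data) {
  data |> group_by(sensor) |> summarise(latest = last(val, order_by = ts))
}

implicit(readings)$latest   # 12 21
implicit(reversed)$latest   # 10 20: result changed with row order
explicit(readings)$latest   # 12 21
explicit(reversed)$latest   # 12 21: unaffected by row order
```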
Try it yourself
Try it: For each cyl group in mtcars, get the mpg of the LAST car in row order. Save to ex_last.
Explanation: last(mpg) per cyl group returns the mpg of the last physical row in each group.
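One way to write a solution matching that explanation is sketched here; the output column name last_mpg is an arbitrary choice.

```r
library(dplyr)

# Group the built-in mtcars data by cylinder count and keep the
# mpg of the physically last row in each group
ex_last <- mtcars |>
  group_by(cyl) |>
  summarise(last_mpg = last(mpg))

ex_last
```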
Related dplyr functions
After mastering last, look at:
- first(): first value (mirror)
- nth(x, k): arbitrary position; nth(x, -1) equals last(x)
- slice_tail() / slice_head(): row-level versions
- tail() / head(): base R
- slice_max() / slice_min(): top n by column value (often a cleaner equivalent)
For "latest record per group", slice_max(timestamp, n = 1, by = group) is often cleaner than last + summarise.
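The two approaches can be compared on a small hypothetical table (events and its columns are illustrative). The by argument of slice_max() assumes dplyr 1.1.0 or later.

```r
library(dplyr)

events <- tibble(
  group     = c("a", "a", "b", "b"),
  timestamp = c(1, 2, 5, 3),
  status    = c("open", "closed", "open", "pending")
)

# Whole latest row per group (assumes dplyr >= 1.1.0 for `by`)
latest_rows <- events |>
  slice_max(timestamp, n = 1, by = group)

# Only selected values per group, via last() inside summarise()
latest_status <- events |>
  group_by(group) |>
  summarise(status = last(status, order_by = timestamp))
```

Note that slice_max() keeps ties by default (with_ties = TRUE), so it can return more than one row per group when timestamps repeat, whereas last() always yields a single value.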
FAQ
What does last do in dplyr?
last(x) returns the last element of a vector as a single value. It is most often used inside summarise() or mutate(), with an optional default for empty input and order_by to choose the ordering.
What is the difference between last() and tail() in R?
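For an atomic vector, last(x) and tail(x, 1) both give a single value (a length-1 vector in R terms). They differ on lists: last(x) extracts the last element itself, as x[[length(x)]] would, while tail(x, 1) returns a length-1 list. For data frames, tail(df, 1) returns a 1-row data frame; in dplyr pipelines, slice_tail() is the usual row-level tool.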
How do I get the latest record per group with last()?
df |> group_by(g) |> summarise(latest = last(val, order_by = ts)). Without order_by, last() uses physical row order; with it, last() returns the val from the row with the maximum ts in each group.
How do I get the latest non-NA value with last()?
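last(na.omit(x)) drops NAs first. In dplyr 1.1.0 and later, last(x, na_rm = TRUE) does the same and can be combined with order_by = ts. Note that default only covers empty input; it does not skip NA values.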
Should I use last() or slice_max()?
last() returns a scalar inside summarise. slice_max(col, n = 1) returns the WHOLE ROW as a tibble. Pick based on what you need downstream.