dplyr slice_tail() in R: Take the Last N Rows (Per Group)
The slice_tail() function in dplyr returns the last n rows (or a given fraction of rows) of a data frame, optionally per group. It is the pipe-friendly dplyr counterpart to base R's tail().
    slice_tail(df, n = 5)                     # last 5 rows
    slice_tail(df, prop = 0.1)                # last 10% of rows
    df |> group_by(g) |> slice_tail(n = 3)    # last 3 rows per group
    slice_tail(df, n = 3, by = g)             # per group via by (dplyr 1.1+)
    df |> arrange(x) |> slice_tail(n = 5)     # largest by x (after sort)
    tail(df, 5)                               # base R alternative
Need explanation? Read on for examples and pitfalls.
What slice_tail() does in one sentence
slice_tail(.data, n = X) keeps the last X rows of a tibble; on a grouped tibble it keeps the last X per group. Use prop = 0.1 instead of n to take a fraction.
This is dplyr's pipe-friendly answer to tail(). The big advantage: it respects group_by() automatically and integrates cleanly with the rest of the dplyr verb set.
Syntax
slice_tail(.data, ..., n, prop, by = NULL). Pass n OR prop, not both, and always by name. If neither is supplied, n = 1 is used.
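To make the argument rules concrete, here is a minimal sketch using the built-in mtcars data. The variable names default_tail and both are illustrative only.

```r
library(dplyr)

# Neither n nor prop supplied: slice_tail() falls back to n = 1 (the last row).
default_tail <- slice_tail(mtcars)
nrow(default_tail)

# Supplying both n and prop is an error; capture it instead of halting the script.
both <- tryCatch(
  slice_tail(mtcars, n = 2, prop = 0.5),
  error = function(e) e
)
inherits(both, "error")
```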
slice_tail(n = 3) returns 3 rows PER GROUP. Three groups means 9 rows total, not 3. Ungroup first if you want a global tail.
Five common patterns
1. Last n rows of an ungrouped data frame
Equivalent to tail(mtcars, 10), but pipeline-friendly.
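A short sketch of this pattern on mtcars, comparing the result against tail() on one column (last10 is just an illustrative name):

```r
library(dplyr)

# Keep the last 10 rows of the ungrouped data frame.
last10 <- mtcars |> slice_tail(n = 10)

nrow(last10)

# Same rows in the same order as base R's tail(), checked on the mpg column.
identical(last10$mpg, tail(mtcars, 10)$mpg)
```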
2. Last n rows per group
Returns 3 rows per cyl group (9 total).
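The grouped version might look like this (last3_by_cyl is an illustrative name). Note that the result stays grouped by cyl.

```r
library(dplyr)

last3_by_cyl <- mtcars |>
  group_by(cyl) |>
  slice_tail(n = 3)

# mtcars has three cyl groups (4, 6, 8), so this is 3 groups x 3 rows.
nrow(last3_by_cyl)

# Row count per group.
count(last3_by_cyl, cyl)
```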
3. Use by for one-step grouping
by = cyl scopes grouping to this verb only.
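A sketch of the same selection with the by argument, assuming dplyr 1.1.0 or later is installed. Unlike the group_by() version, the result comes back ungrouped.

```r
library(dplyr)

# Per-operation grouping: no group_by() / ungroup() pair needed.
last3_by <- mtcars |> slice_tail(n = 3, by = cyl)

nrow(last3_by)

# The grouping applied only inside slice_tail(); the output is not grouped.
is_grouped_df(last3_by)
```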
4. Last fraction with prop
Returns 3 rows: 10% of 32 is 3.2, and prop is rounded down to a whole number of rows.
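For instance, on mtcars (32 rows), a fraction of 0.1 could be taken like this:

```r
library(dplyr)

# prop = 0.1 of 32 rows is 3.2, which is rounded down to 3 rows.
last_tenth <- mtcars |> slice_tail(prop = 0.1)

nrow(last_tenth)
```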
5. Latest record per group (sorted timeline)
A common time-series idiom: get the latest event per user.
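As an illustration, the sketch below builds a small made-up event log. The events tibble and its columns user_id, ts, and action are hypothetical, not from any real dataset. It sorts by time, then keeps the last row per user.

```r
library(dplyr)

# Hypothetical event log, deliberately stored out of chronological order.
events <- tibble(
  user_id = c("a", "b", "a", "c", "b", "a"),
  ts = as.POSIXct(c(
    "2024-01-03 10:00", "2024-01-01 09:00", "2024-01-05 12:00",
    "2024-01-02 08:00", "2024-01-04 17:00", "2024-01-01 07:00"
  ), tz = "UTC"),
  action = c("view", "login", "purchase", "login", "view", "login")
)

latest <- events |>
  arrange(ts) |>          # oldest first, so the newest row is physically last
  group_by(user_id) |>
  slice_tail(n = 1) |>    # last row per user = latest event
  ungroup()

latest
```

The arrange(ts) step is what makes "last row" mean "latest event"; without it the result depends on storage order.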
slice_tail() cares about ROW ORDER, not value rank. "Last n rows" means physically last in the current sort. To get "highest n values", use slice_max() instead: it sorts internally and is semantically clearer for ranking.
slice_tail() vs tail() vs slice_max() vs slice_sample()
Four R functions for picking a subset of rows: from the end, by extreme value, or at random.
| Function | Returns | Per group? | Best for |
|---|---|---|---|
| slice_tail(n) | Last n rows (positional) | Yes | dplyr pipelines |
| utils::tail(n) | Last n rows | No | Quick interactive use |
| slice_max(col, n) | Top n by column value (ties kept by default) | Yes | "Top n by metric" |
| slice_sample(n) | Random n rows | Yes | Random sampling |
When to use which:
- slice_tail when row order is meaningful (e.g., latest in chronological order after arrange()).
- slice_max when you want "top N by metric" without sorting first.
- tail for quick base-R inspection.
- slice_sample for randomness.
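The difference between positional and value-based selection can be sketched on a tiny made-up table (scores, player, and points are hypothetical names chosen for illustration):

```r
library(dplyr)

# Hypothetical scores, not sorted by points.
scores <- tibble(
  player = c("p1", "p2", "p3", "p4", "p5"),
  points = c(12, 40, 7, 33, 25)
)

# Positional: the last 2 rows as stored (p4 and p5).
by_position <- slice_tail(scores, n = 2)

# Value-based: the 2 highest points (p2 and p4), no prior sort needed.
by_value <- slice_max(scores, points, n = 2)

# Sorting first makes slice_tail() keep the same rows as slice_max().
sorted_tail <- scores |> arrange(points) |> slice_tail(n = 2)
setequal(sorted_tail$player, by_value$player)
```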
A practical workflow
The classic "latest per group" pattern uses arrange + group_by + slice_tail.
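This returns the most recent record per group. In dplyr 1.1+, slice_max(timestamp, n = 1, by = group_var) gives the same result when timestamps are unique within each group. slice_max() keeps ties by default, so add with_ties = FALSE if you need exactly one row per group.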
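The sketch below compares the two approaches on made-up data where one group has a tied latest timestamp. The readings tibble and its columns sensor, timestamp, and value are hypothetical.

```r
library(dplyr)

# Hypothetical readings; sensor "s1" has two rows sharing its latest timestamp.
readings <- tibble(
  sensor = c("s1", "s1", "s1", "s2", "s2"),
  timestamp = as.Date(c("2024-03-01", "2024-03-02", "2024-03-02",
                        "2024-03-01", "2024-03-03")),
  value = c(1.0, 2.0, 2.5, 9.0, 8.0)
)

# arrange + group_by + slice_tail: always exactly one row per group.
tail_latest <- readings |>
  arrange(timestamp) |>
  group_by(sensor) |>
  slice_tail(n = 1) |>
  ungroup()

# slice_max keeps ties by default, so "s1" contributes two rows here.
max_latest_ties <- readings |> slice_max(timestamp, n = 1, by = sensor)

# with_ties = FALSE forces one row per group.
max_latest_one <- readings |>
  slice_max(timestamp, n = 1, by = sensor, with_ties = FALSE)

c(nrow(tail_latest), nrow(max_latest_ties), nrow(max_latest_one))
```

Which of the tied rows survives is determined by row order, so add a tie-breaking column to arrange() if that choice matters.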
Common pitfalls
Pitfall 1: per-group surprise. mtcars |> group_by(cyl) |> slice_tail(n = 5) returns 15 rows (5 for each of the 3 cyl groups). If you want a global tail, call ungroup() before slice_tail().
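A quick way to see the difference, using illustrative variable names:

```r
library(dplyr)

grouped <- mtcars |> group_by(cyl)

# Still grouped: 5 rows for each of the 3 cyl groups.
per_group_tail <- grouped |> slice_tail(n = 5)

# Grouping dropped first: 5 rows in total.
global_tail <- grouped |> ungroup() |> slice_tail(n = 5)

c(nrow(per_group_tail), nrow(global_tail))
```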
Pitfall 2: implicit row order. slice_tail returns whatever rows happen to be last in current order. If the data is unsorted, the "tail" is arbitrary. Always arrange() first when order matters.
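To see how much the result depends on row order, consider this small made-up table (unsorted, id, and x are hypothetical names):

```r
library(dplyr)

# Hypothetical data stored in no particular order.
unsorted <- tibble(id = 1:6, x = c(30, 10, 60, 20, 50, 40))

# Tail of the data as stored: ids 5 and 6 (x = 50 and 40).
stored_tail <- slice_tail(unsorted, n = 2)

# Tail after sorting by x: ids 5 and 3 (x = 50 and 60).
sorted_tail <- unsorted |> arrange(x) |> slice_tail(n = 2)

stored_tail$id
sorted_tail$id
```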
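Pitfall 3: groups smaller than n. If a group has fewer rows than n, slice_tail() silently returns every row in that group, so a 3-row group yields 3 rows even with n = 5. No warning is issued.
Try it yourself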
Try it: Get the 2 cars with the HIGHEST mpg per cyl group using slice_tail. Save to ex_top_mpg.
Solution
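One possible solution, sketched here because the original code block is not shown. It sorts ascending by mpg so the highest values sit at the end of each group.

```r
library(dplyr)

ex_top_mpg <- mtcars |>
  arrange(mpg) |>         # ascending: highest mpg ends up last
  group_by(cyl) |>
  slice_tail(n = 2) |>    # last 2 rows per cyl group = 2 highest mpg
  ungroup()

ex_top_mpg
```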
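Explanation: arrange(mpg) sorts ascending; slice_tail(n = 2) per group picks the 2 highest. slice_max(mpg, n = 2) does both internally, but it keeps ties by default and can return more than 2 rows per group (the 6-cylinder cars have two rows tied at 21.0 mpg). Pass with_ties = FALSE for exactly 2.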
Related slice functions
After mastering slice_tail, look at:
- slice_head(): first n rows (per group)
- slice_max() / slice_min(): top/bottom n by column value
- slice_sample(): random n rows
- slice(): specific positional rows
- last(): last value of a vector (not data frame)
- tail(): base R alternative
For "latest record per group", slice_max(timestamp, n = 1, by = group) is the cleanest pattern in modern dplyr.
FAQ
What is the difference between slice_tail and tail in R?
tail(df, n) is base R: returns last n rows of the WHOLE frame, ignores grouping. slice_tail(df, n) is dplyr: respects group_by() and works inside pipelines.
How do I get the last n rows per group in dplyr?
df |> group_by(g) |> slice_tail(n = 3) returns the last 3 rows per group. Or slice_tail(df, n = 3, by = g) in dplyr 1.1+.
What is the difference between slice_tail and slice_max?
slice_tail(n) returns the LAST n rows in current order. slice_max(col, n) returns the n highest BY a column value (sorts internally). Use slice_max when ranking is the criterion.
How do I get the latest record per group?
Sort by timestamp, group, take the tail of size 1: df |> arrange(ts) |> group_by(g) |> slice_tail(n = 1). Or directly: slice_max(df, ts, n = 1, by = g), adding with_ties = FALSE if timestamps can tie within a group.
Why did I get more rows than I asked for?
Because the data frame was grouped. slice_tail(n = 5) on a grouped df returns 5 rows PER GROUP. Ungroup first if you want 5 total.