dplyr slice_max() in R: Top N Rows by Column Value
The slice_max() function in dplyr returns the rows with the LARGEST values of a specified column, optionally per group. It is the modern replacement for top_n(), which was superseded in dplyr 1.0.0.
    slice_max(df, mpg, n = 5)                       # 5 highest by mpg
    slice_max(df, mpg, prop = 0.1)                  # top 10%
    slice_max(df, mpg, n = 3, by = cyl)             # top 3 per cyl
    slice_max(df, mpg, n = 3, with_ties = FALSE)    # strict 3 (no ties)
    slice_max(df, mpg, n = -3)                      # all but the 3 lowest (keeps nrow - 3 rows)
    arrange(df, desc(mpg)) |> slice_head(n = 5)     # older form (same rows when there are no ties)
Need explanation? Read on for examples and pitfalls.
What slice_max() does in one sentence
slice_max(.data, order_by, n) sorts by order_by descending and returns the top n rows. On a grouped tibble (or with by = g), it returns the top n per group.
This is the cleanest way to answer "top N by metric" questions. It replaces the older top_n(), which was superseded (not formally deprecated) in dplyr 1.0.0.
Syntax
slice_max(.data, order_by, ..., n, prop, by = NULL, with_ties = TRUE, na_rm = FALSE)
Supply either n or prop, not both; if neither is given, n = 1 is used. The by and na_rm arguments require dplyr 1.1.0 or later.
slice_max() is more explicit than arrange(desc(x)) |> head(n). It says exactly what it does: "top n by x". Reach for it whenever you mean "top N by metric".
Five common patterns
1. Top n by a column
By default, ties are included (you may get more rows than n).
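As a minimal sketch of this pattern, assuming the built-in mtcars data set as the input (the original example's data is not shown here), the call might look like this:

```r
library(dplyr)

# Five cars with the highest mpg in the built-in mtcars data.
# with_ties defaults to TRUE, so ties at the cutoff would add rows.
top5 <- slice_max(mtcars, mpg, n = 5)
top5[, c("mpg", "cyl", "hp")]
```

In mtcars there is no tie at the fifth position, so this particular call returns exactly five rows.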
2. Top n per group
by = cyl scopes the ranking to each cyl group; with n = 1 you get one row per group (more if the top value is tied).
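A short sketch of the per-group form, again assuming mtcars and dplyr 1.1.0 or later (needed for the by argument):

```r
library(dplyr)

# Highest-mpg car within each cylinder class.
# `by` groups for this one call only; the result is not a grouped data frame.
best_per_cyl <- slice_max(mtcars, mpg, n = 1, by = cyl)
best_per_cyl[, c("cyl", "mpg")]
```

mtcars has three cylinder classes and no tie for the top mpg in any of them, so the result has three rows.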
3. Top fraction (prop)
10% of 32 rows is 3.2, which is rounded down to 3 rows (or more with ties).
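The proportion form can be sketched the same way, assuming mtcars (32 rows) as the input:

```r
library(dplyr)

# prop = 0.1 requests the top 10% of rows; 0.1 * 32 = 3.2 is truncated to 3.
top_decile <- slice_max(mtcars, mpg, prop = 0.1)

# The row count can exceed 3 because rows tied at the cutoff are kept.
nrow(top_decile)
```

Here two cars share the third-highest mpg, so the default call keeps both of them.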
4. Strict top n (no ties)
with_ties = FALSE returns exactly n rows (fewer only if the data has fewer than n rows); which of the tied rows is kept depends on their order in the data.
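To see the effect of with_ties side by side, here is a small comparison on mtcars; the variable names are only illustrative:

```r
library(dplyr)

# Two cars tie at the third-highest mpg (30.4), so the default keeps both.
ties_kept    <- slice_max(mtcars, mpg, n = 3)
ties_dropped <- slice_max(mtcars, mpg, n = 3, with_ties = FALSE)

c(ties_kept = nrow(ties_kept), ties_dropped = nrow(ties_dropped))
```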
5. Latest record per group (timestamp)
The canonical "latest per group" idiom in modern dplyr.
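A sketch of the idiom on a hypothetical event log follows. The events data frame and its column names (user, timestamp, action) are invented for illustration, and by requires dplyr 1.1.0 or later:

```r
library(dplyr)

# Hypothetical event log; all names and values are made up for illustration.
events <- data.frame(
  user = c("a", "a", "b", "b", "b"),
  timestamp = as.POSIXct(
    c("2024-01-01 09:00:00", "2024-01-03 12:30:00",
      "2024-01-02 08:15:00", "2024-01-05 17:45:00", "2024-01-04 10:00:00"),
    tz = "UTC"
  ),
  action = c("login", "purchase", "login", "logout", "purchase")
)

# Most recent row per user; with_ties = FALSE guards against duplicate timestamps.
latest <- slice_max(events, timestamp, n = 1, by = user, with_ties = FALSE)
latest
```

Setting with_ties = FALSE is a design choice here: it guarantees one row per user even if two events share a timestamp, at the cost of picking one of them by position.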
slice_max() replaces THREE older patterns: top_n() (superseded), arrange(desc(x)) |> head(n), and arrange(desc(x)) |> slice_head(n). It is more readable and group-aware. For new code, prefer slice_max() for "top N by metric".
slice_max() vs top_n() vs arrange + slice_head vs slice_min
Four ways to grab "top n by column" in R, with different ergonomics.
| Function | Sorts | Per group | Status |
|---|---|---|---|
| `slice_max(col, n)` | Yes (desc) | Yes | Recommended |
| `slice_min(col, n)` | Yes (asc) | Yes | Recommended (mirror) |
| `arrange(desc(col)) \|> slice_head(n)` | Yes | Yes if grouped | Verbose; equivalent when there are no ties |
| `top_n(n, col)` | No | Yes | Superseded in dplyr 1.0.0 |
When to use which:
- slice_max() for top-by-metric.
- slice_min() for bottom-by-metric.
- arrange() + slice_head() only if you also need the full sorted intermediate state.
- Avoid top_n() in new code.
A practical slice_max workflow
The "top N per group" pattern is the most common slice_max use case in real pipelines.
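As an illustration of how this might sit inside a longer pipeline, here is a sketch on mtcars. The filter condition and the choice of n = 2 are arbitrary, the native pipe needs R 4.1 or later, and by needs dplyr 1.1.0 or later:

```r
library(dplyr)

top_by_cyl <- mtcars |>
  tibble::rownames_to_column("model") |>                  # keep car names as a column
  filter(am == 1) |>                                      # manual-transmission cars only
  slice_max(hp, n = 2, by = cyl, with_ties = FALSE) |>    # two most powerful per cyl
  select(model, cyl, hp) |>
  arrange(cyl, desc(hp))                                  # explicit ordering for the report

top_by_cyl
```

The final arrange() makes the reporting order explicit instead of relying on the order in which groups come back.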
Common variations:
- Top 1 per group → slice_max(metric, n = 1, by = g) ("best of each")
- Top 5 by metric overall → slice_max(metric, n = 5) ("leaderboard")
- Highest-priced item per category → slice_max(price, n = 1, by = category)
- Most recent record per user → slice_max(timestamp, n = 1, by = user)
The pattern is so common it deserves its own one-liner. slice_max(df, col, n, by) IS that one-liner.
Common pitfalls
Pitfall 1: ties expand the result. slice_max(mpg, n = 3) returns 4 rows if two are tied at rank 3. Use with_ties = FALSE for strict n.
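Pitfall 2: NAs sort to the bottom of descending order. They are excluded from consideration only if na_rm = TRUE; by default, rows with NA in order_by can still be returned when n exceeds the number of non-missing values.
slice_max() differs from arrange() in scope, not in ordering: in current dplyr versions the selected rows come back ordered by descending order_by within each group, but grouped results are arranged group by group rather than sorted overall. If you need one global ordering, chain with arrange(desc(col)).
Try it yourself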
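The NA behaviour can be checked on a tiny hypothetical data frame. The scores data below is made up for illustration, and na_rm requires dplyr 1.1.0 or later:

```r
library(dplyr)

# Hypothetical scores with two missing values.
scores <- data.frame(id = 1:4, score = c(10, NA, 7, NA))

# n = 3 exceeds the two non-missing scores, so an NA row fills the last slot.
with_na <- slice_max(scores, score, n = 3, with_ties = FALSE)

# na_rm = TRUE removes missing scores from consideration.
no_na <- slice_max(scores, score, n = 3, na_rm = TRUE)

list(with_na = with_na, no_na = no_na)
```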
Try it: Find the car with the highest hp for each gear value. Save to ex_top_hp.
Explanation: slice_max(hp, n = 1, by = gear) finds the highest-hp row for each unique value of gear. With ties, more rows may appear; add with_ties = FALSE for strict 1-per-group.
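One possible solution, sketched against the built-in mtcars data that the exercise appears to assume (ex_top_hp_strict is an extra name added here for comparison):

```r
library(dplyr)

# Highest-hp car for each gear value; ties at the top are kept by default.
ex_top_hp <- slice_max(mtcars, hp, n = 1, by = gear)
ex_top_hp[, c("gear", "hp")]

# Strict one row per gear.
ex_top_hp_strict <- slice_max(mtcars, hp, n = 1, by = gear, with_ties = FALSE)
ex_top_hp_strict[, c("gear", "hp")]
```

In mtcars, two gear groups each have two cars tied for the top hp, so the default version returns more than three rows while the strict version returns exactly three.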
Related slice functions
After mastering slice_max, look at:
- slice_min(): bottom n by column (mirror of slice_max)
- slice_head() / slice_tail(): first/last n by row order
- slice_sample(): random n rows
- slice(): specific row indexes
- arrange(): sort the entire frame
- top_n(): superseded; avoid in new code
For "lowest n by metric", slice_min(col, n) is the direct counterpart.
FAQ
What is the difference between slice_max and top_n in dplyr?
top_n() has been superseded since dplyr 1.0.0: it still works but receives only critical fixes. slice_max() is the replacement: clearer name, with support for the prop, by, and with_ties arguments.
How does slice_max handle ties?
By default with_ties = TRUE includes all tied rows, which may return more than n rows. Set with_ties = FALSE for exactly n rows.
How do I get the top n per group?
Pass by = group_col (dplyr 1.1+): slice_max(df, col, n = 3, by = g). Or df |> group_by(g) |> slice_max(col, n = 3) |> ungroup().
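The two forms can be compared directly. This sketch assumes mtcars; both calls select the same rows, though the order of groups in the output and the class of the result (data frame vs tibble) may differ:

```r
library(dplyr)

# Per-call grouping (dplyr >= 1.1.0); the result is ungrouped.
via_by <- slice_max(mtcars, mpg, n = 3, by = cyl)

# Classic grouping; remember to ungroup() afterwards.
via_group_by <- mtcars |>
  group_by(cyl) |>
  slice_max(mpg, n = 3) |>
  ungroup()

# Same set of mpg values either way.
identical(sort(via_by$mpg), sort(via_group_by$mpg))
```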
What is the difference between slice_max and slice_head?
slice_max(col, n) sorts by COLUMN VALUE descending and takes top n. slice_head(n) takes the FIRST n rows IN CURRENT ORDER, ignoring values. Use slice_max when ranking is the criterion.
How does slice_max handle NA values?
NAs go to the bottom of descending order, so they are not picked unless n exceeds the count of non-NAs. Set na_rm = TRUE to exclude NAs explicitly.