dplyr row_number() in R: Assign Sequential Row Indexes

The row_number() function in dplyr returns sequential integer ranks 1, 2, 3, ... where TIES are broken by FIRST APPEARANCE. It is the most common ranking function in dplyr.

By Selva Prabhakaran · Published May 12, 2026 · Last updated May 12, 2026

⚡ Quick Answer

row_number()                            # 1, 2, 3, ... over current group
row_number(x)                           # rank by x (ties: first appearance)
row_number(desc(x))                     # rank by x descending
df |> mutate(rn = row_number())
df |> group_by(g) |> mutate(rn = row_number())
df |> filter(row_number() <= 3)         # first 3 rows (per group if grouped; sort first)

Need explanation? Read on for examples and pitfalls.


What row_number() does in one sentence

row_number() (no arg) returns 1, 2, 3, ... in row order; row_number(x) returns the rank of each element of x with ties broken by FIRST APPEARANCE. Inside group_by(), numbering restarts in each group.

Use it when you want strictly increasing integers with no tied ranks.

Syntax

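row_number(x). The argument x is optional. With no argument, it returns 1..n for the rows of the current group (this form only works inside dplyr verbs such as mutate() and filter()). With an argument, it ranks by that vector, and missing values get rank NA.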
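As a minimal sketch of the two call forms, the example below uses a small made-up data frame (`scores` and its column names are illustrative, not from the original article) that includes one missing value.

```r
library(dplyr)

# Hypothetical data: one missing score to show how NA is ranked
scores <- data.frame(
  name  = c("a", "b", "c", "d"),
  score = c(30, NA, 10, 20)
)

numbered <- scores |>
  mutate(
    position      = row_number(),      # no argument: 1..n in current row order
    rank_by_score = row_number(score)  # with argument: rank by score, NA stays NA
  )

numbered
# position is 1, 2, 3, 4; rank_by_score is 3, NA, 1, 2
```

If missing values should sort first or last instead of receiving NA, one option is to replace them before ranking, for example with coalesce(score, Inf).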


R: Sequential row numbers

library(dplyr)

mtcars |>
  mutate(rn = row_number()) |>
  select(mpg, rn) |>
  head(3)
#>                mpg rn
#> Mazda RX4     21.0  1
#> Mazda RX4 Wag 21.0  2
#> Datsun 710    22.8  3

Tip

Use row_number() (no arg) inside mutate to add a sequential id column. Combined with arrange(), it gives you a stable position index after sorting.

Five common patterns

1. Add a sequential index

R: Number every row 1..n

mtcars |> mutate(id = row_number()) |> head(3)

Sequential id, useful for joining or referencing rows.

2. Rank by a column

R: Rank by score ascending

df <- data.frame(name = c("a", "b", "c", "d"), score = c(10, 20, 20, 5))

df |> mutate(rank = row_number(score))
#>   name score rank
#> 1    a    10    2
#> 2    b    20    3
#> 3    c    20    4  <-- tie broken by first appearance
#> 4    d     5    1

Ties are broken by row order, NOT shared rank.

3. Rank descending

R: Top performers first

df |> mutate(rank_desc = row_number(desc(score)))
#>   name score rank_desc
#> 1    a    10         3
#> 2    b    20         1
#> 3    c    20         2
#> 4    d     5         4

4. Per-group row numbers

R: 1..k per group

df_g <- data.frame(
  user = c("a", "a", "a", "b", "b"),
  ts = 1:5
)

df_g |>
  group_by(user) |>
  mutate(visit_num = row_number())
#> # A tibble: 5 x 3
#>   user     ts visit_num
#>   a         1         1
#>   a         2         2
#>   a         3         3
#>   b         4         1
#>   b         5         2

User a's visits are numbered 1-3; user b's restart at 1.

5. Top n per group (filter)

R: Top 2 highest mpg per cyl

mtcars |>
  group_by(cyl) |>
  arrange(desc(mpg)) |>
  filter(row_number() <= 2) |>
  ungroup()

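arrange() ignores grouping by default (pass .by_group = TRUE to sort within groups), so this sorts the whole table by mpg descending. Each group's rows still end up in descending mpg order, and filter() then keeps the rows where the per-group row_number() is 1 or 2.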

Key Insight

row_number() always produces UNIQUE integer ranks. No ties. Two rows with the same value get adjacent ranks (e.g., 3 and 4). For shared ranks on ties, use min_rank() (1, 2, 2, 4) or dense_rank() (1, 2, 2, 3) instead.

row_number() vs min_rank() vs dense_rank() vs rank()

Four ranking functions in R, with different tie-handling.

Function	Ties	Output for c(10, 20, 20, 5)
`row_number(x)`	Broken by row order	2, 3, 4, 1
`min_rank(x)`	Tied values share min rank	2, 3, 3, 1
`dense_rank(x)`	Tied values share rank, no gaps	2, 3, 3, 1 (same as min_rank here, because no rank follows the tie)
`base::rank(x)`	Tied values get average	2, 3.5, 3.5, 1

When to use which:

row_number for unique sequential IDs.
min_rank for "leaderboard with ties".
dense_rank to avoid gaps after ties.
rank if you need average-tie behavior (rare in dplyr).
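The table's example vector has its tie at the top, so min_rank() and dense_rank() happen to agree. As an illustrative sketch, the vector below (chosen for this example, not taken from the article) puts the tie in the middle so that all four functions give different results.

```r
library(dplyr)

# Illustrative vector: the tie (20, 20) is followed by a larger value
x <- c(10, 20, 20, 30)

ranks <- data.frame(
  x          = x,
  row_number = row_number(x),  # 1, 2, 3, 4      ties broken by position
  min_rank   = min_rank(x),    # 1, 2, 2, 4      gap after the tie
  dense_rank = dense_rank(x),  # 1, 2, 2, 3      no gap
  base_rank  = rank(x)         # 1, 2.5, 2.5, 4  average of tied positions
)

ranks
```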

A practical workflow

The "top n per group" pattern is row_number's signature use case.

R: Top 5 per category

df |>
  group_by(category) |>
  arrange(desc(score)) |>
  filter(row_number() <= 5) |>
  ungroup()

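Top 5 by score in each category. In dplyr 1.1+, df |> slice_max(score, n = 5, by = category, with_ties = FALSE) is a cleaner equivalent. Without with_ties = FALSE, slice_max() keeps every row tied at the cutoff and can return more than 5 rows per group.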
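To make the comparison with slice_max() concrete, here is a small sketch on mtcars (assuming dplyr 1.1.0 or later for the by argument). Among the 6-cylinder cars, two are tied at 21.0 mpg for second place, so the handling of ties changes the row count.

```r
library(dplyr)

# filter + row_number(): exactly 2 rows per cyl, ties cut by row order
top2_filter <- mtcars |>
  group_by(cyl) |>
  arrange(desc(mpg)) |>
  filter(row_number() <= 2) |>
  ungroup()

# slice_max() keeps ties by default, so a group can exceed n rows
top2_ties <- mtcars |> slice_max(mpg, n = 2, by = cyl)

# with_ties = FALSE caps each group at n rows, like the filter version
top2_exact <- mtcars |> slice_max(mpg, n = 2, by = cyl, with_ties = FALSE)

nrow(top2_filter)  # 6
nrow(top2_ties)    # 7: the tie at 21.0 mpg adds a row for cyl 6
nrow(top2_exact)   # 6
```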

For "first occurrence per group":

R: First chronological row per user

df |>
  group_by(user) |>
  arrange(timestamp) |>
  filter(row_number() == 1) |>
  ungroup()

Same as df |> slice_min(timestamp, n = 1, by = user, with_ties = FALSE). By default, slice_min() keeps all rows tied for the earliest timestamp.

Common pitfalls

Pitfall 1: forgetting arrange. row_number() is positional. Without arrange(), the numbers reflect whatever order rows happen to be in.

Pitfall 2: per-group reset can surprise. On a grouped tibble, row_number restarts at each group boundary. If you wanted a global row index, ungroup first or use mutate(id = 1:n()) outside group_by.
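Both pitfalls can be seen in one short sketch. The `visits` data frame below is made up for illustration: it starts unsorted, gets a global index after sorting, and then a per-group index that restarts for each user.

```r
library(dplyr)

# Hypothetical, deliberately unsorted visit log
visits <- data.frame(
  user = c("b", "a", "a", "b", "a"),
  ts   = c(5, 3, 1, 4, 2)
)

indexed <- visits |>
  arrange(ts) |>                        # fix the row order first
  mutate(global_id = row_number()) |>   # ungrouped: 1..n over the whole table
  group_by(user) |>
  mutate(visit_num = row_number()) |>   # grouped: restarts at 1 for each user
  ungroup()

indexed
# global_id is 1, 2, 3, 4, 5; visit_num is 1, 2, 3, 1, 2
```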

Warning

row_number(x) and row_number() differ semantically. With no arg, it numbers rows 1..n in current order. With an arg, it RANKS by that column. Easy to confuse.

Try it yourself

Try it: Rank mtcars cars by hp descending and keep only the top 3 PER cyl group. Save to ex_top3.

R: Your turn: top 3 hp per cyl

ex_top3 <- mtcars |>
  # your code here

ex_top3
#> Expected: 9 rows (3 per cyl group)


R: Solution

Explanation: Sort by hp desc within each cyl, keep rows where row_number <= 3. slice_max is the cleaner alternative in dplyr 1.1+.
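The solution code itself did not survive on this page, so the block below is one way to write what the explanation describes, following the same pattern as the earlier top-n examples rather than reproducing the author's exact code.

```r
library(dplyr)

# Sketch of the exercise solution: top 3 cars by hp within each cyl group
ex_top3 <- mtcars |>
  group_by(cyl) |>
  arrange(desc(hp)) |>
  filter(row_number() <= 3) |>
  ungroup()

nrow(ex_top3)  # 9
```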

Related dplyr functions

After mastering row_number, look at:

min_rank(): ties share min rank
dense_rank(): ties share rank, no gaps
percent_rank() / cume_dist(): relative-position rankings
ntile(): split rows into n bins
slice_max() / slice_min(): top/bottom n by column (cleaner than filter + row_number)
cur_group_rows(): row indices, in the full data frame, of the current group's rows

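For modern dplyr code, prefer slice_max(col, n = n) over arrange(desc(col)) |> filter(row_number() <= n). Note that n must be passed by name, and that slice_max() keeps ties unless you set with_ties = FALSE.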
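For a quick feel of the relative-position functions in the list, the sketch below applies them to a small illustrative vector (the values are chosen for this example only).

```r
library(dplyr)

# Illustrative vector with one tie
x <- c(10, 20, 20, 30)

relative <- data.frame(
  x            = x,
  percent_rank = percent_rank(x),  # 0, 1/3, 1/3, 1   (min_rank - 1) / (n - 1)
  cume_dist    = cume_dist(x),     # 0.25, 0.75, 0.75, 1   share of values <= x
  ntile        = ntile(x, 2)       # 1, 1, 2, 2   two roughly equal bins
)

relative
```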

FAQ

What does row_number do in dplyr?

row_number() returns 1, 2, 3, ... sequential integers. row_number(x) ranks by x with ties broken by first appearance.

What is the difference between row_number, min_rank, and dense_rank?

row_number always produces unique ranks (ties broken by row order). min_rank gives ties the same rank but leaves gaps (1, 2, 2, 4). dense_rank gives ties the same rank with no gaps (1, 2, 2, 3).

How do I get top n rows per group with row_number?

df |> group_by(g) |> arrange(desc(col)) |> filter(row_number() <= n). In dplyr 1.1+, df |> slice_max(col, n = n, by = g, with_ties = FALSE) is cleaner.

Why does row_number reset on grouped tibbles?

Because dplyr applies it per group. To get a global row index, call ungroup() first or use mutate(id = 1:n()) outside the grouping.

What is the difference between row_number() and 1:n()?

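1:n() is also valid inside dplyr verbs and gives the same result whenever the data has at least one row. On zero-row data, 1:n() evaluates to 1:0, which is c(1, 0), and the verb fails; row_number() and seq_len(n()) both return an empty vector. row_number() is a window function with full dplyr support, so it is the safer default.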
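The zero-row edge case is easy to check. The sketch below uses an empty data frame (the name `empty` is illustrative) and wraps the 1:n() call in tryCatch() so the script keeps running when it fails.

```r
library(dplyr)

# Illustrative zero-row data frame
empty <- data.frame(x = numeric(0))

ok_row_number <- empty |> mutate(id = row_number())   # works, 0 rows
ok_seq_len    <- empty |> mutate(id = seq_len(n()))   # works, 0 rows

# 1:n() becomes 1:0, i.e. c(1, 0): length 2 cannot fit a 0-row table
result_colon <- tryCatch(
  empty |> mutate(id = 1:n()),
  error = function(e) "error"
)

nrow(ok_row_number)  # 0
nrow(ok_seq_len)     # 0
result_colon         # "error"
```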
