dplyr slice_sample() in R: Random Rows From a Tibble

The slice_sample() function in dplyr returns a random sample of n rows (or a fraction) from a data frame, optionally per group, with or without replacement. It supersedes the older sample_n() and sample_frac().

By Selva Prabhakaran · Published May 12, 2026 · Last updated May 12, 2026

⚡ Quick Answer

slice_sample(df, n = 5)                       # 5 random rows
slice_sample(df, prop = 0.1)                  # 10% of rows
slice_sample(df, n = 3, by = cyl)             # 3 per group (stratified)
slice_sample(df, n = 5, replace = TRUE)       # bootstrap
slice_sample(df, n = 5, weight_by = w)        # weighted sample
df |> group_by(g) |> slice_sample(n = 3)      # grouped form (result stays grouped)
set.seed(42); slice_sample(df, n = 5)         # reproducible

Need explanation? Read on for examples and pitfalls.

What slice_sample() does in one sentence

slice_sample(.data, n = 5) returns a random sample of 5 rows; slice_sample(.data, prop = 0.1) returns 10% of the rows. Both n and prop must be passed by name. On a grouped tibble (or with by = g), the sampling happens within each group.

This is the modern, group-aware random sampler. For new code, prefer it over sample_n() and sample_frac(), which are superseded: they still work, but they no longer receive new features.

Syntax

slice_sample(.data, ..., n, prop, by = NULL, weight_by = NULL, replace = FALSE). Pass n OR prop, not both, and always by name.
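A minimal sketch of the n-or-prop rule, assuming dplyr 1.0.0 or later is installed. The tryCatch() wrapper is only there so the script keeps running after the invalid call; the exact wording of the error message may differ between dplyr versions.

```r
library(dplyr)

set.seed(42)

# Exactly one of n or prop: both calls below are valid.
by_count <- slice_sample(mtcars, n = 4)
by_share <- slice_sample(mtcars, prop = 0.25)  # 25% of 32 rows

nrow(by_count)
nrow(by_share)

# Supplying both is rejected; capture the error instead of stopping the script.
both <- tryCatch(
  slice_sample(mtcars, n = 4, prop = 0.25),
  error = function(e) e
)
inherits(both, "error")
```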

R: 5 random cars

library(dplyr)
set.seed(42)
mtcars |> slice_sample(n = 5)
#>             mpg cyl ...
#> Datsun 710 22.8   4
#> Camaro Z28 13.3   8
#> ... (5 random rows; depends on seed)

Tip

Always call set.seed() before sampling for reproducibility. Random samples without a seed differ across runs, which makes debugging and report regeneration harder.

Five common patterns

1. Random n rows

R: Pick 10 at random

set.seed(1)
mtcars |> slice_sample(n = 10)

2. Random fraction

R: 20% sample

mtcars |> slice_sample(prop = 0.2)

20% of 32 is 6.4, which is rounded down to 6 rows.
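The rounding can be checked directly. This is a small sketch, assuming dplyr is installed; the variable name frac is just for illustration.

```r
library(dplyr)

set.seed(1)
frac <- mtcars |> slice_sample(prop = 0.2)

# 0.2 * 32 = 6.4, and the fractional part is dropped.
nrow(frac)
floor(0.2 * nrow(mtcars))
```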

3. Stratified sample (per group)

R: Equal n per group

mtcars |> slice_sample(n = 3, by = cyl)

3 random cars per cylinder group, regardless of group size.
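To confirm the per-group counts, the sample can be tallied with count(). The sketch below assumes dplyr 1.1.0 or later (for the by argument) and also shows what happens with prop instead of n: the fraction is applied, and rounded down, inside each group separately.

```r
library(dplyr)

set.seed(1)

# Fixed count per group: 3 rows from each cyl value.
strat_n <- mtcars |> slice_sample(n = 3, by = cyl)
strat_n |> count(cyl)

# Fixed share per group: 20% of 11, 7, and 14 rows, each rounded down.
strat_prop <- mtcars |> slice_sample(prop = 0.2, by = cyl)
strat_prop |> count(cyl)
```

Because rounding happens per group, the prop version returns 2 + 1 + 2 = 5 rows here, one fewer than a global 20% sample of mtcars.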

4. Bootstrap (with replacement)

R: Resample 32 rows with replacement

set.seed(1)
mtcars |> slice_sample(n = nrow(mtcars), replace = TRUE)

Standard bootstrap: same row count as original, with duplicates allowed.
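One way to turn a single resample into a bootstrap estimate is to repeat it many times and summarise the results. The following is a sketch of a percentile interval for mean mpg, not a full bootstrap workflow; the number of resamples (n_boot) and the choice of statistic are assumptions made for illustration.

```r
library(dplyr)

set.seed(1)
n_boot <- 1000  # number of resamples; an arbitrary choice for this sketch

# Resample all rows with replacement and record the mean mpg each time.
boot_means <- replicate(
  n_boot,
  mtcars |>
    slice_sample(n = nrow(mtcars), replace = TRUE) |>
    pull(mpg) |>
    mean()
)

# Percentile interval: the middle 95% of the bootstrap means.
quantile(boot_means, probs = c(0.025, 0.975))
```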

5. Weighted sample

R: Bias toward higher mpg

set.seed(1)
mtcars |> slice_sample(n = 5, weight_by = mpg)

Rows with higher mpg are more likely to be picked. Useful for importance sampling.
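The effect of weights is easiest to see at the extreme: a row with weight 0 cannot be drawn at all. The sketch below uses a hypothetical weight column w (not part of mtcars) that gives manual cars weight 1 and automatics weight 0.

```r
library(dplyr)

set.seed(1)

# Hypothetical weight column `w`: manual cars get weight 1, automatics weight 0.
cars_w <- mtcars |> mutate(w = ifelse(am == 1, 1, 0))

picked <- cars_w |> slice_sample(n = 5, weight_by = w)

# Every sampled row is a manual car, because zero-weight rows cannot be drawn.
table(picked$am)
```

Weights must be non-negative; they do not need to sum to 1.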

Key Insight

slice_sample() replaces TWO older functions: sample_n() (n random rows) and sample_frac() (fraction). Both have been superseded since dplyr 1.0.0. The new function unifies them, and dplyr 1.1.0 added the by argument for per-group (stratified) sampling.

slice_sample() vs sample_n() vs sample()

R has three common sampling functions, each with a different scope.

Function	Package	Per group	Status
`slice_sample(n)`	dplyr	Yes	Recommended
`sample_n(n)`	dplyr	Yes	Superseded since 1.0
`base::sample(x, size)`	base	No	Vector sampling, not data frames

When to use which:

slice_sample for data frames in dplyr pipelines.
sample for sampling from vectors or generating random indices.
Avoid sample_n in new code.
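For comparison, here is a sketch of the same task done both ways: slice_sample() on the data frame, and base::sample() used to draw row indices. Even with the same seed, the two approaches are not guaranteed to pick the same rows, so the example only compares the shape of the result.

```r
library(dplyr)

set.seed(7)
via_dplyr <- mtcars |> slice_sample(n = 5)

# base::sample() draws row indices; subsetting with them gives a comparable sample.
set.seed(7)
idx <- sample(nrow(mtcars), size = 5)
via_base <- mtcars[idx, ]

nrow(via_dplyr)
nrow(via_base)
```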

A practical workflow

Stratified sampling is one of the most common slice_sample() use cases, alongside bootstrapping. Typical workflows:

Train/test split with balanced classes: slice_sample(prop = 0.8, by = class)
Sample customers per region equally: slice_sample(n = 100, by = region)
Bootstrap CI estimation: loop slice_sample(n = nrow(df), replace = TRUE) 1,000 times

For one-off samples, slice_sample(n = 5) is the quick interactive tool. For production analysis, set the seed and document the sample size.
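The train/test idea from the list above can be sketched as follows. The row_id helper column is hypothetical, added only so the sampled rows can be matched back to the full data; cyl stands in for a class column. The test set is built as the complement of the training set, so the two cannot overlap.

```r
library(dplyr)

set.seed(42)

# Hypothetical helper column `row_id`, added so sampled rows can be matched back.
cars <- mtcars |> mutate(row_id = row_number())

# 80% of each cyl group goes to training (rounded down per group).
train <- cars |> slice_sample(prop = 0.8, by = cyl)

# Everything not sampled becomes the test set, so the two never overlap.
test <- cars |> anti_join(train, by = "row_id")

nrow(train) + nrow(test) == nrow(cars)
```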

Common pitfalls

Pitfall 1: forgetting to set a seed. slice_sample() is non-deterministic. Reports, tests, and docs should call set.seed(42) (or any fixed integer) right before sampling to get reproducible output.

Pitfall 2: per-group surprise. On a grouped tibble, slice_sample(n = 5) returns 5 rows PER GROUP. Often what you want for stratification, sometimes not. Use ungroup() first if you mean a global sample.
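A short sketch of the difference, assuming dplyr is installed. The variable names are illustrative only.

```r
library(dplyr)

set.seed(42)
grouped <- mtcars |> group_by(cyl)

per_group <- grouped |> slice_sample(n = 5)               # 5 rows from each cyl group
overall   <- grouped |> ungroup() |> slice_sample(n = 5)  # 5 rows in total

nrow(per_group)
nrow(overall)
```

With three cyl groups, the grouped call returns 15 rows and the ungrouped call returns 5.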

Warning

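Unlike sample_n(), slice_sample(n = X) does not error when a group has fewer than X rows and replace = FALSE. The result is silently truncated to the group size, so a group with only 2 rows contributes 2 rows for n = 5 and the strata end up unequal. Either set replace = TRUE or filter groups by size first.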
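A quick way to check how your installed dplyr handles a group smaller than n. The toy data frame is made up for this sketch, and the code assumes dplyr 1.1.0 or later (for by and .by), where the documented behaviour is to truncate to the group size rather than raise an error.

```r
library(dplyr)

set.seed(42)

# Toy data (made up for this sketch): group "a" has 2 rows, group "b" has 6.
toy <- data.frame(g = rep(c("a", "b"), times = c(2, 6)), x = 1:8)

# Without replacement, the small group contributes only the rows it has.
truncated <- toy |> slice_sample(n = 5, by = g)
truncated |> count(g)

# With replacement, every group can reach n by repeating rows.
resampled <- toy |> slice_sample(n = 5, by = g, replace = TRUE)
resampled |> count(g)

# Alternative: keep only groups that are large enough before sampling.
big_only <- toy |> filter(n() >= 5, .by = g) |> slice_sample(n = 5, by = g)
big_only |> count(g)
```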

Reproducibility and seeds

Random samples in reports, plots, and tests are far easier to work with when they are reproducible across runs. Always call set.seed(N) immediately before slice_sample() in those cases. Code paths that need DIFFERENT samples should use different seeds (e.g., set.seed(1) for one draw, set.seed(2) for another) so that each draw is reproducible on its own. Separate seeds do not make two samples disjoint, though: for a train/test split, draw the training rows once and take the remaining rows as the test set. R's RNG state is global, so any operation between set.seed() and slice_sample() that consumes randomness will desync the result. Keep them adjacent.
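Both points can be checked with identical(). This is a sketch, assuming dplyr is installed; runif(1) stands in for any intervening code that consumes random numbers.

```r
library(dplyr)

set.seed(42)
first <- mtcars |> slice_sample(n = 5)

set.seed(42)
second <- mtcars |> slice_sample(n = 5)

identical(first, second)  # same seed, same data: same rows

# An intervening random call advances the RNG state before sampling.
set.seed(42)
invisible(runif(1))
third <- mtcars |> slice_sample(n = 5)

identical(first, third)   # usually FALSE, because the RNG state has moved on
```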

Try it yourself

Try it: Take a stratified random sample of 2 rows per cyl group from mtcars. Set seed 42 for reproducibility. Save to ex_strat.

R: Your turn: stratified sample

set.seed(42)
ex_strat <- mtcars |>
  # your code here
ex_strat
#> Expected: 6 rows (2 per cyl group)

R: Solution

set.seed(42)
ex_strat <- mtcars |>
  slice_sample(n = 2, by = cyl)
ex_strat
#> 6 rows total: 2 per cyl group, randomly chosen

Explanation: slice_sample(n = 2, by = cyl) picks 2 random rows per cyl group. Equal-count stratified sample.

Related slice functions

After mastering slice_sample, look at:

slice_head() / slice_tail(): positional first/last n
slice_max() / slice_min(): top/bottom n by value
slice(): specific row indexes
sample_n() / sample_frac(): superseded; avoid in new code
rsample package: train/test splits with class balance, k-fold CV
base::sample(): vector sampling

For machine-learning splits, the rsample package offers initial_split() for (optionally stratified) train/test splits and vfold_cv() for k-fold cross-validation.

FAQ

What is the difference between slice_sample and sample_n?

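sample_n() has been superseded since dplyr 1.0.0: it still works but is no longer developed. slice_sample() is the replacement: clearer name, supports the prop, weight_by, and by arguments, and is group-aware. It also truncates to the group size instead of erroring when n exceeds the number of rows.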

How do I do a reproducible random sample in R?

Call set.seed(42) (or any fixed integer) immediately before slice_sample(). With the same data and the same R version, the same seed returns the same rows.

How do I sample with replacement (bootstrap) in dplyr?

Pass replace = TRUE: slice_sample(df, n = nrow(df), replace = TRUE). The size matches the original; rows can repeat.

How do I do a weighted random sample?

Pass weight_by = column: slice_sample(df, n = 5, weight_by = price) makes rows with higher price more likely to be picked.

How do I do a stratified sample per group?

slice_sample(df, n = 3, by = group_col) returns 3 random rows per group. For a fraction per group, use prop = 0.1 instead of n.
