tidyr nest() in R: Bundle Rows Into List Columns
The nest() function in tidyr collapses rows into list-columns where each cell holds a nested tibble. It is the foundation of "many-models" workflows and tidy hierarchical data.
    df |> nest(.by = group)              # one row per group, data column
    df |> group_by(g) |> nest()          # legacy (works but .by preferred)
    df |> nest(data = c(col1, col2))     # specific columns nested
    df |> unnest(data)                   # opposite: flatten back
    mtcars |> nest(.by = cyl) |> mutate(model = map(data, ~ lm(mpg ~ wt, .x)))
Need explanation? Read on for examples and pitfalls.
What nest() does in one sentence
nest(data, .by = group) returns one row per group, with a list-column whose cells contain the rows for that group as nested tibbles. It is the foundation of list-column workflows.
Syntax
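nest(.data, ..., .by = NULL, .key = NULL, .names_sep = NULL). Use .by to choose the grouping columns and ... (as name = columns pairs) to choose which columns are nested and what the resulting list-column is called. .key names the list-column when ... is not supplied (the default is "data"), and .names_sep strips the outer name from the inner column names.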
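To make the arguments concrete, here is a minimal sketch on a made-up tibble. The `flowers` data and its column names are invented for illustration and are not part of the original article.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data; names are assumptions for this sketch.
flowers <- tibble(
  species      = c("a", "a", "b"),
  petal_length = c(1.4, 1.3, 4.7),
  petal_width  = c(0.2, 0.2, 1.4)
)

# `...` takes name = tidy-select pairs: the name becomes the list-column,
# the selection says which columns go inside it. Remaining columns are keys.
by_dots <- flowers |> nest(petal = c(petal_length, petal_width))

# `.names_sep` strips the outer name plus separator from the inner names,
# so petal_length / petal_width become length / width inside each tibble.
stripped <- flowers |>
  nest(petal = c(petal_length, petal_width), .names_sep = "_")

# `.key` renames the list-column produced when only `.by` is given.
keyed <- flowers |> nest(.by = species, .key = "rows")

names(by_dots)
names(stripped$petal[[1]])
names(keyed)
```

All three calls return one row per species; only the name and contents of the list-column differ.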
nest() with purrr::map() for the "many models" pattern. Each cell holds a tibble; map applies a function (like lm) to each.
Five common patterns
1. Standard nest by group
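A minimal sketch of the basic call, using the built-in mtcars data as elsewhere in this article:

```r
library(tidyr)

# One row per distinct cyl value; everything else moves into `data`.
by_cyl <- mtcars |> nest(.by = cyl)

by_cyl            # columns: cyl, data (a list-column of tibbles)
by_cyl$data[[1]]  # the rows for the first cyl value, without the cyl column
```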
2. Many models pattern
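One way this pattern might look, fitting the same mpg ~ wt regression used at the top of the article to each cyl group. The `slope` column is an illustrative extra, not something the original prescribes.

```r
library(tidyr)
library(dplyr)
library(purrr)

models <- mtcars |>
  nest(.by = cyl) |>
  mutate(
    # Fit one linear model per nested tibble.
    model = map(data, ~ lm(mpg ~ wt, data = .x)),
    # Pull a single number back out of each model.
    slope = map_dbl(model, ~ coef(.x)[["wt"]])
  )

models
```

The fitted models live in a list-column next to the data they were fitted on, so they stay aligned with their group key.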
3. Per-group statistics
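A short sketch of per-group statistics computed from the nested tibbles. The choice of statistics here is arbitrary.

```r
library(tidyr)
library(dplyr)
library(purrr)

stats <- mtcars |>
  nest(.by = cyl) |>
  mutate(
    n        = map_int(data, nrow),            # rows in each group
    mean_mpg = map_dbl(data, ~ mean(.x$mpg)),  # mean mpg per group
    max_hp   = map_dbl(data, ~ max(.x$hp))     # max horsepower per group
  )

stats
```

For plain summaries like these, group_by() plus summarise() is usually simpler; the nested form pays off when you also want to keep the per-group data around for later steps.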
4. Nested with multiple grouping columns
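With more than one grouping column, `.by` accepts a tidy selection. A minimal sketch:

```r
library(tidyr)

# One row per observed (cyl, gear) combination.
by_cyl_gear <- mtcars |> nest(.by = c(cyl, gear))

by_cyl_gear
```

Only combinations that actually occur in the data get a row; missing combinations are not filled in.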
5. Custom column nest
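Instead of naming the grouping columns, you can name the columns to nest. The sketch below first trims mtcars to four columns so the leftover key is just cyl; the list-column names `efficiency` and `power` are invented for illustration.

```r
library(tidyr)
library(dplyr)

custom <- mtcars |>
  select(cyl, mpg, wt, hp) |>
  # Two named list-columns; the unselected column (cyl) becomes the key.
  nest(efficiency = c(mpg, wt), power = hp)

custom
```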
nest() enables the "many models" workflow: one row per group, model fitted per row, results extracted via map. This is the canonical tidyverse approach to per-group statistical analysis.
nest() vs group_by() vs nest_join()
| Function | Output | Best for |
|---|---|---|
| nest(.by = g) | One row per group, list column | Many-models pattern |
| group_by(g) | Marker on data frame | Aggregation |
| nest_join() | Each x row + nested matches from y | Hierarchical join |
When to use which:
- nest for splitting data into per-group tibbles for further per-group operations.
- group_by for aggregation.
- nest_join for join-style hierarchical merging.
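The three functions can be compared side by side. In this sketch `cyl_labels` is a hypothetical lookup table made up for the example, and the list-column name is set explicitly through `name`.

```r
library(tidyr)
library(dplyr)
library(tibble)

# nest(): the shape changes to one row per group.
nested <- mtcars |> nest(.by = cyl)

# group_by(): same 32 rows, plus grouping metadata used by summarise().
grouped <- mtcars |> group_by(cyl)
means   <- grouped |> summarise(mean_mpg = mean(mpg))

# nest_join(): keep every row of x and attach the matching rows of y.
cyl_labels <- tibble(cyl = c(4, 6, 8), label = c("small", "medium", "large"))
joined <- nest_join(cyl_labels, mtcars, by = "cyl", name = "cars")

nrow(nested)
nrow(grouped)
joined
```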
A practical workflow
The "fit, summarise, unnest" pattern for many-models analysis.
For each cyl group: fit a model, extract a one-row summary with broom::glance(), and unnest that summary into ordinary columns. This is a standard tidyverse/broom workflow.
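A minimal sketch of that workflow, assuming the broom package is installed and reusing the mpg ~ wt model from the top of the article:

```r
library(tidyr)
library(dplyr)
library(purrr)
library(broom)

fit_summary <- mtcars |>
  nest(.by = cyl) |>
  mutate(
    model   = map(data, ~ lm(mpg ~ wt, data = .x)),  # fit
    summary = map(model, broom::glance)              # summarise: one-row tibble per model
  ) |>
  select(cyl, summary) |>
  unnest(summary)                                    # unnest into regular columns

fit_summary
```

The result is a flat tibble with one row per cyl and columns such as r.squared and p.value, ready for filtering or plotting.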
Common pitfalls
Pitfall 1: forgetting to use map. After nest, the data column is a LIST. To compute on it, use map(data, fn).
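A small sketch of what goes wrong. Inside mutate(), `data` refers to the whole list-column, so a function that is not mapped sees the list rather than each tibble:

```r
library(tidyr)
library(dplyr)
library(purrr)

nested <- mtcars |> nest(.by = cyl)

# Wrong: length() is applied to the list itself, giving the number of groups,
# which is then recycled to every row.
wrong <- nested |> mutate(n = length(data))

# Right: map_int() applies nrow() to each nested tibble in turn.
right <- nested |> mutate(n = map_int(data, nrow))

wrong$n
right$n
```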
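Pitfall 2: mixing nest with group_by. Modern tidyr (1.3.0+) supports nest(.by = g) directly; older code does group_by(g) |> nest(). Both work, but .by is cleaner and returns an ungrouped tibble, whereas the group_by() form keeps the grouping on the result.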
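The two spellings produce the same rows, but the results differ in one detail that can surprise downstream code. A quick check:

```r
library(tidyr)
library(dplyr)

legacy <- mtcars |> group_by(cyl) |> nest()  # result is still grouped by cyl
modern <- mtcars |> nest(.by = cyl)          # result is ungrouped

is_grouped_df(legacy)
is_grouped_df(modern)

# If you keep the legacy form, drop the grouping explicitly before moving on.
legacy_flat <- legacy |> ungroup()
```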
nest() returns a TIBBLE WITH A LIST COLUMN, not a list of tibbles. This is important for downstream code: it can still be filtered, mutated, joined like a normal data frame.
Try it yourself
Try it: Nest mtcars by gear and compute the row count per gear group. Save to ex_nested.
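One possible solution, written to match the explanation that follows (the column name `n` is an arbitrary choice):

```r
library(tidyr)
library(dplyr)
library(purrr)

ex_nested <- mtcars |>
  nest(.by = gear) |>
  mutate(n = map_int(data, nrow))  # row count of each nested tibble

ex_nested
```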
Explanation: nest creates one row per gear; map_int(data, nrow) counts rows in each.
Related tidyr / purrr functions
After mastering nest, look at:
- unnest(): opposite (flatten list column)
- unnest_longer() / unnest_wider(): targeted unnesting
- purrr::map(): per-cell computation on list columns
- nest_join(): nested-join (different)
- broom::glance() / broom::tidy(): model output extraction
- tidymodels: framework using nest extensively
FAQ
What does nest do in tidyr?
nest(data, .by = g) returns one row per group, with a list-column whose cells contain the rows for that group as nested tibbles.
What is the difference between nest and group_by?
group_by attaches a grouping marker; data shape is unchanged. nest CHANGES THE SHAPE: one row per group, with the original rows as nested tibbles.
How do I do many models with nest?
df |> nest(.by = g) |> mutate(model = map(data, ~ lm(y ~ x, data = .x))). Each row then holds the fitted model for its group.
How do I flatten a nested data frame?
Use unnest(data) (or unnest_longer / unnest_wider). Reverses nest.
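A short round-trip sketch on mtcars:

```r
library(tidyr)

nested <- mtcars |> nest(.by = cyl)
flat   <- nested |> unnest(data)

dim(flat)    # same number of rows and columns as mtcars
names(flat)  # cyl now comes first, followed by the other columns
```

The values are recovered, but the rows come back ordered by group and the key column moves to the front, so the result is not byte-for-byte identical to the input.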
What is the .by argument in nest?
.by = column is the modern way to specify grouping in nest() (tidyr 1.3.0+, mirroring the .by argument introduced in dplyr 1.1). It replaces the older group_by(col) |> nest() pattern.