dplyr ungroup() in R: Remove Grouping Before Next Step
The ungroup() function in dplyr removes the group structure that group_by() attached, so subsequent verbs operate on the whole data frame again. It is the explicit cleanup step for any grouped pipeline whose result feeds code that should not be group-aware.
    df |> group_by(g) |> mutate(rank = rank(x)) |> ungroup()
    df |> group_by(g) |> summarise(n = n())        # auto-ungroups one level
    df |> group_by(g) |> ungroup() |> nrow()       # drop grouping, count all
    df |> group_by(g1, g2) |> ungroup(g2)          # remove ONE grouping var
    mutate(df, .by = g, rank = rank(x))            # alternative: per-call grouping, no ungroup needed
Need explanation? Read on for examples and pitfalls.
What ungroup() does in one sentence
ungroup(x, ...) strips the grouping metadata from a grouped_df and returns a regular tibble, so the next verb treats all rows as a single group. With no extra arguments it removes ALL grouping; with column names it removes only those variables from the grouping.
This is how dplyr separates "compute per-group" from "compute across the whole table". Forgetting ungroup() is the single most common source of "why is mutate giving me odd numbers?" bugs.
Syntax
ungroup(x, ...). With ... empty it removes all grouping; with column names it removes those specific levels.
summarise() peels off ONE grouping level automatically; mutate() and filter() do NOT. After group_by(g1, g2) |> summarise(...) the result is still grouped by g1. Always ungroup() if downstream code shouldn't care about grouping.
Five common patterns
1. Standard cleanup at the end of a grouped pipeline
This is the canonical pattern. Without ungroup(), downstream mutate()s would still operate per-group.
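A minimal sketch of this pattern, assuming the built-in mtcars data set stands in for your own data (the column names mpg_rank and mpg_centered are made up for illustration):

```r
library(dplyr)

# Assumption: mtcars is only a stand-in data set.
ranked <- mtcars |>
  group_by(cyl) |>
  mutate(mpg_rank = rank(-mpg)) |>   # rank within each cyl group
  ungroup()                          # drop grouping before later steps

is_grouped_df(ranked)   # FALSE

# The grouping is gone, so this mean is taken over all rows.
ranked <- ranked |> mutate(mpg_centered = mpg - mean(mpg))
```

Had the ungroup() been left out, mpg_centered would have been centered on each cylinder group's mean instead of the overall mean.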
2. Drop only ONE grouping variable
ungroup(gear) removes that one column from the group structure.
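As a sketch, again assuming mtcars as the example data, group_vars() can confirm which grouping variables remain after a partial ungroup:

```r
library(dplyr)

by_two <- mtcars |> group_by(cyl, gear)
group_vars(by_two)                  # "cyl" "gear"

by_one <- by_two |> ungroup(gear)   # remove only gear from the grouping
group_vars(by_one)                  # "cyl"
```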
3. After summarise() with multiple groups
.groups = "drop" is equivalent to chaining ungroup() after summarise.
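The sketch below, assuming mtcars as the example data, contrasts the default summarise() behavior with the two ways of dropping all grouping. The variable names are illustrative only.

```r
library(dplyr)

# Default: summarise() drops only the last grouping variable (gear)
# and prints a message about the grouping that remains.
still_grouped <- mtcars |>
  group_by(cyl, gear) |>
  summarise(mean_mpg = mean(mpg))
group_vars(still_grouped)   # "cyl"

# Option A: drop all grouping inside summarise().
dropped <- mtcars |>
  group_by(cyl, gear) |>
  summarise(mean_mpg = mean(mpg), .groups = "drop")

# Option B: keep the default peel-off, then ungroup() explicitly.
chained <- mtcars |>
  group_by(cyl, gear) |>
  summarise(mean_mpg = mean(mpg), .groups = "drop_last") |>
  ungroup()

is_grouped_df(dropped)   # FALSE
is_grouped_df(chained)   # FALSE
```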
4. Replace ungroup with .by (dplyr 1.1+)
.by scopes grouping to ONE verb only. No grouping leaks into the next step. For new code, this often replaces group_by() |> mutate() |> ungroup().
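A small sketch of the replacement, assuming mtcars as the example data and dplyr 1.1.0 or later. The two pipelines compute the same column; only the computed column is compared here because the group_by() route returns a tibble while .by keeps the input's class.

```r
library(dplyr)  # .by requires dplyr >= 1.1.0

# Classic bookends: group, compute, ungroup.
with_bookends <- mtcars |>
  group_by(cyl) |>
  mutate(mean_mpg = mean(mpg)) |>
  ungroup()

# Per-verb grouping: no grouped data frame is ever created.
with_by <- mtcars |>
  mutate(mean_mpg = mean(mpg), .by = cyl)

is_grouped_df(with_by)                                # FALSE
all.equal(with_bookends$mean_mpg, with_by$mean_mpg)   # TRUE
```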
5. Ungroup before joining
Joining a grouped left side can carry grouping into the result, which is rarely what you want.
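To illustrate, here is a sketch that joins a grouped mtcars to a small lookup table. The cyl_labels table is hypothetical and made up for this example.

```r
library(dplyr)

# Hypothetical lookup table, invented for illustration.
cyl_labels <- tibble(cyl = c(4, 6, 8), label = c("four", "six", "eight"))

grouped_left <- mtcars |> group_by(cyl)

# The join result inherits the grouping of the left-hand table.
joined_grouped <- grouped_left |> left_join(cyl_labels, by = "cyl")
is_grouped_df(joined_grouped)   # TRUE

# Ungrouping first yields a plain tibble after the join.
joined_plain <- grouped_left |> ungroup() |> left_join(cyl_labels, by = "cyl")
is_grouped_df(joined_plain)     # FALSE
```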
The .by argument (dplyr >= 1.1.0) is the modern way to scope grouping per-verb without group_by()/ungroup() bookends. For one-step grouped computation, mutate(df, .by = g, x = ...) is cleaner. Reserve group_by() for multi-step grouped pipelines where every verb shares the same grouping.
ungroup() vs .by vs summarise(.groups=)
Three ways to control grouping scope in dplyr.
| Approach | Scope | Best for |
|---|---|---|
| group_by() + ungroup() | Pipeline-wide, explicit cleanup | Multi-step grouped flows |
| .by argument | Single verb only | One-step grouped compute |
| summarise(.groups = "drop") | Auto-drop after summarise | Aggregations ending in summarise |
When to use which:
- Use .by for short pipelines where grouping applies to one verb.
- Use group_by() + ungroup() for long flows where the same groups apply to mutate, filter, AND summarise.
- Use .groups = "drop" inside summarise for aggregation-only flows.
Common pitfalls
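Pitfall 1: forgetting to ungroup before downstream code. A grouped tibble prints almost like a plain one (only a "# Groups:" header line gives it away), but mutate(), filter(), summarise(), and helpers such as n() now work per group. Base functions ignore the grouping: nrow() still counts all rows and head(grouped_df, 3) returns 3 rows total, whereas slice_head(n = 3) returns 3 rows PER GROUP.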
Pitfall 2: n() returns per-group counts. Inside mutate() on a grouped df, n() is the group size, not the total number of rows. For the total, call ungroup() first (n() then counts every row) or use nrow() on the data frame.
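A quick sketch of the n() behavior, assuming mtcars as the example data (the column names size and total are illustrative):

```r
library(dplyr)

# On a grouped data frame, n() is the size of each row's group.
grouped <- mtcars |> group_by(cyl) |> mutate(size = n())
sort(unique(grouped$size))   # 7 11 14

# After ungroup(), n() counts every row.
flat <- grouped |> ungroup() |> mutate(total = n())
unique(flat$total)           # 32
```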
slice_head(n = 3) operates PER GROUP on a grouped data frame. A grouped df with 4 groups returns up to 12 rows from slice_head(n = 3), not 3. Ungroup first if you want 3 total.
Try it yourself
Try it: Compute the within-cylinder mean MPG, then return an UNGROUPED data frame sorted by relative MPG. Save to ex_result.
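Explanation: Group by cyl, compute relative MPG within each group, then ungroup so the returned data frame carries no grouping into later steps. arrange() sorts the whole frame either way, because it ignores groups unless .by_group = TRUE is set.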
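One possible solution, sketched under two assumptions: the data set is mtcars, and "relative MPG" means each car's mpg divided by the mean mpg of its cylinder group. The column names cyl_mean_mpg and rel_mpg are made up for illustration.

```r
library(dplyr)

# Assumptions: mtcars is the data set; relative MPG = mpg / group mean.
ex_result <- mtcars |>
  group_by(cyl) |>
  mutate(cyl_mean_mpg = mean(mpg),          # within-cylinder mean
         rel_mpg = mpg / cyl_mean_mpg) |>   # each car relative to its group
  ungroup() |>                              # return an ungrouped tibble
  arrange(desc(rel_mpg))                    # highest relative MPG first

is_grouped_df(ex_result)   # FALSE
```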
Related dplyr functions
After mastering ungroup, look at:
- group_by(): attach group structure
- groups(): inspect current grouping
- group_split(): split a grouped df into a list of data frames
- rowwise(): group with one row per group (for non-vectorized ops)
- summarise(.groups = ...): control post-summarise grouping
- .by arg in mutate / filter / summarise: per-verb grouping
For modern dplyr code (>=1.1), prefer .by over group_by()/ungroup() when grouping applies to a single verb.
FAQ
What does ungroup do in dplyr?
ungroup() removes the grouping structure attached by group_by(). After ungroup(), subsequent verbs (mutate, filter, summarise) operate on the entire data frame instead of per-group.
Do I always need ungroup() after group_by()?
Not always. summarise() peels off one grouping level automatically. .by scopes grouping to one verb only. But for pipelines mixing group_by() with mutate() and downstream code, ungroup is the safe default.
What is the difference between ungroup() and .by in dplyr?
.by = column is a per-call grouping argument introduced in dplyr 1.1. It applies grouping to ONE verb only, no leftover state. ungroup() removes grouping that was attached by group_by(). For new code, .by is often cleaner.
How do I check if a data frame is grouped?
Use is_grouped_df(df) or groups(df). The former returns TRUE/FALSE; the latter shows the grouping columns.
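For example, assuming mtcars as the data, the checks look like this. group_vars() is an additional dplyr helper, not mentioned above, that returns the grouping column names as a character vector.

```r
library(dplyr)

by_cyl <- mtcars |> group_by(cyl)

is_grouped_df(by_cyl)            # TRUE
is_grouped_df(ungroup(by_cyl))   # FALSE
groups(by_cyl)                   # a list of symbols, one per grouping column
group_vars(by_cyl)               # "cyl"
```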
Why is my mutate producing weird results after group_by?
Because mutate is computing PER GROUP. n(), mean(), rank(), etc., are group-scoped. To compute across the entire data frame, ungroup first.