dplyr Error: must return a single string in group_by — Exact Fix
When an expression inside summarise() returns more than one value per group, or a value of the wrong type, dplyr can throw an error (dplyr 1.0 and later may instead return extra rows per group). The message varies by version, but the root cause is the same: the summary expression doesn't collapse each group to a single scalar value.
The Error
```r
# The error looks like one of these (depends on dplyr version):
# Error: Column `x` must return a single string or a length 1 vector
# Error: `x` must be a single string, not a character vector of length 3
cat("This error means your summarise() function returned multiple values\n")
cat("when dplyr expected exactly one value per group.\n")
```
Cause 1: Summary Function Returns Multiple Values
Using a function that returns a vector instead of a scalar inside summarise():
```r
df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(10, 20, 30, 40, 50, 60)
)

# range() returns 2 values (min and max) per group. dplyr versions before 1.0
# reject this; dplyr 1.0+ returns two rows per group instead of one
# (deprecated since 1.1.0 in favor of reframe()):
# library(dplyr)
# df %>% group_by(group) %>% summarise(r = range(value))

# Fix: use functions that return exactly one value
cat("Single-value functions:\n")
cat("  mean(), median(), sum(), min(), max(), n(), first(), last()\n")
cat("  sd(), var(), IQR(), n_distinct()\n\n")

# If you need the range, compute min and max as two named values.
# In base R, aggregate() stores them in a matrix column named "value".
result <- aggregate(value ~ group, data = df,
                    FUN = function(x) c(min = min(x), max = max(x)))
cat("Min and max per group:\n")
print(result)
```
Fix: Use scalar summary functions (mean, sum, min, max, n). If you need multiple values, create separate columns for each.
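The same fix written with dplyr itself might look like the sketch below. It assumes dplyr is installed; the column names min_value, max_value, and spread are illustrative choices, not anything dplyr requires.

```r
library(dplyr)

df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(10, 20, 30, 40, 50, 60)
)

# Each expression returns one scalar per group, so every group collapses to one row
range_by_group <- df %>%
  group_by(group) %>%
  summarise(
    min_value = min(value),
    max_value = max(value),
    spread    = max(value) - min(value)
  )

print(range_by_group)
```

Splitting the two ends of the range into named columns keeps one row per group and makes each value easy to reference later.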
Cause 2: Pasting Multiple Values Without Collapsing
Trying to combine text values per group without collapse:
```r
df <- data.frame(
  team = c("A", "A", "B", "B", "B"),
  player = c("Alice", "Bob", "Carol", "Dave", "Eve")
)

# paste() without collapse returns one string per player, not one per team.
# dplyr versions before 1.0 reject this; 1.0+ returns one row per player:
# df %>% group_by(team) %>% summarise(players = paste(player))

# Fix: use collapse to combine into a single string
result <- aggregate(player ~ team, data = df,
                    FUN = function(x) paste(x, collapse = ", "))
cat("Players per team:\n")
print(result)
```
Fix: Always use paste(x, collapse = ", ") inside summarise() when combining text, not just paste(x).
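As a sketch of the dplyr form of this fix (assuming dplyr is installed; players_by_team and n_players are illustrative names):

```r
library(dplyr)

df <- data.frame(
  team = c("A", "A", "B", "B", "B"),
  player = c("Alice", "Bob", "Carol", "Dave", "Eve")
)

# collapse = ", " turns the per-group character vector into a single string
players_by_team <- df %>%
  group_by(team) %>%
  summarise(
    players   = paste(player, collapse = ", "),
    n_players = n()
  )

print(players_by_team)
```

Adding n() alongside the collapsed string is a cheap way to confirm that each row really covers the whole group.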
Cause 3: Wrong Function for the Data Type
Using a numeric summary on character data, or vice versa:
```r
df <- data.frame(
  group = c("A", "A", "B", "B"),
  status = c("pass", "fail", "pass", "pass"),
  score = c(85, 60, 90, 78)
)

# mean() on a character column does not raise an error: it returns NA with a
# warning ("argument is not numeric or logical"), so avg is NA for every group
# df %>% group_by(group) %>% summarise(avg = mean(status))

# Fix: use appropriate functions for each type
cat("For numeric columns: mean(), sum(), sd()\n")
cat("For character columns: paste(collapse=','), first(), n_distinct()\n")
cat("For logical columns: sum() (counts TRUE), mean() (proportion TRUE)\n\n")

# Correct approach: average the numeric score column
result <- aggregate(score ~ group, data = df, FUN = mean)
names(result)[2] <- "avg_score"
print(result)
```
Fix: Match your summary function to the column type. Use str(df) to check types before summarizing.
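One way to apply this in dplyr is sketched below, assuming dplyr is installed. The column names avg_score, pass_rate, and statuses are illustrative; the pass rate is derived by comparing the character column to "pass", which yields a logical vector that mean() can average.

```r
library(dplyr)

df <- data.frame(
  group = c("A", "A", "B", "B"),
  status = c("pass", "fail", "pass", "pass"),
  score = c(85, 60, 90, 78)
)

# Inspect column types before choosing summary functions
str(df)

typed_summary <- df %>%
  group_by(group) %>%
  summarise(
    avg_score = mean(score),                             # numeric column
    pass_rate = mean(status == "pass"),                  # logical test: proportion TRUE
    statuses  = paste(unique(status), collapse = ", ")   # character column
  )

print(typed_summary)
```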
Cause 4: Using list() or c() in summarise
Returning a combined vector (or an unintended list) instead of a scalar:
```r
df <- data.frame(
  category = c("X", "X", "Y", "Y"),
  val = c(1, 2, 3, 4)
)

# Neither of these gives one scalar per column:
# summarise(df, result = c(mean(val), sd(val)))  # length 2: error before dplyr 1.0, two rows in 1.0+
# summarise(df, result = list(val))              # no error, but creates a list column, not a number

# Fix: one column per summary statistic
cat("Correct pattern:\n")
cat("  summarise(mean_val = mean(val), sd_val = sd(val))\n\n")

# Base R equivalent, computed per category by hand
result <- data.frame(
  category = c("X", "Y"),
  mean_val = c(mean(df$val[df$category == "X"]),
               mean(df$val[df$category == "Y"])),
  sd_val = c(sd(df$val[df$category == "X"]),
             sd(df$val[df$category == "Y"]))
)
print(result)
```
Fix: Create a separate column for each summary statistic instead of combining them into one column.
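The correct pattern, run through dplyr rather than built by hand, could look like this sketch (assuming dplyr is installed; stats_by_category is an illustrative name):

```r
library(dplyr)

df <- data.frame(
  category = c("X", "X", "Y", "Y"),
  val = c(1, 2, 3, 4)
)

# One named column per statistic, each returning a single value per category
stats_by_category <- df %>%
  group_by(category) %>%
  summarise(
    mean_val = mean(val),
    sd_val   = sd(val)
  )

print(stats_by_category)
```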
Practice Exercise
```r
# Exercise: Write a grouped summary that returns one row per group.
# For each group, compute:
# - avg_score: mean of score
# - players: all names combined into one string
# - score_range: max(score) - min(score) (single number)
df <- data.frame(
  team = c("Red", "Red", "Red", "Blue", "Blue"),
  name = c("Ana", "Ben", "Cat", "Dan", "Eve"),
  score = c(88, 72, 95, 80, 91)
)
# Write your solution using base R or dplyr:
```
Solution
```r
df <- data.frame(
  team = c("Red", "Red", "Red", "Blue", "Blue"),
  name = c("Ana", "Ben", "Cat", "Dan", "Eve"),
  score = c(88, 72, 95, 80, 91)
)

# Base R solution using aggregate and merge
avg <- aggregate(score ~ team, data = df, FUN = mean)
names(avg)[2] <- "avg_score"

players <- aggregate(name ~ team, data = df,
                     FUN = function(x) paste(x, collapse = ", "))
names(players)[2] <- "players"

score_range <- aggregate(score ~ team, data = df,
                         FUN = function(x) max(x) - min(x))
names(score_range)[2] <- "score_range"

result <- merge(merge(avg, players), score_range)
print(result)
```
**Explanation:** Each summary function returns exactly one value per group: `mean()` for avg_score, `paste(collapse=", ")` for players, and `max() - min()` for score_range. Each produces a scalar per group, avoiding the error.
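Since the exercise also allows dplyr, here is a possible dplyr version as a sketch. It assumes dplyr is installed, and team_summary is an illustrative name.

```r
library(dplyr)

df <- data.frame(
  team = c("Red", "Red", "Red", "Blue", "Blue"),
  name = c("Ana", "Ben", "Cat", "Dan", "Eve"),
  score = c(88, 72, 95, 80, 91)
)

# All three summaries are scalars per team, so the result has one row per team
team_summary <- df %>%
  group_by(team) %>%
  summarise(
    avg_score   = mean(score),
    players     = paste(name, collapse = ", "),
    score_range = max(score) - min(score)
  )

print(team_summary)
```

This computes all three columns in one summarise() call, so no merge step is needed.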
Summary
| Cause | Fix | Prevention |
|---|---|---|
| Function returns multiple values | Use scalar functions (mean, sum, min) | Check function return length |
| paste() without collapse | Add collapse = ", " | Always use collapse in summarise |
| Wrong type for function | Match function to column type | Check str(df) first |
| c() or list() in summarise | One column per statistic | Create separate named columns |
FAQ
What changed between dplyr versions regarding this error?
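In versions before 1.0, any non-scalar result was an error. dplyr 1.0.0 allowed summarise() to return more than one row per group, provided the results within a group have length 1 or share a common length. dplyr 1.1.0 deprecated that behavior and added reframe() for multi-row results. The .groups argument only controls how the output is grouped; it does not change how many rows a group may return. If your code works in one version but not another, check packageVersion("dplyr").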
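To see the version difference concretely, the sketch below returns both ends of range() for each group. It assumes dplyr 1.0 or later; reframe() exists only from dplyr 1.1.0, so the code falls back to summarise() on 1.0.x. The data frame and the name ranges are illustrative.

```r
library(dplyr)

df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(10, 20, 30, 40, 50, 60)
)

# Assumption: dplyr >= 1.0. reframe() is the multi-row tool from 1.1.0 onward;
# on 1.0.x, summarise() itself accepts multi-row results.
if (packageVersion("dplyr") >= "1.1.0") {
  ranges <- df %>%
    group_by(group) %>%
    reframe(r = range(value))
} else {
  ranges <- df %>%
    group_by(group) %>%
    summarise(r = range(value), .groups = "drop")
}

print(ranges)  # two rows per group: the minimum, then the maximum
```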
Can I return a list column from summarise?
Yes. Wrap the result in list(): summarise(data = list(val)) creates a list column. This is useful for nested data frames but is an advanced pattern. For simple summaries, stick to scalar values.
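A minimal sketch of the list-column pattern, assuming dplyr is installed (nested and vals are illustrative names):

```r
library(dplyr)

df <- data.frame(
  category = c("X", "X", "Y", "Y"),
  val = c(1, 2, 3, 4)
)

# list() wraps each group's vector in a single list element,
# so the summary still has exactly one row per group
nested <- df %>%
  group_by(category) %>%
  summarise(vals = list(val))

print(nested)
print(nested$vals[[1]])   # the full vector for the first category
print(lengths(nested$vals))
```

Each cell of vals holds an entire vector, so downstream code has to unpack it (for example with [[ ]]) before doing arithmetic.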
What's Next?
R Error: could not find function 'X' — namespace and package conflicts
ggplot2 Error: object 'X' not found — data vs environment mapping
R Common Errors — the full reference of 50 common errors