ggplot2 Error: 'Aesthetics must be length 1 or same as data', Solved

The error "Aesthetics must be either length 1 or the same as the data" fires whenever a variable you pass into aes() has neither exactly 1 value nor exactly nrow(data) values. The fix is almost always to attach the variable to your data frame as a column first, then map the column name inside aes().

What does 'Aesthetics must be length 1 or same as data' actually mean?

ggplot2 builds plots row by row. Every aesthetic you map (colour, size, fill, shape) must therefore have exactly one value per row, or exactly one value in total that gets recycled for every row. Anything in between is ambiguous, so ggplot2 refuses to guess and stops with this error. The message even tells you which aesthetic broke and how many rows it expected, both of which are your first debugging clues.
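The rule can be written down as a one-line check. The sketch below uses a hypothetical helper, aes_length_ok(), which is not part of ggplot2; it only mirrors the "length 1 or nrow(data)" rule described above, in base R.

```r
# Hypothetical helper (not a ggplot2 function): TRUE when a vector's length
# is acceptable for a data frame with n rows under the "1 or nrow" rule.
aes_length_ok <- function(x, n) {
  length(x) == 1L || length(x) == n
}

df <- data.frame(x = 1:5, y = c(3, 7, 2, 9, 4))

ok_constant <- aes_length_ok("red", nrow(df))                       # length 1
ok_full     <- aes_length_ok(c("a", "b", "a", "b", "a"), nrow(df))  # length 5
ok_short    <- aes_length_ok(c("red", "blue", "green"), nrow(df))   # length 3

print(c(constant = ok_constant, full = ok_full, short = ok_short))
```

Running a check like this on a helper vector before it reaches aes() is one way to catch the mismatch early; it is a convenience, not a substitute for mapping columns.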

Here is the smallest reproduction and its fix, side by side:

R: Reproduce the aesthetics length error
# Load ggplot2 once for the whole tutorial
library(ggplot2)

df <- data.frame(
  x = 1:5,
  y = c(3, 7, 2, 9, 4)
)

# ---- BROKEN: external vector has length 3, not 5 ----
bad_colors <- c("red", "blue", "green")
# ggplot(df, aes(x, y, colour = bad_colors)) + geom_point(size = 4)
#> Error: Aesthetics must be either length 1 or the same as the data (5): colour

# ---- FIXED: attach the aesthetic to the data frame first ----
df$group <- c("low", "high", "low", "high", "low")

p <- ggplot(df, aes(x, y, colour = group)) +
  geom_point(size = 4) +
  labs(title = "Mapping a column works, mapping a stray vector doesn't")
print(p)

  

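Notice the two clues hidden inside the error message: (5) is the row count ggplot2 expected, and colour is the exact aesthetic that received the wrong length. When you see this error in your own code, read those two tokens first; they tell you which mapping to look at and what length it should have been. The exact layout of the message varies by ggplot2 version (some releases list the offending aesthetic on a separate "Fix the following mappings" line), but both clues are always present.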
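To inspect the message without drawing anything, one option is to force ggplot2 to evaluate the mappings and catch the failure. This is a minimal sketch assuming ggplot2 is installed; ggplot_build() is used here only to trigger aesthetic evaluation, and the wording of the caught message depends on your ggplot2 version.

```r
library(ggplot2)

df <- data.frame(x = 1:5, y = c(3, 7, 2, 9, 4))
bad_colors <- c("red", "blue", "green")  # length 3, data has 5 rows

# Creating the plot object is lazy; the mismatch only surfaces when the
# layers are computed, so wrap ggplot_build() rather than ggplot().
p_bad <- ggplot(df, aes(x, y, colour = bad_colors)) + geom_point()

res <- tryCatch(
  ggplot_build(p_bad),
  error = function(e) e
)

if (inherits(res, "error")) {
  # Print whatever message this ggplot2 version produces
  message("Caught: ", conditionMessage(res))
}
```

Because the error is raised at build time rather than when the plot object is created, a plot assigned to a variable and never printed will not reveal the problem.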

Try it: Map a numeric column to size so the plot renders without error. Use the data frame and size vector provided.

R: Exercise: length-4 size column
# Try it: fix the broken size mapping below
ex_df <- data.frame(x = 1:4, y = c(2, 5, 3, 6))
ex_sizes <- c(2, 6)  # only 2 values for 4 rows, too short

# Your task: attach a valid size column to ex_df, then map it in aes()
# ggplot(ex_df, aes(x, y, size = ???)) + geom_point()
#> Expected: a 4-point scatter with varying point sizes, no error

  
R: Length-4 size solution
ex_df$point_size <- c(2, 4, 6, 8)  # length 4, matches nrow(ex_df)

ggplot(ex_df, aes(x, y, size = point_size)) +
  geom_point() +
  scale_size_identity()

  

Explanation: We attach a length-4 numeric vector to ex_df as a new column, then map the column by name inside aes(). scale_size_identity() tells ggplot2 to use the numeric values directly as point sizes rather than rescaling them.

How do you fix a standalone vector that has the wrong length?

This is the single most common trigger. You define a helper vector outside the data frame (colours, labels, flags), then pass it straight into aes(). The moment its length doesn't match nrow(data), ggplot2 stops. The fix is mechanical: add the vector as a column to the data frame first, then map by name.

R: Fix a standalone helper vector
sales_df <- data.frame(
  month = month.abb[1:5],
  revenue = c(120, 180, 150, 210, 175)
)
my_colors <- c("red", "blue", "green")  # length 3

cat("Rows in sales_df:", nrow(sales_df), "\n")
cat("Length of my_colors:", length(my_colors), "\n")
#> Rows in sales_df: 5
#> Length of my_colors: 3

# ---- FIX: attach as a column with the correct length ----
sales_df$bucket <- c("low", "high", "low", "high", "high")

p2 <- ggplot(sales_df, aes(month, revenue, fill = bucket)) +
  geom_col() +
  labs(title = "Mapping a column keeps aesthetic length in sync with data")
print(p2)

  

Why prefer the column approach even when the lengths happen to match? Because the moment you filter or reorder the data, a stray external vector stops lining up but a column comes along for the ride. Column mapping is the habit that prevents this error from ever coming back.
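The alignment argument is easy to see in base R. The following sketch keeps the same bucket labels twice, once as a stray vector and once as a column, and then filters the data; the variable names bucket_vec and big_months are illustrative only.

```r
sales_df <- data.frame(
  month   = month.abb[1:5],
  revenue = c(120, 180, 150, 210, 175)
)

# Two copies of the same information: one stray vector, one column
bucket_vec <- c("low", "high", "low", "high", "high")
sales_df$bucket <- bucket_vec

# Keep only the months with revenue above 150
big_months <- sales_df[sales_df$revenue > 150, ]

cat("Rows after filter:      ", nrow(big_months), "\n")
cat("Length of column:       ", length(big_months$bucket), "\n")
cat("Length of stray vector: ", length(bucket_vec), "\n")
```

The column shrinks with the data, while the stray vector keeps its original length and would now trigger the length error if mapped against big_months.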

Tip
Always map by column name, not by stray vector. Even when the lengths line up today, attaching the aesthetic to the data frame means any downstream filter, arrange, or join keeps data and aesthetic perfectly aligned.

Try it: You have 6 students and a highlight vector with only 3 values. Attach a correct-length highlight column and plot it.

R: Exercise: length-6 highlight column
# Try it: fix the highlight length so this plots without error
ex_scores <- data.frame(
  student = paste0("S", 1:6),
  math = c(80, 90, 75, 88, 92, 70)
)
ex_highlight <- c(TRUE, FALSE, TRUE)  # length 3, need length 6

# Your task: attach a length-6 logical column to ex_scores, then map fill =
# ggplot(ex_scores, aes(student, math, fill = ???)) + geom_col()
#> Expected: a bar chart with some bars highlighted, no error

  
R: Highlight-column solution
ex_scores$highlight <- ex_scores$math >= 85  # logical vector length 6

ggplot(ex_scores, aes(student, math, fill = highlight)) +
  geom_col() +
  scale_fill_manual(values = c("grey70", "steelblue"))

  

Explanation: Instead of trying to reuse ex_highlight, we derive a logical column from the data itself. This guarantees the length matches and the highlight rule is self-documenting.

How do you plot summary statistics next to raw data?

The second common pattern: you compute a per-group mean and try to map it onto a plot of the raw data. The summary has one row per group, the raw data has many, so the lengths collide. There are two clean fixes; pick the one that matches your intent.

R: Broadcast per-group mean to raw rows
library(dplyr)

# Compute per-cyl mean mpg, this has 3 rows, not 32
cars_mean <- mtcars |>
  group_by(cyl) |>
  summarise(mean_mpg = mean(mpg), .groups = "drop")

cat("nrow(mtcars):", nrow(mtcars), "\n")
cat("nrow(cars_mean):", nrow(cars_mean), "\n")
#> nrow(mtcars): 32
#> nrow(cars_mean): 3

# ---- FIX (a): attach per-group means back to every raw row ----
mt_enriched <- mtcars |>
  group_by(cyl) |>
  mutate(mean_mpg = mean(mpg)) |>
  ungroup()

p3 <- ggplot(mt_enriched, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  geom_hline(aes(yintercept = mean_mpg, colour = factor(cyl)),
             linetype = "dashed") +
  labs(title = "Per-group mean attached to every row", colour = "cyl")
print(p3)

  

Fix (a) works because mutate() inside group_by() broadcasts the group mean back to every row in that group, so mean_mpg becomes a length-32 column that ggplot2 accepts without complaint. Fix (b) passes cars_mean to its own layer via data = cars_mean. It is equally valid, and the right choice when you don't want the summary polluting the raw data frame.
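Fix (b) can be sketched as follows. This is one possible version, assuming ggplot2 is installed; it rebuilds the 3-row summary with base R's aggregate() so the block runs without dplyr, and the object name p_b is illustrative.

```r
library(ggplot2)

# Per-cyl mean mpg as its own 3-row data frame (base R, no dplyr needed)
cars_mean <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
names(cars_mean)[names(cars_mean) == "mpg"] <- "mean_mpg"

p_b <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  # This layer is validated against cars_mean (3 rows), not mtcars (32 rows)
  geom_hline(
    data = cars_mean,
    aes(yintercept = mean_mpg, colour = factor(cyl)),
    linetype = "dashed"
  ) +
  labs(title = "Per-group mean from a separate summary frame", colour = "cyl")

print(p_b)
```

Compared with fix (a), the line layer here receives three rows instead of 32, so each reference line is drawn once rather than once per car.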

Key Insight
Each ggplot layer validates its own aesthetic lengths against its own data. Summaries belong either broadcast into the parent frame via mutate() or passed as a separate layer's data argument, never squeezed into the parent aes directly.

Try it: Attach mean(mpg) per cyl group to a copy of mtcars without losing any rows.

R: Exercise: add group_mpg column
# Try it: end up with ex_mtcars having the same 32 rows plus a group_mpg column
ex_mtcars <- mtcars

# Your task: add group_mpg so it equals mean(mpg) within each cyl group
# ex_mtcars <- ex_mtcars |> ...
#> Expected: nrow == 32 and length(unique(ex_mtcars$group_mpg)) == 3

  
R: group_mpg broadcast solution
ex_mtcars <- mtcars |>
  group_by(cyl) |>
  mutate(group_mpg = mean(mpg)) |>
  ungroup()

nrow(ex_mtcars)
#> [1] 32
length(unique(ex_mtcars$group_mpg))
#> [1] 3

  

Explanation: mutate() inside group_by() computes the mean per group but assigns it back to every row in that group, preserving the 32-row shape while producing exactly 3 distinct mean values.
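The same broadcast is available without dplyr. As an alternative sketch (not the approach used in the solution above), base R's ave() returns a vector as long as its input, filled with the per-group result; ex_base is an illustrative name.

```r
# Base R alternative to group_by() + mutate(): ave() returns a vector of the
# same length as its input, filled with the per-group result.
ex_base <- mtcars
ex_base$group_mpg <- ave(ex_base$mpg, ex_base$cyl, FUN = mean)

nrow(ex_base)
#> [1] 32
length(unique(ex_base$group_mpg))
#> [1] 3
```

Either route ends in the same shape: one value per row, three distinct values, which is exactly what aes() needs.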

How do you combine layers from different data frames?

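The third common cause hides inside multi-layer plots. When you add geom_text() or geom_point() with its own data argument, the new layer still inherits the parent aes() mappings by default. If the annotation frame doesn't have a column the parent aes() references, the layer fails while computing its aesthetics: usually with "object not found", and with the length error when the inherited name resolves to a stray object of a different length in your environment or when the mapped values don't match the annotation frame's row count.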

R: Combine layers with inherit.aes = FALSE
pts <- data.frame(x = 1:10, y = rnorm(10, mean = 5))
labels <- data.frame(x = c(3, 7), label = c("Peak", "Dip"))

# ---- BROKEN: geom_text inherits y from the parent aes(), but labels has no y ----
# ggplot(pts, aes(x, y)) +
#   geom_line() +
#   geom_text(data = labels, aes(x = x, label = label), vjust = -1)
#> Fails while computing aesthetics for the text layer: typically
#> "object 'y' not found", or the length error "(2): y" when a stray
#> vector named y of another length exists in your environment

# ---- FIXED: turn off inheritance and supply every aesthetic the layer needs ----
p4 <- ggplot(pts, aes(x, y)) +
  geom_line() +
  geom_point(size = 2) +
  geom_text(data = labels,
            aes(x = x, y = 6, label = label),
            inherit.aes = FALSE,
            vjust = -0.5, fontface = "bold") +
  labs(title = "inherit.aes = FALSE lets layers bring their own data shape")
print(p4)

  

The fix has two parts. First, inherit.aes = FALSE tells geom_text() to ignore the parent mapping, so it stops demanding a y column on labels. Second, you must then supply every aesthetic the geom needs inside its local aes(). Here y = 6 is a constant, so it becomes length 1 and passes the recycling rule trivially.
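An alternative, when the labels should sit at the data's own y values, is to give the annotation frame the column the parent mapping expects. The sketch below is an assumption about intent rather than the fix shown above: it looks up y in pts with match() so the text layer can inherit aes(x, y) as usual. The fixed y values replace the rnorm() draw only to keep the example reproducible.

```r
library(ggplot2)

pts <- data.frame(x = 1:10, y = c(5, 6, 7, 6, 5, 4, 3, 4, 5, 6))
labels <- data.frame(x = c(3, 7), label = c("Peak", "Dip"))

# Copy the matching y value from pts into labels, one per label row
labels$y <- pts$y[match(labels$x, pts$x)]

p_lab <- ggplot(pts, aes(x, y)) +
  geom_line() +
  geom_point(size = 2) +
  # x and y are inherited and now resolve inside labels (2 rows)
  geom_text(data = labels, aes(label = label), vjust = -1, fontface = "bold")

print(p_lab)
```

This keeps inheritance switched on, which is convenient when the annotation frame genuinely shares the parent's coordinates; inherit.aes = FALSE remains the better fit when it does not.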

Warning
Forgetting inherit.aes = FALSE is a common cause of multi-layer plot errors. If your annotation frame doesn't carry the same columns as the parent data, inheritance will hand the geom a mapping it cannot resolve against its own data, and the layer fails.

Try it: Add a 2-row label layer on top of a 10-point scatter without triggering the length error.

R: Exercise: add inherit.aes labels
# Try it: add the labels layer so both layers render
ex_pts <- data.frame(x = 1:10, y = runif(10, 0, 10))
ex_labels <- data.frame(x = c(2, 8), name = c("Start", "End"))

# Your task: plot ex_pts as points and ex_labels as text above the points,
# using inherit.aes = FALSE so the text layer doesn't demand y from ex_labels
# ggplot(ex_pts, aes(x, y)) + geom_point() + ...
#> Expected: 10 points plus two text labels, no error

  
R: Label-layer solution
ggplot(ex_pts, aes(x, y)) +
  geom_point() +
  geom_text(data = ex_labels,
            aes(x = x, y = 11, label = name),
            inherit.aes = FALSE,
            fontface = "bold")

  

Explanation: inherit.aes = FALSE blocks the parent aes(x, y) from leaking into geom_text. Inside the local aes(), we supply x from ex_labels, set y = 11 as a length-1 constant, and pull label from the same frame.

Why do lingering factor levels still cause length errors?

The fourth cause is subtler. When you filter a data frame whose grouping column is a factor, the levels persist even after the rows are gone. A palette or scale_*_manual() call sized for the 2 groups you can see then meets a factor that still reports 3 levels, and the mismatch resurfaces. droplevels() on the filtered data is the clean fix.

R: Filter leaves unused factor levels
grade_df <- data.frame(
  category = factor(c("A", "B", "C", "A", "B", "C")),
  score = c(10, 20, 30, 15, 25, 35)
)
grade_sub <- grade_df[grade_df$category %in% c("A", "B"), ]

cat("Rows after filter:", nrow(grade_sub), "\n")
cat("Levels still present:", nlevels(grade_sub$category), "\n")
#> Rows after filter: 4
#> Levels still present: 3

# ---- FIX: droplevels() removes the lingering unused levels ----
grade_sub$category <- droplevels(grade_sub$category)
cat("Levels after droplevels:", nlevels(grade_sub$category), "\n")
#> Levels after droplevels: 2

p5 <- ggplot(grade_sub, aes(category, score, fill = category)) +
  geom_col() +
  scale_fill_manual(values = c("steelblue", "tomato")) +
  labs(title = "Two levels, two colours, after droplevels()")
print(p5)

  

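Without droplevels(), the factor still carries three levels internally even though no row references level C. ggplot2's discrete scales drop unused levels by default (drop = TRUE), so this particular plot may still render; but with drop = FALSE, or with any palette or helper vector sized via levels() or nlevels(), the hidden third level produces a mismatch (for manual scales, typically an "Insufficient values in manual scale" error). Any time you filter a factor column, assume you need droplevels() before plotting with manual scales.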
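Before plotting, it can help to check whether a filtered factor is still carrying levels that no row uses. The following base R sketch reuses the grade data from above; the variable name unused is illustrative.

```r
grade_df <- data.frame(
  category = factor(c("A", "B", "C", "A", "B", "C")),
  score    = c(10, 20, 30, 15, 25, 35)
)
grade_sub <- grade_df[grade_df$category %in% c("A", "B"), ]

# Levels the factor still declares but no row uses
unused <- setdiff(levels(grade_sub$category),
                  as.character(unique(grade_sub$category)))
print(unused)

# table() makes the empty level visible as a zero count
print(table(grade_sub$category))
```

A non-empty result from the setdiff() call is the signal that droplevels() is worth running.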

Note
dplyr::filter() behaves the same way: it drops rows but not levels. If you use tidyverse-style filtering, run droplevels() on the result or wrap your category with forcats::fct_drop() for the same effect.

Try it: Drop unused levels from a filtered factor and confirm the level count shrinks to match the data.

R: Exercise: drop unused factor level
# Try it: drop the unused level from ex_sub
ex_factor_df <- data.frame(
  g = factor(c("x", "y", "z", "x", "y", "z")),
  v = 1:6
)
ex_sub <- ex_factor_df[ex_factor_df$g != "z", ]

# Your task: drop the unused level so nlevels(ex_sub$g) == 2
# ex_sub$g <- ???
#> Expected: nlevels(ex_sub$g) == 2

  
R: droplevels solution
ex_sub$g <- droplevels(ex_sub$g)
nlevels(ex_sub$g)
#> [1] 2

  

Explanation: droplevels() returns a factor with the same labels but pruned of any level that no longer appears in the data. Run it after every subset that touches a factor column.

Practice Exercises

Exercise 1: Per-group mean line on a scatter plot

Using mtcars, compute the mean mpg for each cyl group, attach it back to every row, and plot a scatter of wt vs mpg with a dashed horizontal line per group showing its mean. A naive version that maps a 3-row summary straight into the parent aes() fails with the length error; write the working version. Save the final plot to my_p1.

R: Exercise: per-group mean line plot
# Exercise 1: fix the broken per-group mean line plot
# Hint: use group_by() + mutate(mean_mpg = mean(mpg)) to broadcast the mean
my_mt <- mtcars

# Your code below, produce my_p1 as a ggplot object

  
R: Per-group mean-line solution
my_mt <- mtcars |>
  group_by(cyl) |>
  mutate(mean_mpg = mean(mpg)) |>
  ungroup()

my_p1 <- ggplot(my_mt, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  geom_hline(aes(yintercept = mean_mpg, colour = factor(cyl)),
             linetype = "dashed") +
  labs(title = "Scatter with per-cyl mean line", colour = "cyl")
print(my_p1)

  

Explanation: Broadcasting the per-group mean with mutate() inside group_by() produces a length-32 column, so it matches nrow(mtcars). The dashed geom_hline() then maps cleanly against the same 32-row frame without a length error.

Exercise 2: Text annotations from a separate data frame

Build a scatter from a 10-row my_scatter data frame, then overlay 2 text labels from a separate 2-row my_labels data frame. The challenge is that my_labels has no y column, so the default inherited aes() breaks. Save the final plot to my_p2.

R: Exercise: external text annotations
# Exercise 2: add text annotations from a second data frame
# Hint: use inherit.aes = FALSE and supply every aesthetic inside the local aes()
my_scatter <- data.frame(x = 1:10, y = c(2, 4, 3, 7, 5, 8, 6, 9, 7, 10))
my_labels <- data.frame(x = c(3, 8), label = c("Low", "High"))

# Your code below, produce my_p2

  
R: External-annotation solution
my_p2 <- ggplot(my_scatter, aes(x, y)) +
  geom_point(size = 3) +
  geom_text(data = my_labels,
            aes(x = x, y = 11, label = label),
            inherit.aes = FALSE,
            fontface = "bold") +
  labs(title = "Scatter with external annotations")
print(my_p2)

  

Explanation: inherit.aes = FALSE prevents my_labels from being checked against the parent aes(x, y). Inside the local aes() we supply x from my_labels, fix y to a constant 11 (length 1 recycles freely), and map label from the same frame.

Complete Example

Here is an end-to-end walkthrough using iris. The goal: a scatter of Sepal.Length vs Sepal.Width coloured by Species, with a dashed horizontal line per species showing its mean Sepal.Length. We'll build it the right way from the start.

R: End-to-end iris mean-line plot
iris_m <- iris |>
  group_by(Species) |>
  mutate(mean_sl = mean(Sepal.Length)) |>
  ungroup()

cat("nrow(iris_m):", nrow(iris_m), "\n")
cat("Distinct mean_sl values:", length(unique(iris_m$mean_sl)), "\n")
#> nrow(iris_m): 150
#> Distinct mean_sl values: 3

final_p <- ggplot(iris_m, aes(Sepal.Width, Sepal.Length, colour = Species)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_hline(aes(yintercept = mean_sl, colour = Species),
             linetype = "dashed") +
  labs(title = "Sepal length by species with per-species mean",
       x = "Sepal width", y = "Sepal length")
print(final_p)

  

Every aesthetic in this plot either has length 150 (matching nrow(iris_m)) or length 1 (constants like size = 2). mean_sl is length 150 with only 3 distinct values, perfect for a per-group reference line. No error, no warnings, no droplevels() gymnastics.

Summary

| Cause | Error signature | Fix |
| --- | --- | --- |
| Standalone vector with wrong length | Aesthetics must be... (nrow): colour | Attach as column, map by name |
| Summary mixed with raw data | Aesthetics must be... (nrow): yintercept | Use group_by() + mutate() to broadcast |
| Multi-layer with inherited aes() | Aesthetics must be... (2): y | inherit.aes = FALSE on child layer |
| Lingering factor levels | Aesthetics must be... (n): fill | droplevels() after filtering |

The one rule behind all four: every aesthetic must be length 1 or nrow(data_in_that_layer). Read the number in the parentheses of the error message to see which length ggplot2 expected.

