ggplot2 Error: 'Aesthetics must be length 1 or same as data', Solved
Error: Aesthetics must be either length 1 or the same as the data fires whenever a variable you pass into aes() doesn't have exactly 1 value or exactly nrow(data) values. The fix is almost always to attach the variable to your data frame as a column first, then map the column name inside aes().
What does 'Aesthetics must be length 1 or same as data' actually mean?
ggplot2 builds plots row by row. Every aesthetic you map (colour, size, fill, shape) must therefore have exactly one value per row, or exactly one value in total that gets recycled across every row. Anything in between is ambiguous, so ggplot2 refuses to guess and stops with this error. The message even tells you which aesthetic broke and how many rows it expected; both are your first debugging clues.
Here is the smallest reproduction and its fix, side by side:
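A minimal sketch of that pair, assuming a hypothetical five-row data frame `df` and a three-element helper vector `my_cols` (both names are illustrative). ggplot2 evaluates aesthetics only when the plot is built, so the sketch forces a build with `ggplot_build()` and captures the error rather than printing the plot.

```r
library(ggplot2)

# Hypothetical data: 5 rows, plus a helper vector that is too short (3 values).
df <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 4))
my_cols <- c("a", "b", "c")

# Broken: my_cols lives outside df and has length 3, which is neither 1 nor 5.
broken <- ggplot(df, aes(x, y, colour = my_cols)) + geom_point()
# Aesthetics are evaluated at build time, so force a build and keep the error.
err <- tryCatch(ggplot_build(broken), error = function(e) e)

# Fixed: attach a full-length column first, then map it by name.
df$grp <- c("a", "b", "c", "a", "b")
fixed <- ggplot(df, aes(x, y, colour = grp)) + geom_point()
fixed_data <- layer_data(fixed)   # builds without error, one row per point
```

Printing `broken` at the console raises the same error; `conditionMessage(err)` lets you inspect its text without stopping the script.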
Notice the two clues hidden inside the error message: (5) is the row count ggplot2 expected, and colour is the exact aesthetic that received the wrong length. When you see this error in your own code, read those two tokens first: they tell you which mapping to look at and what length it should have been.
Try it: Map a categorical column to size so the plot renders without error. Use the data frame and aesthetic sizes provided.
Explanation: We attach a length-4 numeric vector to ex_df as a new column, then map the column by name inside aes(). scale_size_identity() tells ggplot2 to use the numeric values directly as point sizes rather than rescaling them.
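One way this solution could look, as a sketch: the four-row `ex_df` below is a stand-in for the exercise data, and the column name `pt_size` and its values are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the exercise data frame: 4 rows.
ex_df <- data.frame(x = 1:4, y = c(3, 1, 4, 2))

# Attach the sizes as a column so their length is tied to nrow(ex_df).
ex_df$pt_size <- c(2, 4, 6, 8)

ex_p <- ggplot(ex_df, aes(x, y, size = pt_size)) +
  geom_point() +
  scale_size_identity()   # use the numbers as literal point sizes

ex_sizes <- layer_data(ex_p)$size   # sizes ggplot2 will actually draw
```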
How do you fix a standalone vector that has the wrong length?
This is the single most common trigger. You define a helper vector outside the data frame (colours, labels, flags), then pass it straight into aes(). The moment its length doesn't match nrow(data), ggplot2 stops. The fix is mechanical: add the vector as a column to the data frame first, then map by name.
Why prefer the column approach even when the lengths happen to match? Because the moment you filter or reorder the data, a stray external vector stops lining up, while a column comes along for the ride. Column mapping is the habit that prevents this error from ever coming back.
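The filtering argument can be sketched as follows. The toy frame `toy`, the external vector `flag_vec`, and the cutoff `wt < 3` are all hypothetical; the point is only that the column shrinks with the rows while the external vector does not.

```r
library(ggplot2)

# Hypothetical 6-row frame and a matching external vector.
toy <- data.frame(wt  = c(2.6, 3.2, 3.4, 2.3, 3.6, 2.8),
                  mpg = c(21, 18, 17, 24, 15, 20))
flag_vec <- c("hi", "lo", "lo", "hi", "lo", "hi")

toy$flag <- flag_vec              # column copy travels with the rows
light <- toy[toy$wt < 3, ]        # filtering leaves 3 rows

# External vector: still length 6 against 3 rows, so the build fails.
p_ext <- ggplot(light, aes(wt, mpg, colour = flag_vec)) + geom_point()
ext_result <- tryCatch(ggplot_build(p_ext), error = function(e) e)

# Column: filtered together with the data, so the lengths still agree.
p_col <- ggplot(light, aes(wt, mpg, colour = flag)) + geom_point()
col_data <- layer_data(p_col)
```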
Try it: You have 6 students and a highlight vector with only 3 values. Attach a correct-length highlight column and plot it.
Explanation: Instead of trying to reuse ex_highlight, we derive a logical column from the data itself. This guarantees the length matches and the highlight rule is self-documenting.
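A sketch of that approach, assuming a hypothetical `ex_students` frame with made-up names and scores; the threshold of 85 is arbitrary and only serves to show a rule derived from the data.

```r
library(ggplot2)

# Hypothetical exercise data: 6 students.
ex_students <- data.frame(
  name  = c("Ana", "Ben", "Cal", "Dee", "Eli", "Fay"),
  score = c(72, 88, 95, 64, 91, 79)
)
ex_highlight <- c(TRUE, FALSE, TRUE)   # only 3 values: cannot be mapped as-is

# Derive the flag from the data itself so it always has nrow() values.
ex_students$highlight <- ex_students$score >= 85

ex_p2 <- ggplot(ex_students, aes(name, score, fill = highlight)) +
  geom_col()
ex_bars <- layer_data(ex_p2)   # one bar per student
```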
How do you plot summary statistics next to raw data?
The second common pattern: you compute a per-group mean and try to map it onto a plot of the raw data. The summary has one row per group, the raw data has many, so the lengths collide. There are two clean fixes; pick the one that matches your intent.
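Both fixes can be sketched with mtcars, assuming dplyr is installed. The names `cars_a`, `cars_mean`, and `mean_mpg` are illustrative. Note that `geom_hline()` does not inherit the plot-level aesthetics, so colour is mapped again inside its own `aes()`.

```r
library(ggplot2)
library(dplyr)

# Fix (a): broadcast the group mean onto every row (32 rows stay 32).
cars_a <- mtcars %>%
  group_by(cyl) %>%
  mutate(mean_mpg = mean(mpg)) %>%
  ungroup()

p_a <- ggplot(cars_a, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  geom_hline(aes(yintercept = mean_mpg, colour = factor(cyl)),
             linetype = "dashed")

# Fix (b): keep the 3-row summary separate and give it its own layer.
cars_mean <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

p_b <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  geom_hline(data = cars_mean,
             aes(yintercept = mean_mpg, colour = factor(cyl)),
             linetype = "dashed")

lines_a <- layer_data(p_a, 2)   # line layer built from the 32-row frame
lines_b <- layer_data(p_b, 2)   # line layer built from the 3-row summary
```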
Fix (a) works because mutate() inside group_by() broadcasts the group mean back to every row in that group, so mean_mpg becomes a length-32 column that ggplot2 accepts without complaint. Fix (b) gives cars_mean its own layer with data = cars_mean. It is equally valid, and the right choice when you don't want the summary polluting the raw data frame.
Try it: Attach mean(mpg) per cyl group to a copy of mtcars without losing any rows.
Explanation: mutate() inside group_by() computes the mean per group but assigns it back to every row in that group, preserving the 32-row shape while producing exactly 3 distinct mean values.
How do you combine layers from different data frames?
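The third common cause hides inside multi-layer plots. When you add geom_text() or geom_point() with its own data argument, the new layer still inherits the parent aes() mappings by default. If the annotation frame lacks a column the parent aes() references, ggplot2 falls back to any object of that name in the calling environment, and when that object's length differs from the annotation frame's row count, you get the length error. (If no such object exists at all, you get an "object not found" error instead.)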
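A sketch of the failure and its repair, under stated assumptions: the frames `scatter` and `labels` are hypothetical, and loose vectors `x` and `y` are deliberately left in the environment, because that is what turns the missing column into a length mismatch here.

```r
library(ggplot2)

# Hypothetical data: loose vectors stay in the environment after building
# the 10-row scatter frame.
x <- 1:10
y <- c(2, 4, 3, 5, 4, 6, 5, 7, 6, 8)
scatter <- data.frame(x = x, y = y)
labels  <- data.frame(x = c(2, 8), label = c("low", "high"))  # no y column

base <- ggplot(scatter, aes(x = x, y = y)) + geom_point()

# Broken: geom_text() inherits aes(y = y). labels has no y column, so the
# lookup falls through to the length-10 vector y, against 2 label rows.
broken <- base + geom_text(data = labels, aes(label = label))
broken_result <- tryCatch(ggplot_build(broken), error = function(e) e)

# Fixed: cut the inheritance and supply every aesthetic locally.
fixed <- base +
  geom_text(data = labels,
            aes(x = x, y = 6, label = label),
            inherit.aes = FALSE)
text_data <- layer_data(fixed, 2)   # the two label rows, both at y = 6
```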
The fix has two parts. First, inherit.aes = FALSE tells geom_text() to ignore the parent mapping, so it stops demanding a y column on labels. Second, you must then supply every aesthetic the geom needs inside its local aes(). Here y = 6 is a constant, so it becomes length 1 and passes the recycling rule trivially.
Try it: Add a 2-row label layer on top of a 10-point scatter without triggering the length error.
Explanation: inherit.aes = FALSE blocks the parent aes(x, y) from leaking into geom_text. Inside the local aes(), we supply x from ex_labels, set y = 11 as a length-1 constant, and pull label from the same frame.
Why do lingering factor levels still cause length errors?
The fourth cause is subtler. When you filter a data frame whose grouping column is a factor, the levels persist even after the rows are gone. A scale_*_manual() call built around a 3-level palette then meets a 2-level subset, or a 4-level plot built against your expectations, and the length mismatch resurfaces. droplevels() on the filtered data is the clean fix.
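A sketch of the filtering step, assuming a hypothetical `sales` frame with a three-level factor `grp`. It shows the unused level surviving the subset and `droplevels()` pruning it before a two-colour manual scale is applied.

```r
library(ggplot2)

# Hypothetical data: a factor with levels A, B, C.
sales <- data.frame(
  grp = factor(c("A", "A", "B", "B", "C", "C")),
  val = c(3, 5, 2, 6, 4, 1)
)

kept <- sales[sales$grp != "C", ]      # the rows for C are gone...
levels_before <- nlevels(kept$grp)     # ...but level "C" is still listed

kept <- droplevels(kept)               # prune levels with no remaining rows
levels_after <- nlevels(kept$grp)

p_fill <- ggplot(kept, aes(grp, val, fill = grp)) +
  geom_col() +
  scale_fill_manual(values = c("steelblue", "tomato"))
fill_data <- layer_data(p_fill)        # builds with exactly two fill colours
```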
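Without droplevels(), the factor still carries three levels internally even though no row references level C. With ggplot2's default drop = TRUE, discrete scales ignore unused levels, so the mismatch tends to bite when a scale is built with drop = FALSE (where two colours against three levels stops with "Insufficient values in manual scale") or when a palette or helper vector is sized from levels() or nlevels(). Any time you filter a factor column, run droplevels() before plotting with manual scales.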
dplyr::filter() behaves the same way: it drops rows but not levels. If you use tidyverse-style filtering, run droplevels() on the result or wrap your category with forcats::fct_drop() for the same effect.
Try it: Drop unused levels from a filtered factor and confirm the level count shrinks to match the data.
Explanation: droplevels() returns a factor with the same labels but pruned of any level that no longer appears in the data. Run it after every subset that touches a factor column.
Practice Exercises
Exercise 1: Per-group mean line on a scatter plot
Using mtcars, compute the mean mpg for each cyl group, attach it back to every row, and plot a scatter of wt vs mpg with a dashed horizontal line per group showing its mean. The naive version below is broken; fix it. Save the final plot to my_p1.
Explanation: Broadcasting the per-group mean with mutate() inside group_by() produces a length-32 column, so it matches nrow(mtcars). The dashed geom_hline() then maps cleanly against the same 32-row frame without a length error.
Exercise 2: Text annotations from a separate data frame
Build a scatter from a 10-row my_scatter data frame, then overlay 2 text labels from a separate 2-row my_labels data frame. The challenge is that my_labels has no y column, so the default inherited aes() breaks. Save the final plot to my_p2.
Explanation: inherit.aes = FALSE prevents my_labels from being checked against the parent aes(x, y). Inside the local aes() we supply x from my_labels, fix y to a constant 11 (length 1 recycles freely), and map label from the same frame.
Complete Example
Here is an end-to-end walkthrough using iris. The goal: a scatter of Sepal.Length vs Sepal.Width coloured by Species, with a dashed horizontal line per species showing its mean Sepal.Length. We'll build it the right way from the start.
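A sketch of how that walkthrough could be written, assuming dplyr is installed. Sepal.Length goes on the y axis so the horizontal mean lines are meaningful, and colour is mapped again inside `geom_hline()` because that geom does not inherit the plot-level aesthetics. The object name `p_iris` is an assumption.

```r
library(ggplot2)
library(dplyr)

# Broadcast each species' mean Sepal.Length onto its own rows (150 rows stay 150).
iris_m <- iris %>%
  group_by(Species) %>%
  mutate(mean_sl = mean(Sepal.Length)) %>%
  ungroup()

p_iris <- ggplot(iris_m,
                 aes(x = Sepal.Width, y = Sepal.Length, colour = Species)) +
  geom_point(size = 2) +                       # size is a length-1 constant
  geom_hline(aes(yintercept = mean_sl, colour = Species),
             linetype = "dashed")              # mean_sl has one value per row

iris_lines <- layer_data(p_iris, 2)            # data behind the dashed lines
```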
Every aesthetic in this plot either has length 150 (matching nrow(iris_m)) or length 1 (constants like size = 2). mean_sl is length 150 with only 3 distinct values, perfect for a per-group reference line. No error, no warnings, no droplevels() gymnastics.
Summary
| Cause | Error signature | Fix |
|---|---|---|
| Standalone vector with wrong length | Aesthetics must be... (nrow): colour | Attach as column, map by name |
| Summary mixed with raw data | Aesthetics must be... (nrow): yintercept | Use group_by() + mutate() to broadcast |
| Multi-layer with inherited aes() | Aesthetics must be... (2): y | inherit.aes = FALSE on child layer |
| Lingering factor levels | Aesthetics must be... (n): fill | droplevels() after filtering |
The one rule behind all four: every aesthetic must be length 1 or nrow(data_in_that_layer). Read the number in the parentheses of the error message to see which length ggplot2 expected.
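The rule can be written down as a one-line check. `fits_layer()` below is a hypothetical helper, not part of ggplot2, and it only mirrors the length rule stated above; ggplot2's own check runs after it has evaluated the mappings.

```r
# Hypothetical helper (not a ggplot2 function): TRUE when x could be mapped
# against data without violating the "length 1 or nrow(data)" rule.
fits_layer <- function(x, data) {
  length(x) %in% c(1L, nrow(data))
}

fits_layer("red", mtcars)              # TRUE: length 1 is recycled
fits_layer(mtcars$cyl, mtcars)         # TRUE: one value per row
fits_layer(c("a", "b", "c"), mtcars)   # FALSE: neither 1 nor 32
```

Running such a check on a suspect vector before plotting is a quick way to confirm which mapping the error message is pointing at.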
Continue Learning
- R Error in ggplot2: object 'x' not found, aes() scoping and environment lookup issues.
- R Error: replacement has N rows, data has M, the sibling length-mismatch error on the data-wrangling side.
- 50 R Errors Decoded, the master reference of the most common R error messages with plain-English fixes.