ggplot2 geom_col() in R: Bar Charts From Pre-Computed Heights
The geom_col() function in ggplot2 draws bar charts where each bar's height is taken directly from the y aesthetic. It is the right choice when you have pre-computed totals or values, unlike geom_bar(), which counts rows by default.
    ggplot(df, aes(x, y)) + geom_col()
    ggplot(df, aes(x, y, fill = group)) + geom_col(position = "dodge")
    ggplot(df, aes(x, y, fill = group)) + geom_col(position = "stack")
    ggplot(df, aes(x, y, fill = group)) + geom_col(position = "fill")  # 100% stacked
    ggplot(df, aes(x, y)) + geom_bar(stat = "identity")  # equivalent to geom_col
Need explanation? Read on for examples and pitfalls.
What geom_col() does in one sentence
geom_col() draws a bar chart where each bar's height is the y-aesthetic value from the data. Unlike geom_bar (which counts rows by default), geom_col uses the values you provide directly.
Syntax
geom_col(mapping = NULL, data = NULL, position = "stack", ...). Default position is "stack".
Use geom_col when y is a value (height); use geom_bar when y comes from counting. Both produce bars; the difference is what's on the y axis.

Five common patterns
1. Standard column chart
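A minimal sketch of the standard column chart. The `sales` data frame and its `region` and `revenue` columns are made up for illustration; any data frame with one row per bar and a numeric height column would work the same way.

```r
library(ggplot2)

# Made-up totals, one row per bar (illustrative data only)
sales <- data.frame(
  region  = c("North", "South", "East", "West"),
  revenue = c(120, 95, 140, 80)
)

# Bar height is read directly from the revenue column
p_basic <- ggplot(sales, aes(x = region, y = revenue)) +
  geom_col()
print(p_basic)
```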
2. Filled by group
Default position is "stack".
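As a sketch of the grouped case, mapping a variable to fill splits each bar into segments, and with no position argument those segments stack. The `sales` data and its column names are hypothetical.

```r
library(ggplot2)

# Made-up quarterly revenue split by product line (illustrative data only)
sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3"), each = 2),
  product = rep(c("A", "B"), times = 3),
  revenue = c(50, 30, 60, 45, 55, 40)
)

# No position argument, so the default "stack" applies:
# the two product segments sit on top of each other within each quarter
p_stack <- ggplot(sales, aes(x = quarter, y = revenue, fill = product)) +
  geom_col()
print(p_stack)
```

The total bar height for each quarter is the sum of its segments.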
3. Dodged bars
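A sketch of dodged bars, again on made-up `sales` data: position = "dodge" places the bars of each group next to each other instead of stacking them.

```r
library(ggplot2)

# Made-up quarterly revenue split by product line (illustrative data only)
sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3"), each = 2),
  product = rep(c("A", "B"), times = 3),
  revenue = c(50, 30, 60, 45, 55, 40)
)

# Side-by-side bars: every bar starts at zero and keeps its own height
p_dodge <- ggplot(sales, aes(x = quarter, y = revenue, fill = product)) +
  geom_col(position = "dodge")
print(p_dodge)
```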
4. 100% stacked
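A sketch of the 100% stacked variant on the same made-up data. position = "fill" rescales each stack to a total of 1, so the chart shows shares rather than absolute values. The percent axis labels from the scales package are an optional addition, not something geom_col requires.

```r
library(ggplot2)

# Made-up quarterly revenue split by product line (illustrative data only)
sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3"), each = 2),
  product = rep(c("A", "B"), times = 3),
  revenue = c(50, 30, 60, 45, 55, 40)
)

# Each quarter's stack is rescaled to sum to 1 (that is, 100%)
p_fill <- ggplot(sales, aes(x = quarter, y = revenue, fill = product)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::label_percent())
print(p_fill)
```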
5. Horizontal bars
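Two ways to get horizontal bars, sketched on made-up `sales` data. Swapping the aesthetics relies on the automatic orientation detection added in ggplot2 3.3.0; coord_flip() also works on older versions.

```r
library(ggplot2)

# Made-up totals, one row per bar (illustrative data only)
sales <- data.frame(
  region  = c("North", "South", "East", "West"),
  revenue = c(120, 95, 140, 80)
)

# Option A: put the numeric value on x and the category on y
p_swap <- ggplot(sales, aes(x = revenue, y = region)) +
  geom_col()

# Option B: build a vertical chart, then rotate the coordinate system
p_flip <- ggplot(sales, aes(x = region, y = revenue)) +
  geom_col() +
  coord_flip()

print(p_swap)
print(p_flip)
```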
geom_col() is geom_bar(stat = "identity") in disguise. The two produce identical output for the same y values; geom_col is just shorter and clearer when you have pre-computed heights.

geom_col() vs geom_bar() vs geom_histogram()
| Function | Default stat | Best for |
|---|---|---|
| geom_col() | identity | Pre-computed heights |
| geom_bar() | count | Frequency of categories |
| geom_histogram() | bin | Numeric distribution |
When to use which:
- geom_col when y is your data's value.
- geom_bar when y should be a count of rows.
- geom_histogram for numeric x with binning.
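The three choices can be sketched side by side on the built-in mtcars dataset (the choice of dataset and the object names are assumptions for illustration). The first three plots draw the same bars; only where the counting happens differs.

```r
library(ggplot2)

# Pre-computed counts of cars per cylinder class (columns: cyl, Freq)
cyl_counts <- as.data.frame(table(cyl = mtcars$cyl))

# geom_col(): heights come straight from the Freq column
p_col <- ggplot(cyl_counts, aes(x = cyl, y = Freq)) +
  geom_col()

# geom_bar(stat = "identity"): same result, longer spelling
p_identity <- ggplot(cyl_counts, aes(x = cyl, y = Freq)) +
  geom_bar(stat = "identity")

# geom_bar(): the default stat = "count" tallies the raw rows itself
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# geom_histogram(): bins a numeric variable before counting
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)
```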
A practical workflow
The "summarise then plot" pattern fits geom_col naturally: compute the totals first, then map them to y so that geom_col draws them as they are.
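A sketch of that workflow, assuming dplyr for the summary step and the built-in mtcars dataset as stand-in data; `hp_totals` and `total_hp` are illustrative names.

```r
library(ggplot2)
library(dplyr)

# Step 1: summarise. Total horsepower per cylinder class
hp_totals <- mtcars %>%
  group_by(cyl) %>%
  summarise(total_hp = sum(hp), .groups = "drop")

# Step 2: plot. geom_col uses the pre-computed totals as bar heights
p_totals <- ggplot(hp_totals, aes(x = factor(cyl), y = total_hp)) +
  geom_col()
print(p_totals)
```

Wrapping cyl in factor() makes the x axis discrete, so each cylinder class gets one evenly spaced bar.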
Common pitfalls
Pitfall 1: forgetting y is value not count. geom_col EXPECTS a y aesthetic with values. If you pass count-style data without aggregation, you'll plot wrong heights.
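One way this can go wrong, sketched on the built-in mtcars dataset (an assumed example, not the only failure mode): passing raw rows to geom_col draws one rectangle per row and stacks them, so each bar ends up as the sum of y rather than a count or a mean.

```r
library(ggplot2)

# Raw, un-aggregated rows: 32 cars, several per cylinder class
p_raw <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_col()

# One rectangle per car, stacked within each class,
# so the top of each bar is the sum of mpg for that class
raw_layer <- layer_data(p_raw)
nrow(raw_layer)

# Aggregating first gives one value, and one bar, per class
mpg_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
p_mean <- ggplot(mpg_means, aes(x = factor(cyl), y = mpg)) +
  geom_col()
```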
Pitfall 2: factor ordering. Bars appear in alphabetical / factor-level order. Use forcats::fct_reorder to sort by value.
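A sketch of sorting bars by value with forcats::fct_reorder, using made-up `sales` data. The labs() call only restores a clean axis title, since the default title would show the whole fct_reorder() expression.

```r
library(ggplot2)
library(forcats)

# Made-up totals, one row per bar (illustrative data only)
sales <- data.frame(
  region  = c("North", "South", "East", "West"),
  revenue = c(120, 95, 140, 80)
)

# Default: character x values are drawn in alphabetical order
p_alpha <- ggplot(sales, aes(x = region, y = revenue)) +
  geom_col()

# Reorder the levels by revenue, smallest to largest
p_sorted <- ggplot(sales, aes(x = fct_reorder(region, revenue), y = revenue)) +
  geom_col() +
  labs(x = "region")
print(p_sorted)
```

Passing .desc = TRUE to fct_reorder reverses the order, so the largest bar comes first.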
The default position for geom_col() (and geom_bar()) is "stack". When you have multiple bars at the same x, they stack. For side-by-side bars, pass position = "dodge".

Try it yourself
Try it: Plot a bar chart of mean mpg per cyl. Save to ex_plot.
Explanation: Compute mean per cyl, then plot directly with geom_col.
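A possible solution, assuming the exercise refers to the built-in mtcars dataset and using dplyr for the summary step; `mpg_by_cyl` and `mean_mpg` are illustrative names.

```r
library(ggplot2)
library(dplyr)

# Mean mpg per cylinder class
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop")

# Plot the pre-computed means directly and save the plot to ex_plot
ex_plot <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col()
print(ex_plot)
```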
Related ggplot2 functions
After mastering geom_col, look at:
- geom_bar(): count-based bars
- geom_histogram(): numeric distribution
- geom_point(): scatter
- position_dodge() / position_stack() / position_fill(): positioning
- coord_flip(): rotate axes
FAQ
What does geom_col do in ggplot2?
geom_col() draws bar charts where each bar's height comes directly from the y aesthetic. Equivalent to geom_bar(stat = "identity").
What is the difference between geom_col and geom_bar?
geom_col uses identity stat (height from y). geom_bar defaults to count stat (height from row count). Use geom_col for pre-computed values; geom_bar for raw counting.
How do I make bars side by side?
Pass position = "dodge": geom_col(position = "dodge"). Default is stack.
How do I sort bars by value?
Use forcats::fct_reorder(category, value) for the x aesthetic. ggplot draws factor levels in order.
Can I make horizontal bars with geom_col?
Yes. Either swap aesthetics: aes(value, category), or chain + coord_flip() after.