ggplot2 Bar Charts: geom_bar(), geom_col(), Stacked, Dodged and Ordered

Bar charts compare values across categories. In ggplot2, geom_bar() counts rows automatically while geom_col() uses a pre-computed height value, choosing the right one depends entirely on what your data looks like when it arrives.

By Selva Prabhakaran · Published May 12, 2026 · Last updated May 12, 2026

Introduction

Bar charts are deceptively simple to create but full of small traps. The most common confusion is which function to use, geom_bar() or geom_col(), and why using the wrong one produces a chart that either errors or lies about your data. The second most common trap is leaving bars in alphabetical order when the reader needs them sorted by size to extract meaning instantly.

In this tutorial, you will learn how to:

Choose between geom_bar() and geom_col() based on your data shape
Create stacked, dodged, and percent-stacked bar charts
Sort bars by value using fct_reorder()
Add data labels directly on or above bars
Flip to horizontal bars for long category names

All code blocks share a single WebR session, variables from earlier blocks carry forward.

What is the difference between geom_bar() and geom_col()?

This is the question that trips up nearly every ggplot2 beginner. Both functions draw bar charts, but they expect different input data.

Decision guide: geom_bar() for raw data, geom_col() for pre-computed values.

Figure 1: Decision guide: geom_bar() for raw data, geom_col() for pre-computed values.

geom_bar() takes raw, unaggregated data and counts how many rows fall into each category. You only supply x, no y needed. It runs stat_count() internally.

geom_col() takes data where you've already computed the heights (counts, averages, totals). You supply both x and y. It uses stat_identity(), it leaves the data as-is.

Let's set up data for both scenarios:

RRaw data versus precomputed averages

library(ggplot2) library(forcats) library(scales) # Scenario 1: raw data (one row per car) # mpg dataset - 234 rows, each row = one car model head(mpg[, c("class", "manufacturer", "hwy")]) # Scenario 2: pre-computed averages mpg_avg <- aggregate(hwy ~ class, data = mpg, FUN = mean) mpg_avg$hwy <- round(mpg_avg$hwy, 1) mpg_avg

Now use geom_bar() on the raw data, it counts how many cars fall in each class:

Rgeombar counts rows per class

# geom_bar: count rows per class automatically p_bar <- ggplot(mpg, aes(x = class)) + geom_bar(fill = "steelblue") + labs( title = "Number of Car Models by Class (geom_bar)", x = "Vehicle Class", y = "Count" ) p_bar

Now use geom_col() on the pre-computed averages, the bar height is the actual hwy value:

Rgeomcol plots precomputed averages

# geom_col: use pre-computed average highway mpg per class p_col <- ggplot(mpg_avg, aes(x = class, y = hwy)) + geom_col(fill = "tomato") + labs( title = "Average Highway MPG by Class (geom_col)", x = "Vehicle Class", y = "Avg Highway MPG" ) p_col

KEY INSIGHT: You can also use geom_bar(stat = "identity") as a substitute for geom_col(). They are equivalent. In modern ggplot2, geom_col() is the cleaner choice, it signals your intent clearly without the stat = "identity" override. But you will see both in the wild, so recognize them as the same thing.

Try it: Try adding a y aesthetic to geom_bar() without stat = "identity". What error does ggplot2 produce?

RTry it: Supply y to geombar

# Your code here, supply both x and y to geom_bar() and observe the result

Click to reveal solution

Rgeombar y-aesthetic error solution

ex_error <- tryCatch( ggplot(mpg, aes(x = class, y = hwy)) + geom_bar(), error = function(e) message("Error: ", e$message) ) #> Error: stat_count() can only have an x or y aesthetic.

geom_bar() defaults to stat = "count", which derives the y values by counting rows in each x bin, supplying your own y collides with that calculation. The error tells you exactly which stat is complaining. Either drop the y (and use geom_bar() for counts) or switch to geom_col() / geom_bar(stat = "identity") to use the supplied heights.

How do you create stacked and dodged bar charts?

When your data has a second categorical variable (like drive type within each vehicle class), the position argument controls how bars within each x-category are arranged.

Position options for grouped bars: dodge, stack, and fill.

Figure 2: Position options for grouped bars: dodge, stack, and fill.

Stacked bars (default) place sub-categories on top of each other, showing totals and composition simultaneously:

RStacked bar by drive type

# Stacked bar: fill = drv adds drive type as a second categorical dimension p_stack <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "stack") + scale_fill_brewer( palette = "Set2", labels = c("4" = "4WD", "f" = "Front", "r" = "Rear") ) + labs( title = "Car Count by Class and Drive Type (Stacked)", x = "Vehicle Class", y = "Count", fill = "Drive Type" ) p_stack

Dodged bars place sub-categories side-by-side, making it easier to compare values within each group:

RDodged bar side by side

# Dodged bar: sub-categories placed side by side p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = position_dodge(preserve = "single")) + scale_fill_brewer( palette = "Set2", labels = c("4" = "4WD", "f" = "Front", "r" = "Rear") ) + labs( title = "Car Count by Class and Drive Type (Dodged)", x = "Vehicle Class", y = "Count", fill = "Drive Type" ) p_dodge

preserve = "single" keeps all bars the same width even when some groups have fewer sub-categories. Without it, bars in sparse groups stretch wider to fill space.

Percent-stacked bars normalize each stack to 100%, letting you compare proportions across groups regardless of total count:

RPercent-stacked drive type mix

p_fill <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "fill") + scale_y_continuous(labels = scales::percent) + scale_fill_brewer( palette = "Set2", labels = c("4" = "4WD", "f" = "Front", "r" = "Rear") ) + labs( title = "Drive Type Mix by Vehicle Class (Percent Stacked)", x = "Vehicle Class", y = "Proportion", fill = "Drive Type" ) p_fill

TIP: Use stacked when the total height matters (e.g., total sales volume by region). Use dodged when you need to compare sub-group values directly (e.g., sales per quarter by product). Use percent-stacked when the mix/proportion matters more than the absolute count (e.g., market share over time).

Try it: Change position = "fill" to position = "stack" in p_fill. How does the chart interpretation change?

RTry it: Switch fill to stack

# Your code here, swap "fill" for "stack" and drop the percent y-axis

Click to reveal solution

RStack switch solution

ex_stack_switch <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "stack") + scale_fill_brewer(palette = "Set2") ex_stack_switch

position = "stack" shows the raw counts per drive type, so each bar's total height tells you how many cars of that class are in the dataset, suv is now visibly the tallest. position = "fill" normalises every bar to 100%, which masks the count differences but makes the mix across classes directly comparable. Pick stack when totals matter, fill when proportions matter.

How do you reorder bars by value?

Bars in alphabetical order are almost never the right choice. A reader's eye moves from left to right, sorting by descending value puts the most important category first and makes comparisons effortless. The fct_reorder() function from the forcats package handles this.

RReorder bars by average MPG

# Sort vehicle classes by average highway mpg (ascending for coord_flip) mpg_avg$class_ordered <- fct_reorder(mpg_avg$class, mpg_avg$hwy) p_ordered <- ggplot(mpg_avg, aes(x = class_ordered, y = hwy)) + geom_col(fill = "steelblue") + labs( title = "Average Highway MPG by Vehicle Class (sorted)", x = "Vehicle Class", y = "Avg Highway MPG" ) p_ordered

fct_reorder(factor, numeric) reorders the levels of class by the values in hwy. Since bar charts read left-to-right by default, the leftmost bar is the smallest and the rightmost is the largest. When you flip to horizontal (next section), this naturally becomes top-to-bottom descending.

TIP: For frequency-sorted bars from raw data (using geom_bar()), use fct_infreq() instead: aes(x = fct_infreq(class)). It reorders the factor by count automatically, no pre-aggregation needed.

Try it: Change fct_reorder(mpg_avg$class, mpg_avg$hwy) to fct_reorder(mpg_avg$class, -mpg_avg$hwy) (note the minus sign). How does the bar order change?

RTry it: Sort bars descending

# Your code here, negate the sort key to flip the direction

Click to reveal solution

RDescending sort solution

ex_desc <- ggplot(mpg_avg, aes(x = fct_reorder(class, -hwy), y = hwy)) + geom_col(fill = "steelblue") + labs(x = "Vehicle Class", y = "Avg Highway MPG") ex_desc

Negating the sort key (-hwy) reverses the ordering, bars now read left-to-right from highest to lowest mpg. This is the natural reading direction for vertical bar charts because the eye lands on the most important value first. For horizontal bars (after coord_flip()), the ascending version is usually better since it puts the largest bar at the top.

How do you add labels to bar charts?

Labels on bars let readers read exact values without estimating from the axis. The positioning depends on whether you want labels inside, above, or at the end of each bar.

RAdd value labels above bars

# Add value labels above bars (vjust controls vertical position) p_label <- ggplot(mpg_avg, aes(x = fct_reorder(class, hwy), y = hwy)) + geom_col(fill = "steelblue", width = 0.7) + geom_text( aes(label = hwy), vjust = -0.4, # above the bar size = 3.5, color = "grey20" ) + scale_y_continuous(expand = expansion(mult = c(0, 0.12))) + labs( title = "Avg Highway MPG by Vehicle Class", x = "Vehicle Class", y = "Avg Highway MPG" ) p_label

expand = expansion(mult = c(0, 0.12)) adds 12% headroom above the tallest bar so the top labels don't get clipped. Without this, labels above the highest bar disappear.

For labels inside the bar (useful for long bars with enough room), change vjust and color:

RPlace labels inside the bars

# Labels inside the bars (good when bars are tall enough) p_label_in <- ggplot(mpg_avg, aes(x = fct_reorder(class, hwy), y = hwy)) + geom_col(fill = "steelblue", width = 0.7) + geom_text( aes(label = hwy), vjust = 1.6, # inside the bar, near top size = 3.5, color = "white", fontface = "bold" ) + labs(x = "Vehicle Class", y = "Avg Highway MPG") p_label_in

WARNING: Inside labels fail for short bars, the text overflows the bar boundary and becomes unreadable. Check your data range before committing to inside placement. A safe rule: use inside labels only when all bars are at least 30% of the max bar height.

Try it: Change vjust = -0.4 to vjust = 2 in p_label. Does the label move inside the bar? Does it still look readable?

RTry it: Push labels inside bar

# Your code here, push vjust positive and switch the text colour

Click to reveal solution

RInside labels solution

ex_label_pos <- ggplot(mpg_avg, aes(x = fct_reorder(class, hwy), y = hwy)) + geom_col(fill = "steelblue") + geom_text(aes(label = hwy), vjust = 2, color = "white", size = 3.5) ex_label_pos

vjust = 2 shifts the label down by two text-line heights, planting it inside the top of the bar, and switching to color = "white" keeps it readable against the steelblue fill. The trick only works while every bar is tall enough to contain the text; the shortest bars in this set get crowded, which is exactly the trade-off the WARNING above mentions.

How do you make a horizontal bar chart?

Horizontal bars are easier to read when category names are long. coord_flip() rotates the entire chart 90 degrees, x becomes y and vice versa. Apply it to any of the charts built so far:

RFlip to horizontal with coordflip

# Flip p_ordered to horizontal - long names are now easy to read p_horiz <- p_ordered + coord_flip() + labs( title = "Average Highway MPG by Class (horizontal)", x = NULL, # remove redundant y-axis label after flip y = "Avg Highway MPG" ) p_horiz

Because p_ordered was already sorted ascending by fct_reorder(), after flipping, the chart reads top-to-bottom from highest to lowest, the most natural reading direction for a ranked list.

KEY INSIGHT: In newer ggplot2 (3.3+), you can also achieve horizontal bars by swapping x and y in aes() directly, aes(y = class, x = hwy), without coord_flip(). The advantage is that axis labels stay in their natural orientation without flipping. The disadvantage is that fct_reorder() with ascending order now produces top-to-bottom descending without needing to flip, which can be confusing. Both approaches work, coord_flip() is slightly more intuitive for beginners.

Try it: Apply coord_flip() to p_fill (the percent-stacked chart). Does the horizontal layout make the proportion comparison easier?

RTry it: coordflip the percent stack

# Your code here, add coord_flip() to p_fill

Click to reveal solution

RFlipped percent stack solution

ex_horiz_fill <- p_fill + coord_flip() ex_horiz_fill

After coord_flip() the bars run left-to-right with class names listed down the y-axis, so long names like subcompact no longer need to be tilted to fit. The proportion segments now read as horizontal slices, which mirrors how the eye scans rows in a table, easier than comparing vertical sub-sections at a glance.

Common Mistakes and How to Fix Them

Mistake 1: Using geom_bar() with pre-computed data and a y aesthetic

❌ This produces an error because geom_bar() tries to count rows and conflicts with the supplied y variable:

RCommon mistake: y with geombar

# Wrong: geom_bar can't handle a y aesthetic by default ggplot(mpg_avg, aes(x = class, y = hwy)) + geom_bar()

✅ Use geom_col() for pre-computed values:

RCorrect: geomcol for precomputed values

ggplot(mpg_avg, aes(x = class, y = hwy)) + geom_col()

Mistake 2: Leaving bars in alphabetical order

❌ Alphabetical order rarely communicates a meaningful ranking. Readers can't immediately spot the highest or lowest value.

✅ Sort by value with fct_reorder(): aes(x = fct_reorder(class, hwy)) or use fct_infreq() for count data.

Mistake 3: Using color= instead of fill= for bar color

❌ geom_bar(color = "steelblue") colors only the outline of the bars, not the interior:

RCommon mistake: color outlines only

# Wrong: outlines only ggplot(mpg, aes(x = class)) + geom_bar(color = "steelblue")

✅ Use fill to color the bar interior:

RCorrect: fill for bar interior

ggplot(mpg, aes(x = class)) + geom_bar(fill = "steelblue")

Mistake 4: Labels getting clipped at the top

❌ Adding geom_text(vjust = -0.4) without expanding the y-axis clips labels above the tallest bar.

✅ Add scale_y_continuous(expand = expansion(mult = c(0, 0.12))) to add headroom above the highest bar.

Mistake 5: Dodged bars with unequal widths

❌ position = "dodge" by default makes narrow bars for sparse groups, categories with fewer sub-groups get wider bars.

✅ Use position = position_dodge(preserve = "single") to maintain consistent bar widths across all groups.

Practice Exercises

Exercise 1: Diamond cut bar chart

Using the diamonds dataset, create two bar charts side-by-side (use gridExtra::grid.arrange() or patchwork):

A bar chart of cut frequency using geom_bar(), bars sorted from highest to lowest count
A bar chart of average price per cut using geom_col(), bars sorted by price

Are the highest-frequency cuts also the most expensive?

RExercise: Diamonds cut frequency and price

# Starter code # library(patchwork) or library(gridExtra) # Chart 1: Frequency by cut # ggplot(diamonds, aes(x = fct_infreq(cut))) + geom_bar(fill = "steelblue") # Chart 2: Average price by cut # diamonds_avg <- aggregate(price ~ cut, diamonds, mean) # ggplot(diamonds_avg, aes(x = fct_reorder(cut, price, .desc = TRUE), y = price)) + # geom_col(fill = "tomato")

Exercise 2: Stacked vs percent-stacked comparison

Using the mpg dataset, create a stacked and a percent-stacked bar chart of class vs drv (drive type). Then answer: which vehicle classes are most dominated by front-wheel drive? Does the absolute count chart or the proportion chart make this clearer?

RExercise: Stacked and percent-stacked drv

# Starter: stacked counts # p1 <- ggplot(mpg, aes(x = class, fill = drv)) + # geom_bar(position = "stack") # Starter: percent stacked # p2 <- ggplot(mpg, aes(x = class, fill = drv)) + # geom_bar(position = "fill") + # scale_y_continuous(labels = scales::percent)

Complete Example

This final chart combines everything: pre-computed averages, sorted bars, data labels, a clean theme, and colorblind-friendly colors, ready for a report or presentation.

REnd-to-end top-10 manufacturer chart

# Pre-compute mean highway mpg per manufacturer (top 10 by mpg) mfr_avg <- aggregate(hwy ~ manufacturer, data = mpg, FUN = mean) mfr_avg$hwy <- round(mfr_avg$hwy, 1) mfr_top10 <- head(mfr_avg[order(-mfr_avg$hwy), ], 10) p_final <- ggplot( mfr_top10, aes(x = fct_reorder(manufacturer, hwy), y = hwy) ) + geom_col(fill = "#2166ac", width = 0.75) + geom_text( aes(label = hwy), hjust = -0.2, size = 3.5, color = "grey20" ) + coord_flip() + scale_x_discrete(labels = tools::toTitleCase) + scale_y_continuous( limits = c(0, 35), expand = expansion(mult = c(0, 0)) ) + labs( title = "Top 10 Manufacturers by Avg Highway MPG", subtitle = "Based on EPA fuel economy data (mpg dataset)", x = NULL, y = "Average Highway MPG", caption = "Source: ggplot2::mpg" ) + theme_minimal(base_size = 13) + theme( panel.grid.major.y = element_blank(), axis.text.y = element_text(face = "bold") ) p_final

element_blank() removes the horizontal grid lines, they're redundant when exact values appear as labels. The bold manufacturer names draw attention to the categories, not the bars themselves.

Summary

Task	Code
Count rows per category	`geom_bar()`
Use pre-computed heights	`geom_col()`
Stack sub-categories	`geom_bar(position = "stack")`
Side-by-side sub-categories	`geom_bar(position = position_dodge(preserve = "single"))`
Proportion (percent) stacked	`geom_bar(position = "fill")`
Sort by value	`aes(x = fct_reorder(var, value))`
Sort by frequency	`aes(x = fct_infreq(var))`
Labels above bars	`geom_text(aes(label = val), vjust = -0.4)`
Labels inside bars	`geom_text(aes(label = val), vjust = 1.6, color = "white")`
Horizontal bars	`+ coord_flip()`
Fill vs outline color	`fill = "color"` (interior) vs `color = "color"` (border)

Key rules:

Use geom_bar() for raw data (counts automatically), geom_col() for pre-aggregated data
Sort by value with fct_reorder(), alphabetical order is almost never informative
Add headroom for above-bar labels: scale_y_continuous(expand = expansion(mult = c(0, 0.12)))
Use fill (not color) to change bar fill color

FAQ

Can I use geom_bar() with stat = "identity" instead of geom_col()?

Yes, they are equivalent. geom_bar(stat = "identity") and geom_col() produce identical output. geom_col() was added to ggplot2 to make the intent clearer, use it when you have pre-computed values.

How do I change bar width?

Use the width argument: geom_col(width = 0.5). The default is 0.9 (90% of the space between x positions). Lower values create narrower bars with more white space between them.

How do I sort a stacked bar chart by one sub-group's size?

This requires reordering the factor before plotting. Compute the values for the sub-group you want to sort by, then use fct_reorder() on that subset's values. Alternatively, use fct_reorder2() for sorting by a two-variable function.

How do I add a reference line to a bar chart?

Use geom_hline(yintercept = value) for horizontal reference lines (or geom_vline() after coord_flip()). For example: + geom_hline(yintercept = mean(mpg_avg$hwy), linetype = "dashed", color = "grey50").

Why do my bars have gaps at the x-axis baseline?

By default, ggplot2 adds padding below the x-axis. To remove it: scale_y_continuous(expand = expansion(mult = c(0, 0.05))), the first value (0) removes padding at the bottom, the second (0.05) adds 5% headroom at the top.

References

Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer. https://ggplot2-book.org/
ggplot2 reference, geom_bar() and geom_col(). https://ggplot2.tidyverse.org/reference/geom_bar.html
forcats reference, fct_reorder(). https://forcats.tidyverse.org/reference/fct_reorder.html
Wilke, C. O. (2019). Fundamentals of Data Visualization, Chapter 6: Visualizing Amounts. https://clauswilke.com/dataviz/
R Graph Gallery, Bar Charts. https://r-graph-gallery.com/barplot.html
Healy, K. (2018). Data Visualization: A Practical Introduction, Chapter 4. https://socviz.co/

Continue Learning

ggplot2 Scatter Plots, explore relationships between two continuous variables with geom_point(), color mapping, and trend lines.
ggplot2 Distribution Charts, compare distributions with histograms, boxplots, and violin plots, a natural complement to bar charts when you need more than a single summary value per group.
ggplot2 Line Charts, track change over time with geom_line(), grouped by category and styled with linetypes.

Navigate

Tidyverse packages

Deep dives

Wrangling & EDA

Statistics

Machine Learning

Time Series

By Industry

Reporting & Apps

Levels

ggplot2 Bar Charts: geom_bar(), geom_col(), Stacked, Dodged and Ordered

Introduction

What is the difference between geom_bar() and geom_col()?

How do you create stacked and dodged bar charts?

How do you reorder bars by value?

How do you add labels to bar charts?

How do you make a horizontal bar chart?

Common Mistakes and How to Fix Them

Mistake 1: Using geom_bar() with pre-computed data and a y aesthetic

Mistake 2: Leaving bars in alphabetical order

Mistake 3: Using color= instead of fill= for bar color

Mistake 4: Labels getting clipped at the top

Mistake 5: Dodged bars with unequal widths

Practice Exercises

Exercise 1: Diamond cut bar chart

Exercise 2: Stacked vs percent-stacked comparison

Complete Example

Summary

FAQ

References

Continue Learning

Further Reading

Navigate

Tidyverse packages

Deep dives

Wrangling & EDA

Statistics

Machine Learning

Time Series

By Industry

Reporting & Apps

Levels

ggplot2 Bar Charts: geom_bar(), geom_col(), Stacked, Dodged and Ordered

Introduction

What is the difference between geom_bar() and geom_col()?

How do you create stacked and dodged bar charts?

How do you reorder bars by value?

How do you add labels to bar charts?

How do you make a horizontal bar chart?

Common Mistakes and How to Fix Them

Mistake 1: Using geom_bar() with pre-computed data and a y aesthetic

Mistake 2: Leaving bars in alphabetical order

Mistake 3: Using color= instead of fill= for bar color

Mistake 4: Labels getting clipped at the top

Mistake 5: Dodged bars with unequal widths

Practice Exercises

Exercise 1: Diamond cut bar chart

Exercise 2: Stacked vs percent-stacked comparison

Complete Example

Summary

FAQ

References

Continue Learning

Further Reading

Related Tutorials