Ridgeline Plot in R: Compare Many Distributions with ggridges
A ridgeline plot stacks density curves vertically — one per group — letting you compare many distributions at once without the clutter of overlapping violin plots. The ggridges package brings this chart type to ggplot2 with a single geom and a clean API.
Introduction
Once you have more than 5-6 groups to compare, violin plots become a wall of shapes that's hard to scan. Ridgeline plots (sometimes called joy plots) solve this by stacking the density curves vertically with slight overlap between rows — the mountain ridge shape that gives them their name.
Each curve shows the same information as a density plot — where values cluster, how spread they are, whether the distribution is symmetric or skewed — but the stacked layout lets you read down the page and compare groups naturally, the way you would scan a table.
The ggridges package by Claus Wilke integrates seamlessly with ggplot2. You replace geom_density() with geom_density_ridges() and add a y aesthetic that maps to your grouping variable — everything else follows the ggplot2 grammar you already know.
In this tutorial you will learn:
- How to draw a basic ridgeline plot with geom_density_ridges()
- How to color ridges by group or use gradient fills
- How to add quantile lines and jitter points
- How to adjust bandwidth for smoother or more detailed curves
- When ridgeline plots are the right choice over violin plots
How Does geom_density_ridges() Stack Distributions?
geom_density_ridges() draws a density curve for each level of the y aesthetic, stacked vertically from bottom to top. The x aesthetic is the continuous variable whose distribution you're showing; the y aesthetic is the grouping factor.
theme_ridges() is a minimal theme from the ggridges package. By default it keeps light major grid lines, drops the minor ones, and aligns the y-axis labels with the ridge baselines to complement the stacked layout. You can also use theme_minimal() or any other ggplot2 theme.
The curves overlap slightly by default. The amount of overlap is controlled by the scale parameter (not to be confused with ggplot2 scale functions). scale = 1 means the tallest curve in the plot just touches the baseline of the ridge above it; values above 1 make ridges overlap, and values below 1 leave a gap between rows.
reorder(class, hwy, FUN = median) sorts the y-axis by median highway MPG, so the most fuel-efficient class sits at the top and the least efficient at the bottom. rel_min_height = 0.01 trims the long thin tails of each ridge where the density falls below 1% of the highest point among all ridges.
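A minimal sketch of the plot described above, using the mpg dataset that ships with ggplot2. The scale value and axis labels are illustrative assumptions, not settings taken from the text:

```r
library(ggplot2)
library(ggridges)

# x = continuous variable, y = grouping factor sorted by median hwy
p_basic <- ggplot(mpg, aes(x = hwy, y = reorder(class, hwy, FUN = median))) +
  geom_density_ridges(
    scale = 1.5,            # assumed value: moderate overlap between rows
    rel_min_height = 0.01   # trim tails below 1% of the tallest point
  ) +
  labs(x = "Highway MPG", y = "Vehicle class") +
  theme_ridges()

p_basic
```

Printing the plot also emits a "Picking joint bandwidth" message, which reports the bandwidth ggridges chose for all ridges.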
KEY INSIGHT: Sort the y-axis by a meaningful statistic (median, mean, or range). Alphabetical order hides patterns — sorted order lets you immediately see which group is highest, which is lowest, and whether groups form natural clusters.
Try it: Change FUN = median to FUN = mean. Does the group ordering change significantly?
How Do You Color Ridges by Group or Apply Gradient Fills?
The simplest coloring strategy maps the grouping variable to fill — each ridge gets a distinct color:
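One way this could look with the mpg data; the Brewer palette and alpha value are assumptions chosen for illustration:

```r
library(ggplot2)
library(ggridges)

p_fill <- ggplot(
  mpg,
  aes(x = hwy, y = reorder(class, hwy, FUN = median), fill = class)
) +
  geom_density_ridges(alpha = 0.8, rel_min_height = 0.01) +
  scale_fill_brewer(palette = "Set2") +   # assumed palette; 7 classes fit in Set2
  labs(x = "Highway MPG", y = "Vehicle class") +
  theme_ridges() +
  theme(legend.position = "none")         # y-axis labels already name each group

p_fill
```

The legend is hidden because the fill repeats what the y-axis labels already say.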
For a more sophisticated look, use a gradient fill where the color within each ridge encodes the x-value magnitude. The ggridges fill aesthetic supports this with fill = after_stat(x) — colors shift from cool to warm as x increases:
geom_density_ridges_gradient() is a variant of geom_density_ridges() specifically designed for gradient fills — it splits each ridge into many thin vertical slices, each colored by its x-position. after_stat(x) maps the computed x value (from the density estimation) to the fill aesthetic.
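A short sketch of a gradient-filled ridgeline on the mpg data. The "plasma" option matches the exercise that follows; the remaining settings are assumptions:

```r
library(ggplot2)
library(ggridges)

p_grad <- ggplot(
  mpg,
  aes(x = hwy, y = reorder(class, hwy, FUN = median), fill = after_stat(x))
) +
  # the gradient variant is required; plain geom_density_ridges() fills each ridge uniformly
  geom_density_ridges_gradient(scale = 1.5, rel_min_height = 0.01) +
  scale_fill_viridis_c(option = "plasma", name = "Highway MPG") +
  labs(x = "Highway MPG", y = "Vehicle class") +
  theme_ridges()

p_grad
```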
TIP: Gradient fills are visually striking but encode the x-variable twice — once on the horizontal axis and again as color. This is redundant but it draws attention to the distribution shape and makes the chart more memorable. Use it when the chart is standalone (a report cover, a presentation slide) rather than in dense analytical dashboards.
Try it: Change option = "plasma" to option = "magma" in scale_fill_viridis_c(). How does the color temperature change?
How Do You Add Quantile Lines and Jitter Points?
stat_density_ridges() is the underlying stat for ridgeline density computation. It accepts a quantile_lines = TRUE argument that draws vertical lines at specified quantiles across each ridge — a quick way to show where the median and quartiles fall without an embedded boxplot.
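As a sketch, quartile lines on the mpg ridges might be drawn like this. The quantiles vector and alpha value are illustrative choices:

```r
library(ggplot2)
library(ggridges)

p_quant <- ggplot(mpg, aes(x = hwy, y = reorder(class, hwy, FUN = median))) +
  stat_density_ridges(
    quantile_lines = TRUE,
    quantiles = c(0.25, 0.5, 0.75),  # lower quartile, median, upper quartile
    alpha = 0.7
  ) +
  labs(x = "Highway MPG", y = "Vehicle class") +
  theme_ridges()

p_quant
```

Passing a single integer instead (for example quantiles = 2) asks for that many equal-sized quantile groups, so 2 draws only the median.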
For small datasets, showing individual data points on top of the ridgeline gives readers raw data context. Set jittered_points = TRUE directly in geom_density_ridges():
position_raincloud() places jitter points below the density ridge rather than inside it — the "raincloud" layout that shows both the cloud (density) and the rain (data points) in a compact arrangement.
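A minimal raincloud sketch on the small iris dataset (50 points per group). The object name p_jitter follows the exercise below; the height, scale, and point styling are assumptions:

```r
library(ggplot2)
library(ggridges)

p_jitter <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(
    jittered_points = TRUE,
    position = position_raincloud(height = 0.2),  # points drawn below each ridge
    point_size = 1,
    point_alpha = 0.6,
    alpha = 0.7,
    scale = 0.9   # below 1 so ridges do not run into the points of the row above
  ) +
  theme_ridges()

p_jitter
```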
WARNING: jittered_points = TRUE works well only for small to medium datasets (under ~200 points per group). With large datasets, the points form dense bands that obscure the density curve they're supposed to annotate. For large data, use the density ridge alone.
Try it: Remove position = position_raincloud(...) from p_jitter. How does the position of the jitter points change?
When Should You Use a Ridgeline Plot Instead of a Violin Plot?
This is the most practical question about ridgeline plots. Both show distribution shape, so the choice comes down to number of groups and the direction of comparison.
| Situation | Best Choice | Reason |
|---|---|---|
| 2-5 groups | Violin plot | Side-by-side violins are easier to compare at low count |
| 6-15 groups | Ridgeline plot | Stacked layout avoids a wide, cluttered chart |
| 15+ groups | Ridgeline plot (or faceted density) | Violins become unreadable at this scale |
| Comparing across time (months, years) | Ridgeline plot | Temporal ordering reads naturally top-to-bottom |
| Showing bimodal distributions clearly | Either — but ridgeline may show peaks more clearly | More horizontal space per curve in ridgeline |
| Embedding in a dashboard or tight layout | Violin plot | More compact width for a few groups |
The built-in lincoln_weather dataset from ggridges is a classic ridgeline example — 12 months of temperature data, where the stacked layout makes seasonal progression immediately readable:
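A minimal sketch of such a plot, using the `Mean Temperature [F]` and Month columns of lincoln_weather. The scale, fill color, and labels are illustrative assumptions:

```r
library(ggplot2)
library(ggridges)

p_weather <- ggplot(lincoln_weather, aes(x = `Mean Temperature [F]`, y = Month)) +
  geom_density_ridges(
    scale = 2.5,
    rel_min_height = 0.01,
    fill = "steelblue",
    alpha = 0.7
  ) +
  scale_y_discrete(limits = rev) +   # reverse the default bottom-to-top month order
  labs(x = "Mean temperature [F]", y = NULL) +
  theme_ridges()

p_weather
```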
The seasonal pattern jumps out immediately: cold, wide distributions in the winter months (temperatures vary a lot from day to day); warm, narrower distributions in summer (values cluster tightly around a high peak). A violin plot of 12 groups would be a visual mess.
KEY INSIGHT: Ridgeline plots shine when the order of groups carries meaning — time series (months, years), ranked categories (score bands), or any sequence where reading top-to-bottom tells a story. When groups are unordered, violin plots or boxplots are usually better.
Try it: Delete the scale_y_discrete(limits = rev) line. Does January or December now appear at the top?
Common Mistakes and How to Fix Them
Mistake 1: Using ridgeline plots with too few groups
❌ A ridgeline plot with 2-3 groups wastes vertical space and is harder to compare than side-by-side violin plots or faceted density plots.
✅ Use ridgeline plots when you have at least 5-6 groups. For fewer groups, geom_violin() or geom_density() with facet_wrap() is cleaner.
Mistake 2: Leaving groups in alphabetical or arbitrary order
❌ Alphabetical group ordering hides any meaningful ranking and forces readers to mentally reorder the data.
✅ Sort by a meaningful statistic: reorder(group, x, FUN = median) sorts by median. For time-ordered groups (months, years), use factor(month, levels = month.name) to fix the calendar order.
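A small sketch of fixing calendar order. The data frame df and its columns are hypothetical, simulated purely for illustration:

```r
library(ggplot2)
library(ggridges)

set.seed(42)
# Hypothetical data: month stored as plain character, 30 simulated values per month
df <- data.frame(
  month = rep(month.name, each = 30),
  value = rnorm(360, mean = rep(sin(seq(0, pi, length.out = 12)) * 20, each = 30), sd = 5)
)

# Without this step the months would be sorted alphabetically (April, August, ...)
df$month <- factor(df$month, levels = month.name)

p_months <- ggplot(df, aes(x = value, y = month)) +
  geom_density_ridges(rel_min_height = 0.01) +
  scale_y_discrete(limits = rev) +   # put January at the top
  theme_ridges()

p_months
```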
Mistake 3: Setting scale too high — ridges cover earlier ones
❌ scale = 5 makes tall peaks reach several rows up, covering the curves and labels of the rows above and turning the plot into overlapping spaghetti.
✅ Start with scale = 1 (the tallest ridge just touches the baseline above it) and increase gradually. For most datasets, a scale between 1.5 and 2 looks clean without excessive overlap.
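To see the effect directly, one option is to build the same plot at several scale values and compare them. The values chosen here are illustrative:

```r
library(ggplot2)
library(ggridges)

# One plot per scale value, all on the same mpg data
scale_plots <- lapply(c(1, 1.5, 2, 5), function(s) {
  ggplot(mpg, aes(x = hwy, y = class)) +
    geom_density_ridges(scale = s, rel_min_height = 0.01) +
    ggtitle(paste("scale =", s)) +
    theme_ridges()
})

scale_plots[[1]]  # ridges just touch
scale_plots[[4]]  # scale = 5: heavy overlap
```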
Mistake 4: Using gradient fill with too many groups
❌ geom_density_ridges_gradient() with 15+ groups renders slowly and can produce a visually overwhelming chart.
✅ For large group counts, use a flat fill per ridge instead: map aes(fill = group) and add scale_fill_viridis_d() for discrete colorblind-safe colors.
Mistake 5: Forgetting to trim long tails with rel_min_height
❌ Without rel_min_height, density curves extend into very long thin tails that visually suggest data exists far outside the actual range.
✅ Set rel_min_height = 0.01 to trim any part of the density curve that falls below 1% of the peak — this keeps the plot clean without losing meaningful information.
Practice Exercises
Exercise 1: Monthly airline passenger distributions
Using the built-in AirPassengers dataset, convert it to a data frame with month and passengers columns. Create a ridgeline plot of passenger count by month (sorted January at top to December at bottom). Use a gradient fill with scale_fill_viridis_c(option = "viridis").
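One possible starting point for the conversion step, using only base R. The data frame name ap is an arbitrary choice, and the plot itself is left as the exercise:

```r
# AirPassengers is a monthly ts object; cycle() gives each observation's month index (1-12)
ap <- data.frame(
  month = factor(month.name[as.integer(cycle(AirPassengers))], levels = month.name),
  passengers = as.numeric(AirPassengers)
)

head(ap)
```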
Exercise 2: Compare ridgeline vs violin for diamonds data
Using diamonds, create both a ridgeline plot and a violin plot of price by cut. Then compare: which chart makes it easier to see that "Premium" cut has a very wide price range while "Ideal" cut clusters more tightly? Which layout is more compact?
Complete Example
The lincoln_weather dataset from ggridges (already shown in the comparison section) is the canonical ridgeline example. Here's a fully polished version with annotation:
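A sketch of what a polished version could look like. The titles, the freezing-point annotation, and the palette are assumptions made for illustration:

```r
library(ggplot2)
library(ggridges)

p_final <- ggplot(
  lincoln_weather,
  aes(x = `Mean Temperature [F]`, y = Month, fill = after_stat(x))
) +
  geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01, color = "white") +
  # reference line and label at the freezing point (32 F)
  geom_vline(xintercept = 32, linetype = "dashed", color = "grey30") +
  annotate("text", x = 33, y = 0.7, label = "Freezing (32 F)", hjust = 0, size = 3.5) +
  scale_fill_viridis_c(option = "plasma", name = "Temp. [F]") +
  scale_y_discrete(limits = rev) +   # January at the top
  labs(
    title = "Temperatures in Lincoln, NE",
    subtitle = "Mean daily temperature by month, 2016",
    x = "Mean temperature [F]",
    y = NULL
  ) +
  theme_ridges(font_size = 13, grid = TRUE)

p_final
```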
Summary
| Task | Code |
|---|---|
| Basic ridgeline | geom_density_ridges() |
| Control overlap | geom_density_ridges(scale = 1.5) |
| Trim long tails | geom_density_ridges(rel_min_height = 0.01) |
| Fill by group | aes(fill = group) + scale_fill_brewer() |
| Gradient fill | geom_density_ridges_gradient() + aes(fill = after_stat(x)) |
| Quantile lines | stat_density_ridges(quantile_lines = TRUE, quantiles = c(0.25, 0.75)) |
| Jitter points | geom_density_ridges(jittered_points = TRUE) |
| Raincloud layout | position = position_raincloud() |
| Sort by median | y = reorder(group, x, FUN = median) |
| Clean theme | theme_ridges() |
When to use ridgelines:
- 5+ groups — ridgeline is more readable than side-by-side violins
- Time-ordered groups — months, years, age bands read naturally top-to-bottom
- Showing distributional shift across categories — each ridge's shape and position tells the story at a glance
FAQ
Does ggridges need to be installed separately from ggplot2?
Yes. Install once with install.packages("ggridges"), then load in each session with library(ggridges). The ggridges package is on CRAN and maintained by Claus Wilke.
What is the difference between scale and bandwidth in geom_density_ridges()?
scale controls the height and overlap of the ridges: how far each ridge extends into the row above it. The bandwidth (set via the bandwidth argument) controls the smoothness of the kernel density estimate within each ridge, similar to bw in geom_density() or geom_violin(). These are independent settings.
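A brief sketch contrasting two bandwidths at a fixed scale on the iris data. The specific bandwidth values are assumptions, expressed in the units of the x variable:

```r
library(ggplot2)
library(ggridges)

base <- ggplot(iris, aes(x = Sepal.Length, y = Species)) + theme_ridges()

# Same scale (overlap), different bandwidth (smoothness)
p_smooth <- base + geom_density_ridges(bandwidth = 0.3,  scale = 1.5)
p_detail <- base + geom_density_ridges(bandwidth = 0.08, scale = 1.5)

p_smooth
p_detail
```

If bandwidth is left unset, ggridges picks one joint bandwidth for all ridges and reports it in a message.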
How do I draw a ridgeline plot with a discrete x-axis?
geom_density_ridges() is designed for continuous x. For discrete x (counts per category per group), use a heatmap (geom_tile()) or a faceted bar chart instead.
Can I use a cyclic y-axis (e.g., months wrapping from December back to January)?
Not natively in ggridges. The y-axis is a standard discrete scale. For cyclical time data, use factor(month, levels = month.name) to set calendar order, then let the year wrap at the chart boundary.
How do I make ridgelines fill to the baseline (no transparency below the curve)?
Ridges are already filled from the curve down to their own baseline, and the fill is opaque unless you lower alpha. If an earlier setting made the fill transparent, set alpha = 1 again, and add color = "white" for visible edges between overlapping ridges.
References
- Wilke, C. O. ggridges package documentation. https://wilkelab.org/ggridges/
- Wilke, C. O. (2019). Fundamentals of Data Visualization, Chapter 9: Visualizing Many Distributions. https://clauswilke.com/dataviz/
- ggridges CRAN page and vignettes. https://cran.r-project.org/package=ggridges
- Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer. https://ggplot2-book.org/
- R Graph Gallery — Ridgeline Charts. https://r-graph-gallery.com/ridgeline-plot.html
What's Next?
- ggplot2 Distribution Charts — the complete guide to histograms, density plots, boxplots, and violin plots — the foundation that ridgeline plots extend.
- Violin Plot in R — similar distribution visualization with a different emphasis; better for 2-5 groups with embedded boxplots.
- R Color Theory — apply gradient fills and colorblind-safe palettes (like viridis) to ridgeline and other charts.