Treemap in R with treemapify and ggplot2
A treemap divides a rectangle into tiles — each tile represents one observation, and its area encodes a numeric value. In R, the treemapify package brings treemaps into the ggplot2 grammar with geom_treemap().
Introduction
When you have many categories and want to show their proportional sizes, a bar chart works well up to about 20-30 bars. Beyond that, the chart becomes a wall of color with labels you can barely read.
A treemap solves this by packing every category into a single rectangle. Bigger categories get bigger tiles — the proportions are immediately visible even with 50+ categories. Add a second variable as color, and you're encoding two dimensions in one compact chart.
The tradeoff: treemaps are harder to compare precisely. You can tell that one tile is roughly twice the size of another, but you can't read exact values without labels. They're a visualization for "big picture proportions," not precise comparisons.
The treemapify package integrates with ggplot2, so all your usual theme, scale, and annotation tools work exactly as expected.
How do you create a basic treemap in R?
treemapify works as a ggplot2 extension. Load both packages, build a data frame, and use geom_treemap() with an area aesthetic.
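A minimal sketch of those steps follows. The data frame name phones and all market-share numbers are invented for illustration; they are not real figures.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values (percent), made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  stringsAsFactors = FALSE
)

# area sizes each tile; fill colors each tile by company
p_basic <- ggplot(phones, aes(area = share, fill = company)) +
  geom_treemap()

p_basic
```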
The area aesthetic is the only required aesthetic — it controls the tile size. fill maps a variable to tile color. Without labels, though, you can't tell which tile belongs to which company.
Try it: Remove fill = company and instead use fill = share with scale_fill_viridis_c(). This encodes the same information (market share) as both area AND color — double encoding that makes the largest tiles stand out even more.
How do you add text labels to a treemap?
geom_treemap_text() fits labels inside each tile: text that is too large for its tile is shrunk, and a label that still doesn't fit at the minimum font size is hidden automatically.
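Here is a short sketch of a labeled treemap. The phones data frame and its values are hypothetical, and the legend is hidden only because the labels make it redundant.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values (percent), made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  stringsAsFactors = FALSE
)

p_labels <- ggplot(phones, aes(area = share, fill = company, label = company)) +
  geom_treemap() +
  # ggplot2 treats colour and color as the same argument
  geom_treemap_text(colour = "white", place = "centre",
                    grow = TRUE, reflow = TRUE) +
  theme(legend.position = "none")

p_labels
```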
grow = TRUE scales text up to fill the tile. reflow = TRUE wraps long labels across multiple lines. Set color = "white" for light text on dark tiles, and switch to a dark text color when the tiles are light.
To also show the share value, use label = paste0(company, "\n", share, "%"):
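A sketch of that label mapping, again with a hypothetical phones data frame whose numbers are invented:

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values (percent), made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  stringsAsFactors = FALSE
)

# The label is built inside aes() so it is evaluated once per row
p_share <- ggplot(phones, aes(area = share, fill = company,
                              label = paste0(company, "\n", share, "%"))) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre") +
  theme(legend.position = "none")

p_share
```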
Try it: Inside aes(), change label = company to label = paste0(company, "\n", share, "%") to show both the company name and the percentage on each tile.
How do you color by a second variable?
Encoding a second numeric variable as color turns a treemap into a 2D visualization: area shows one metric, color shows another.
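The sketch below maps area to share and fill to a growth column with a diverging scale centered on zero. Both columns hold invented values, and the color choices are only one reasonable option.

```r
library(ggplot2)
library(treemapify)

# Hypothetical share (percent) and year-over-year growth (percent);
# all numbers are made up for illustration
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  growth  = c(-2, 1.5, 3, -0.5, 0.5, 6, -8),
  stringsAsFactors = FALSE
)

p_growth <- ggplot(phones, aes(area = share, fill = growth, label = company)) +
  geom_treemap() +
  geom_treemap_text(colour = "grey10", place = "centre") +
  # Diverging scale: red below zero, white at zero, green above zero
  scale_fill_gradient2(low = "firebrick", mid = "white", high = "forestgreen",
                       midpoint = 0, name = "Growth (%)")

p_growth
```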
At a glance: Samsung has the largest market share (biggest tile) but is shrinking (reddish). Motorola has a small share but strong growth (green). Huawei is small and contracting (deep red).
Try it: Replace scale_fill_gradient2() with scale_fill_viridis_c(option = "plasma") for a different continuous scale. Which scale communicates the positive/negative split more clearly?
How do you create a hierarchical treemap with subgroups?
Real data often has a natural hierarchy — categories within sectors, products within brands. treemapify supports this with the subgroup aesthetic and geom_treemap_subgroup_border().
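A sketch of a two-level treemap, grouping companies by region. The region assignments and share values are illustrative assumptions, not reference data.

```r
library(ggplot2)
library(treemapify)

# Hypothetical shares and region groupings, made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  region  = c("Asia-Pacific", "Americas", "Asia-Pacific", "Asia-Pacific",
              "Asia-Pacific", "Americas", "Asia-Pacific"),
  stringsAsFactors = FALSE
)

p_groups <- ggplot(phones, aes(area = share, fill = region,
                               label = company, subgroup = region)) +
  geom_treemap() +
  # Outline each region, then label it in its top-left corner
  geom_treemap_subgroup_border(colour = "white") +
  geom_treemap_subgroup_text(place = "topleft", colour = "white",
                             alpha = 0.6, fontface = "italic") +
  # Company labels sit in the centre so they don't collide with region labels
  geom_treemap_text(colour = "white", place = "centre")

p_groups
```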
geom_treemap_subgroup_border() draws a border around each region group (pass colour = "white" for white borders). geom_treemap_subgroup_text() labels each subgroup area, and its place argument sets the position, for example place = "topleft" for the top-left corner. The hierarchy is now visible: Asia-Pacific dominates, with Samsung and several Chinese brands; Americas has Apple and Motorola.
Try it: Add a second subgroup level with subgroup2 = sector (or another variable) and geom_treemap_subgroup2_border() to create a three-level hierarchy.
Complete Example: Polished Treemap
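One way to combine the pieces from the earlier sections is sketched below: area for share, fill for growth, subgroups for region, and two-line labels. Every value in the data frame is invented, and the titles and colors are placeholders to adapt.

```r
library(ggplot2)
library(treemapify)

# Hypothetical shares, growth rates, and regions; not real market data
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  growth  = c(-2, 1.5, 3, -0.5, 0.5, 6, -8),
  region  = c("Asia-Pacific", "Americas", "Asia-Pacific", "Asia-Pacific",
              "Asia-Pacific", "Americas", "Asia-Pacific"),
  stringsAsFactors = FALSE
)

p_polished <- ggplot(phones, aes(area = share, fill = growth, subgroup = region,
                                 label = paste0(company, "\n", share, "%"))) +
  geom_treemap(colour = "white") +
  geom_treemap_subgroup_border(colour = "grey20") +
  geom_treemap_subgroup_text(place = "bottom", colour = "grey20", alpha = 0.5) +
  geom_treemap_text(colour = "grey10", place = "centre", reflow = TRUE) +
  scale_fill_gradient2(low = "firebrick", mid = "white", high = "forestgreen",
                       midpoint = 0, name = "Growth (%)") +
  labs(title = "Smartphone market share by company and region",
       subtitle = "Illustrative data: area = share, color = growth") +
  theme(legend.position = "bottom")

p_polished
```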
Common Mistakes and How to Fix Them
Mistake 1: Forgetting the area aesthetic
geom_treemap() requires area — without it, ggplot2 can't size the tiles.
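A small before-and-after sketch, using a hypothetical phones data frame. The failing call is left commented out so the block runs cleanly.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values, made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi"),
  share   = c(21, 18, 13),
  stringsAsFactors = FALSE
)

# Wrong: no area aesthetic, so drawing this plot fails with an error
# ggplot(phones, aes(fill = company)) + geom_treemap()

# Right: map a positive numeric column to area
p_fixed <- ggplot(phones, aes(area = share, fill = company)) +
  geom_treemap()

p_fixed
```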
Mistake 2: Using treemaps for time series or comparisons
A single treemap shows proportions at one point in time. It is a poor fit for showing change over time (use a line chart) or for precise side-by-side comparison (use a bar chart). If your audience needs to read exact values, add labels, or use a bar chart instead.
Mistake 3: Too many categories with tiny tiles
When a category's share is very small, its tile becomes too small to show a label. Fix: collapse small categories into "Others" or use a bar chart that can accommodate small differences better.
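The collapsing step can be sketched in base R as follows. The cutoff of 4 percent and all share values are arbitrary assumptions chosen for the example.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values, including two very small categories
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo",
              "Motorola", "Huawei", "Nokia", "Sony"),
  share   = c(21, 18, 13, 9, 8, 4, 3, 1, 0.5),
  stringsAsFactors = FALSE
)

threshold <- 4  # hypothetical cutoff: anything below this goes into "Others"

# Relabel small categories, then sum shares within each label
phones$group <- ifelse(phones$share < threshold, "Others", phones$company)
collapsed <- aggregate(share ~ group, data = phones, FUN = sum)

p_others <- ggplot(collapsed, aes(area = share, fill = group, label = group)) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre") +
  theme(legend.position = "none")

p_others
```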
Mistake 4: Not matching label color to fill contrast
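White text disappears on light tiles, and dark text disappears on dark ones. One possible fix, sketched below, is to store a text color per row and map it with scale_colour_identity(). The text_colour column name, the cutoff of 10, and the data values are all assumptions made for this example.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values, made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo", "Motorola", "Huawei"),
  share   = c(21, 18, 13, 9, 8, 4, 3),
  stringsAsFactors = FALSE
)

# Large shares get dark tiles below, so give them white text; others get black
phones$text_colour <- ifelse(phones$share > 10, "white", "black")

p_contrast <- ggplot(phones, aes(area = share, fill = share, label = company)) +
  geom_treemap() +
  geom_treemap_text(aes(colour = text_colour), place = "centre") +
  scale_fill_gradient(low = "lightyellow", high = "darkblue") +
  # Use the color names stored in the column as-is
  scale_colour_identity()

p_contrast
```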
Mistake 5: Encoding the same variable as both area and fill
Double encoding (same data as both size and color) isn't wrong per se, but it wastes a visual channel. Better: use fill for a different variable to give readers more information.
Practice Exercises
Exercise 1: GNP treemap
R's built-in longley dataset contains yearly economic data, including a GNP column. Create a treemap of the GNP values, using the Year column as the label.
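One possible solution, offered as a sketch rather than the only answer. Mapping fill to GNP as well is an optional choice, and the end = 0.8 setting simply avoids the pale end of the viridis scale so white labels stay readable.

```r
library(ggplot2)
library(treemapify)

# longley ships with R; GNP sizes the tiles and Year labels them
p_gnp <- ggplot(longley, aes(area = GNP, fill = GNP,
                             label = as.character(Year))) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre", grow = TRUE) +
  scale_fill_viridis_c(end = 0.8)

p_gnp
```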
Exercise 2: mtcars sector treemap
Create a hierarchical treemap using mtcars where:
- area = hp (horsepower)
- subgroup = number of cylinders (cyl)
- Label each car with its row name
- Color by mpg (fuel efficiency)
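One possible solution, as a sketch. The helper columns model and cyl_label are names introduced here for convenience; they are not part of mtcars.

```r
library(ggplot2)
library(treemapify)

# Copy mtcars and turn row names and cylinder counts into label columns
cars_df <- mtcars
cars_df$model <- rownames(mtcars)
cars_df$cyl_label <- paste(cars_df$cyl, "cylinders")

p_cars <- ggplot(cars_df, aes(area = hp, fill = mpg,
                              label = model, subgroup = cyl_label)) +
  geom_treemap() +
  geom_treemap_subgroup_border(colour = "white") +
  # Large, translucent subgroup labels behind the car names
  geom_treemap_subgroup_text(place = "centre", grow = TRUE,
                             alpha = 0.4, colour = "white") +
  geom_treemap_text(colour = "white", place = "topleft", reflow = TRUE) +
  scale_fill_viridis_c(end = 0.8, name = "mpg")

p_cars
```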
Summary
| Geom | Purpose |
|---|---|
| geom_treemap() | Draw tiles sized by area aesthetic |
| geom_treemap_text() | Auto-sized labels inside tiles |
| geom_treemap_subgroup_border() | Borders around subgroups |
| geom_treemap_subgroup_text() | Labels for subgroup areas |

| Key aesthetic | Maps to |
|---|---|
| area | Tile size (required) |
| fill | Tile color (a category or continuous variable) |
| label | Text label inside the tile |
| subgroup | Hierarchical grouping |
When to use treemaps:
- Many categories (20-100+) where a bar chart would be too long
- Showing proportions at a single point in time
- Two metrics to encode simultaneously (area + color)
When to use bar charts instead:
- Precise comparisons matter
- Fewer than 20 categories
- Time series or ranking changes over time
FAQ
Is treemapify on CRAN? Yes — install.packages("treemapify"). It depends on ggplot2 and ggfittext (for auto-sizing text).
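How do I control the treemap layout algorithm? geom_treemap() accepts a layout argument: "squarified" (the default, which aims for tiles with good aspect ratios), "scol" and "srow" (squarified variants that place tiles in order, by column or by row), and "fixed" (tile positions follow the order of the data, which keeps them stable across plots at the cost of less tidy tiles). Pass the same layout to every treemapify geom in a plot so labels and borders line up with the tiles. Squarified is almost always the best choice.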
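A brief sketch of switching layouts, with invented data. The same layout value is passed to both the tile and text geoms.

```r
library(ggplot2)
library(treemapify)

# Hypothetical market-share values, made up for this example
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Oppo", "Vivo"),
  share   = c(21, 18, 13, 9, 8),
  stringsAsFactors = FALSE
)

p_scol <- ggplot(phones, aes(area = share, fill = company, label = company)) +
  geom_treemap(layout = "scol") +
  # The text layer needs the same layout so labels land on their tiles
  geom_treemap_text(layout = "scol", colour = "white", place = "centre") +
  theme(legend.position = "none")

p_scol
```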
Can I add more than one level of subgroup? Yes — up to three levels: subgroup, subgroup2, subgroup3. Each level has corresponding border and text geoms (e.g., geom_treemap_subgroup2_border()).
Why do some labels disappear? geom_treemap_text() drops labels that can't fit in the tile even at the minimum font size. For small tiles, either lower min.size (the smallest font size tried before a label is hidden) or reduce the number of categories.
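A sketch of lowering min.size so that labels on small tiles are kept. The value 2 and the data are arbitrary choices for illustration, and very small text may not be legible in the final figure.

```r
library(ggplot2)
library(treemapify)

# Hypothetical values with a few very small categories
phones <- data.frame(
  company = c("Samsung", "Apple", "Xiaomi", "Motorola", "Nokia", "Sony"),
  share   = c(21, 18, 13, 4, 1, 0.5),
  stringsAsFactors = FALSE
)

p_small <- ggplot(phones, aes(area = share, fill = company, label = company)) +
  geom_treemap() +
  # Allow text to shrink to 2 pt before the label is dropped
  geom_treemap_text(colour = "white", place = "centre", min.size = 2) +
  theme(legend.position = "none")

p_small
```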
Can I use facets with treemapify? Yes — facet_wrap() and facet_grid() work with treemapify geoms, creating separate treemaps per facet panel. Useful for comparing structure across time periods or groups.
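A faceting sketch with one panel per year. The shares data frame, its years, and its values are all invented for this example.

```r
library(ggplot2)
library(treemapify)

# Hypothetical shares for two years, made up for illustration
shares <- data.frame(
  year    = rep(c(2022, 2023), each = 4),
  company = rep(c("Samsung", "Apple", "Xiaomi", "Oppo"), times = 2),
  share   = c(22, 17, 12, 9, 21, 18, 13, 9),
  stringsAsFactors = FALSE
)

# Each facet panel gets its own treemap
p_facets <- ggplot(shares, aes(area = share, fill = company, label = company)) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre") +
  facet_wrap(~ year) +
  theme(legend.position = "none")

p_facets
```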
What's Next?
- ggplot2 Bar Charts — precise categorical comparisons when you have fewer categories
- Pie Chart and Donut Chart in R — part-to-whole for 3-5 categories
- R Waffle Chart — encode counts as grids of unit squares