R hist() Error "'breaks' are not unique": Why Your Data Has No Spread
The error 'breaks' are not unique means hist() built bin edges that contain duplicates. The usual cause is data with zero or near-zero spread, so the computed boundaries collapse onto the same number.
Why does R throw "breaks are not unique"?
hist() picks bin edges from range(x) and the chosen breaks rule (Sturges by default). If every value in x is the same, diff(range(x)) is zero, so every edge lands on the same point and R refuses to draw overlapping bars. The block below reproduces the exact message with tryCatch() so you can compare it to what your console printed.
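A minimal sketch of that reproduction, assuming a constant vector named flat_data (100 copies of 5). Whether hist() actually stops on this input can depend on the R version, so the block records the message if one is raised rather than assuming it, then prints the two diagnostics.

```r
# Assumption: a constant vector, 100 copies of the same value
flat_data <- rep(5, 100)

# Try to bin it; keep the error message if hist() raises one
msg <- tryCatch({
  hist(flat_data, plot = FALSE)
  NA_character_                      # reached only when no error was raised
}, error = function(e) conditionMessage(e))
print(msg)

# Diagnostics that describe the degenerate input
zero_range <- diff(range(flat_data))     # width of the data range
n_unique   <- length(unique(flat_data))  # number of distinct values
cat("range width:", zero_range, " unique values:", n_unique, "\n")
```

Capturing the condition with conditionMessage() keeps the script running, which is handy when the same check sits inside a larger loop.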
The error text is literal: R is telling you that after it built the break vector, two or more entries were equal. The diagnostic lines confirm why: the range is zero and there is only one unique value. Any fix has to either give hist() a range to work with, or switch to a chart that doesn't need one.
hist() is refusing to draw bins because the data has no spread to bin. The fix is to inspect the data, not to patch the plot call.
Try it: Build a vector of 50 zeros named ex_zero, pass it to hist(..., plot = FALSE) inside tryCatch(), and capture the error message.
Explanation: Any constant vector triggers the same error; the specific value (5, 0, 42) doesn't matter. What matters is that length(unique(x)) == 1.
How do you detect low-variance columns before calling hist()?
If you loop over a data frame and call hist() on each numeric column, one sick column will kill the whole loop. Three cheap checks catch the problem before hist() ever runs: var(x) > 0, length(unique(x)) > 1, and diff(range(x)) > 0. Running them with sapply() gives you a per-column report in two lines.
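A sketch of that per-column report, assuming a two-column data frame df whose column names (normal, constant) are hypothetical:

```r
set.seed(1)  # reproducible illustration
df <- data.frame(
  normal   = rnorm(100, mean = 50, sd = 10),  # ordinary spread
  constant = rep(5, 100)                      # no spread at all
)

# One row per check, one column per variable
report <- sapply(df, function(x) c(
  var      = var(x),
  n_unique = length(unique(x)),
  spread   = diff(range(x))
))
print(report)
```

sapply() simplifies the three named values per column into a matrix, so a degenerate column stands out as a column of 0, 1, 0.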
The constant column shows up with var = 0, n_unique = 1, and spread = 0; any one of those three is a reliable flag. In production code you only need one check. length(unique(x)) > 1 is the most direct because it tests the degenerate condition itself, although unique() still scans the whole vector rather than stopping at the second distinct value.
Guard each call in a batch loop with if (length(unique(x)) > 1) hist(x) else message("skip: constant column") so one bad column doesn't stop the whole batch.
Try it: Add a third column ex_tiny to df that contains 100 values from rnorm(100, mean = 50, sd = 1e-9). Will the length(unique(x)) > 1 guard still clear it?
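One possible way to work through it, rebuilding a small hypothetical df so the block runs on its own (the column names normal and constant are assumptions):

```r
set.seed(42)  # reproducible illustration
df <- data.frame(
  normal   = rnorm(100, mean = 50, sd = 10),
  constant = rep(5, 100)
)
df$ex_tiny <- rnorm(100, mean = 50, sd = 1e-9)

# Apply the guard to every column
passes_guard <- sapply(df, function(x) length(unique(x)) > 1)
print(passes_guard)

# The guard passes, yet the spread is many orders of magnitude below the mean
tiny_spread <- diff(range(df$ex_tiny))
print(tiny_spread)
```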
Explanation: The guard clears it because R stores doubles with roughly 15 to 16 significant digits, so 100 draws with sd = 1e-9 are almost all distinct. But the spread is tiny; the next section handles that case.
How do you fix constant or near-constant data?
Once you know a column has no useful spread, there are three practical paths. Use barplot(table(x)) when the data is truly constant; it's the honest chart. Use jitter() when you want to visualize tiny noise that's invisible at the default resolution. Use manual seq() breaks when you want hist() to draw a single bar around the constant value.
The barplot version is correct but boring: a single bar of height 100. The jittered version looks like a real histogram, but the spread is artificial. The manual-breaks version is the best compromise if you need a histogram in a panel of other histograms: the chart type stays consistent and readers see one tall bar centered on 5.
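The three paths can be sketched as follows, assuming flat_data is 100 copies of 5. plot = FALSE is used so the bins can be inspected without a graphics device, and the break vector seq(4.5, 5.5, by = 1) is only one illustrative choice that brackets the value.

```r
flat_data <- rep(5, 100)  # assumption: a constant vector

# 1. Honest chart for truly constant data: one bar per distinct value
counts <- table(flat_data)

# 2. Artificial noise so the values spread across several bins
set.seed(1)
h_jitter <- hist(jitter(flat_data), plot = FALSE)

# 3. Manual breaks that bracket the constant value (a single bin here)
h_manual <- hist(flat_data, breaks = seq(4.5, 5.5, by = 1), plot = FALSE)

print(counts)
print(h_jitter$breaks)
print(h_manual$counts)

# To draw them: barplot(counts); plot(h_jitter); plot(h_manual)
```

Keeping the histogram objects around makes it easy to check sum(h$counts) against length(x) before anything is drawn.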
Try it: Use hist() with manual breaks = seq(4, 6, by = 0.5) on flat_data and confirm it renders without error.
Explanation: As long as the break vector has at least two distinct values and brackets the data range, hist() is happy. The bar lands in the (4.5, 5] bin.
How do you fix duplicate manual or quantile breaks?
Even when your data has plenty of spread, you can still hand hist() a bad break vector. The two common failure modes are a hardcoded vector with a typo and a break vector generated from quantile() on tied data. Both trigger the same error, and both are fixed by sort(unique(...)), though the quantile version is usually a hint that barplot() is a better fit.
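Both failure modes and the sort(unique(...)) repair can be sketched like this. The data (x, tied) and the break vectors are hypothetical, chosen only to make the duplicates visible.

```r
set.seed(1)                                # reproducible illustration
x <- runif(200, min = 0, max = 10)         # hypothetical data with real spread

# Case 1: hardcoded breaks where 4 was typed twice
typo_breaks  <- c(0, 2, 4, 4, 6, 8, 10)
clean_breaks <- sort(unique(typo_breaks))  # duplicates removed, order kept
h_typo <- hist(x, breaks = clean_breaks, plot = FALSE)

# Case 2: decile breaks on heavily tied data
tied     <- c(rep(1, 55), rep(2, 45))      # hypothetical two-level data
q_breaks <- quantile(tied, probs = seq(0, 1, by = 0.1))
print(q_breaks)                            # repeated edges are visible here
q_clean  <- sort(unique(q_breaks))
h_tied   <- hist(tied, breaks = q_clean, plot = FALSE)

# The chart that usually fits tied data better: one bar per level
level_counts <- table(tied)                # draw with barplot(level_counts)
print(level_counts)
```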
quantile() returned six 1s and five 2s because the 0th through 50th percentiles of the data are all exactly 1. unique() collapses the break vector to just c(1, 2) and hist() draws a single bin, which is technically correct but not informative. The barplot(table(tied)) version is usually what you actually wanted: two bars, one per discrete level.
cut() errors with the same message. If you're binning a variable with cut(x, breaks = quantile(x, ...)) and see 'breaks' are not unique, apply the same unique() rescue: cut(x, breaks = unique(quantile(x, ...)), include.lowest = TRUE).
Try it: Given c(rep(1, 50), rep(2, 50)), build a quantile break vector at deciles, fix the duplicates, and count how many unique edges remain.
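One way to work through the exercise, which also tries the cut() case. The names ex_tied, ex_breaks, and ex_clean are hypothetical. With quantile()'s default type 7, the median of this vector interpolates to 1.5, so it is worth printing the cleaned vector rather than assuming its contents.

```r
ex_tied   <- c(rep(1, 50), rep(2, 50))
ex_breaks <- quantile(ex_tied, probs = seq(0, 1, by = 0.1))

# Drop the duplicated edges and count what is left
ex_clean <- sort(unique(ex_breaks))
print(ex_clean)
print(length(ex_clean))

# cut() rejects the raw decile vector; keep its message for inspection
cut_msg <- tryCatch({
  cut(ex_tied, breaks = ex_breaks)
  NA_character_
}, error = function(e) conditionMessage(e))
print(cut_msg)

# The cleaned edges work; include.lowest keeps the minimum in the first bin
binned <- cut(ex_tied, breaks = ex_clean, include.lowest = TRUE)
print(table(binned))
```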
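Explanation: After unique() the vector holds 1, 1.5, and 2. The 1.5 is the median, which quantile()'s default type 7 interpolates between the 50th and 51st sorted values. Three edges make only two bins, a sign that a barplot would communicate more than a histogram here.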
Practice Exercises
Exercise 1: Build a safe hist() wrapper
Write my_safe_hist(x) that does two things. If length(unique(x)) < 2, print a one-line diagnostic with message() and return invisible(NULL). Otherwise, call hist(x, main = "safe hist") and return invisible(NULL). Test it on rep(7, 50) (should skip) and rnorm(200) (should plot).
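A possible solution sketch. The wording of the diagnostic is an assumption, and pdf(file = NULL) is added only so the demo calls run without writing a plot file; drop that line to see the histogram.

```r
my_safe_hist <- function(x) {
  # Guard: fewer than two distinct values means there is nothing to bin
  if (length(unique(x)) < 2) {
    message("skip: fewer than 2 distinct values, nothing to bin")
    return(invisible(NULL))
  }
  hist(x, main = "safe hist")
  invisible(NULL)
}

pdf(file = NULL)           # null graphics device: nothing is written to disk
my_safe_hist(rep(7, 50))   # prints the diagnostic, draws nothing
my_safe_hist(rnorm(200))   # draws the histogram
invisible(dev.off())
```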
Explanation: The guard catches the constant case before hist() ever runs, so the function never throws the 'breaks' are not unique error. Using message() (not print()) sends the diagnostic to stderr, which keeps it out of standard output.
Exercise 2: Column-wise dispatch
Given a data frame with four numeric columns (one constant, one near-constant, two normal), loop over the columns and pick the right chart for each. Skip the constant one, jitter the near-constant one, and plot the normal ones directly. Save each decision ("skip", "jitter", "plot") to a named character vector my_plot_log.
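A possible solution sketch, assuming a hypothetical data frame my_df with columns const, near, norm1, and norm2. The near-constant column uses sd = 1e-6 as an illustrative choice, and the null device keeps the loop from writing plot files.

```r
set.seed(7)  # reproducible illustration
my_df <- data.frame(
  const = rep(3, 100),                       # constant
  near  = rnorm(100, mean = 50, sd = 1e-6),  # near-constant
  norm1 = rnorm(100),
  norm2 = rnorm(100, mean = 10, sd = 2)
)

my_plot_log <- character(0)
pdf(file = NULL)  # null graphics device; drop this line to see the plots
for (col in names(my_df)) {
  x <- my_df[[col]]
  if (length(unique(x)) < 2) {
    my_plot_log[col] <- "skip"               # nothing to bin
  } else if (diff(range(x)) < 1e-4) {
    hist(jitter(x), main = col)              # spread below the threshold
    my_plot_log[col] <- "jitter"
  } else {
    hist(x, main = col)
    my_plot_log[col] <- "plot"
  }
}
invisible(dev.off())
print(my_plot_log)
```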
Explanation: The loop uses two thresholds. length(unique(x)) < 2 catches truly constant columns; diff(range(x)) < 1e-4 catches near-constant ones that would draw a useless one-bar histogram at default settings. Each decision is logged so you can audit the plots later.
Complete Example
Here is the full pattern you would ship in a reporting pipeline: inspect each numeric column, pick a chart, plot it, and return a decision log.
The decisions vector is your audit trail. In a real pipeline you would log it alongside the plots so a reviewer can see why the flat column became a barplot and the noise column became a jittered histogram. Without that log, an unexpected chart type looks like a bug instead of a deliberate choice.
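A sketch of that pattern under stated assumptions: the function name plot_numeric_columns, its tiny threshold of 1e-4, the decision labels, and the report_df columns are all hypothetical, and the null device stands in for whatever output device a real pipeline would open.

```r
plot_numeric_columns <- function(df, tiny = 1e-4) {
  decisions <- character(0)
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  for (col in numeric_cols) {
    x <- df[[col]]
    x <- x[is.finite(x)]                  # drop NA, NaN, and infinite values
    if (length(x) == 0) {
      decisions[col] <- "skip"            # nothing left to plot
    } else if (length(unique(x)) < 2) {
      barplot(table(x), main = col)       # constant column: one honest bar
      decisions[col] <- "barplot"
    } else if (diff(range(x)) < tiny) {
      hist(jitter(x), main = col)         # near-constant column
      decisions[col] <- "jittered hist"
    } else {
      hist(x, main = col)
      decisions[col] <- "hist"
    }
  }
  decisions
}

set.seed(123)  # reproducible illustration
report_df <- data.frame(
  flat  = rep(5, 100),
  noise = rnorm(100, mean = 50, sd = 1e-6),
  score = rnorm(100, mean = 70, sd = 10),
  label = "a"                             # non-numeric, ignored by the loop
)

pdf(file = NULL)  # null graphics device; swap in a real device to keep the plots
decisions <- plot_numeric_columns(report_df)
invisible(dev.off())
print(decisions)
```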
Summary
| Symptom | Root cause | Fix |
|---|---|---|
| All values identical | Zero range → duplicate edges | barplot(table(x)) |
| Near-constant data, many requested breaks | Spread below bin resolution | Reduce breaks or jitter() |
| Manual breaks = c(...) with a repeat | Typo in the vector | sort(unique(breaks)) |
| quantile()-derived breaks on tied data | Tied values collapse quantiles | unique(quantile(...)), or barplot |
| Discrete integer data, narrow range | Few unique values | seq(min(x) - 0.5, max(x) + 0.5, by = 1) |
The common thread: the error is about your data, not your plot call. Inspect length(unique(x)) first; the right fix follows from the answer.