r-statistics.co

Outlier Detection

Outliers can quietly distort means, standard deviations, and regression slopes. Grubbs, ESD, Hampel (MAD), and Tukey IQR each flag suspicious values with different rules. Paste your data, pick a method, and see exactly which points get flagged, by what statistic, and at what alpha level.

Flag values that drift far from the bulk of your data using Grubbs, Generalized ESD, the Hampel (MAD) filter, or the Tukey IQR rule. Reproducible R code, runs in your browser.

What is an outlier?

An outlier is a value that sits unusually far from the centre of a sample relative to its spread. The word "unusually" hides a choice: a method, a threshold, and an assumption about the underlying distribution.

Parametric tests like Grubbs and Generalized ESD assume the bulk of the data is normally distributed and compare the largest studentized deviation with a critical value. They are powerful when the assumption holds and over-reject when it does not.

Robust rules like the Hampel filter (median absolute deviation) and the Tukey IQR rule use rank-based dispersion measures that the outliers themselves cannot inflate. They make weaker assumptions and accept that they may flag a few extra points in heavy-tailed but legitimate data.
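A minimal sketch of that contrast, using the same ten values as the page's example vector and treating 30 as the suspect point. The name `clean` is introduced here for illustration only.

```r
# Same ten values as the example vector; 30 is the suspect point
x <- c(2, 3, 3, 4, 4, 5, 5, 6, 7, 30)
clean <- x[x != 30]

# Mean and SD are pulled towards the extreme value
c(mean = mean(x), sd = sd(x))
c(mean = mean(clean), sd = sd(clean))

# Median and MAD move far less
c(median = median(x), mad = mad(x))
c(median = median(clean), mad = mad(clean))
```

With this sample, the single extreme value inflates the SD several times over, while the MAD changes by a much smaller factor. That is why the robust rules keep a stable yardstick when the parametric ones do not.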

None of these tests answers "should this point be removed?" They answer "does this point look unusual under this model?" Removal is an editorial decision; this tool flags and explains, and never auto-deletes.

Reproducible R code
# Outlier detection in R
library(outliers)

x <- c(2, 3, 3, 4, 4, 5, 5, 6, 7, 30)

# Grubbs (single outlier, two-sided)
grubbs.test(x, two.sided = TRUE)

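# Generalized ESD (Rosner; k = maximum number of outliers to test for)
esd <- function(x, k = 5, alpha = 0.05) {
  xx <- x; idx <- seq_along(x)
  removed <- integer(0); sig <- logical(0)
  for (i in seq_len(max(0, min(k, length(x) - 2)))) {
    m <- mean(xx); s <- sd(xx); ni <- length(xx)
    j <- which.max(abs(xx - m))
    R <- abs(xx[j] - m) / s
    p <- 1 - alpha / (2 * ni)
    tq <- qt(p, ni - 2)   # named tq so it does not mask base::t()
    lam <- (ni - 1) * tq / sqrt(ni * (ni - 2 + tq^2))
    # Always drop the most extreme value, significant or not, so that
    # one outlier cannot mask another
    removed <- c(removed, idx[j]); sig <- c(sig, isTRUE(R > lam))
    xx <- xx[-j]; idx <- idx[-j]
  }
  # Number of outliers = largest i with R_i > lam_i; return their indices
  r <- if (any(sig)) max(which(sig)) else 0
  removed[seq_len(r)]
}
esd(x, k = 5)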

# Hampel filter (median absolute deviation, k = 3)
mads <- abs(x - median(x)) / mad(x)
which(mads > 3)

# Tukey IQR rule (1.5 * IQR fences)
boxplot.stats(x)$out
Plot: Dot plot with thresholds. Each dot is one observation. Red dots are flagged; dashed lines are method thresholds.

Anatomy of each test
G = max |x - mean(x)| / sd(x)
Grubbs. Flag the most extreme studentized residual against a critical value derived from the t-distribution: G > ((n-1)/sqrt(n)) * sqrt(t² / (n-2+t²)) with t = qt(1 - α/(2n), n-2). Tests one outlier per call; iterate manually if needed. Assumes normality.
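The Grubbs statistic and its two-sided critical value can be computed by hand as a check on `grubbs.test()`. This is a sketch on the page's example vector; the names `tq` and `G_crit` are illustrative, not part of any package.

```r
x <- c(2, 3, 3, 4, 4, 5, 5, 6, 7, 30)
n <- length(x)
alpha <- 0.05

# Test statistic: largest absolute deviation from the mean, in SD units
G <- max(abs(x - mean(x))) / sd(x)

# Two-sided critical value derived from the t-distribution
tq <- qt(1 - alpha / (2 * n), df = n - 2)
G_crit <- ((n - 1) / sqrt(n)) * sqrt(tq^2 / (n - 2 + tq^2))

c(G = G, critical = G_crit, flagged = G > G_crit)
```

For this sample G is roughly 2.80 against a critical value of roughly 2.29, so the value 30 is flagged at alpha = 0.05.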
R_i = max |x - mean| / sd (iterative)
Generalized ESD. Rosner's extension. Compute the Grubbs statistic, drop the most extreme value whether or not it is significant, recompute on the reduced sample, and repeat k times. Compare each R_i against its own Bonferroni-style critical value; the number of outliers is the largest i for which R_i exceeds that value. Detects up to k outliers without prior knowledge of how many.
|x - median(x)| / MAD(x) > k
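Hampel filter. Replace mean and SD with the median and the median absolute deviation (MAD). Here MAD(x) means the scaled MAD, as returned by R's mad(), which multiplies the raw median absolute deviation by 1.4826 by default. With that scaling, the default threshold k = 3 approximates a 3-sigma rule for normal data because 1.4826 * (raw MAD) ≈ sd. Robust because the outliers cannot inflate the dispersion measure.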
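A short sketch of the Hampel rule on the page's example vector. It makes the 1.4826 scaling explicit, since it is easy to overlook that `mad()` applies it by default. The names `raw_mad`, `scaled_mad`, and `score` are illustrative.

```r
x <- c(2, 3, 3, 4, 4, 5, 5, 6, 7, 30)
k <- 3

raw_mad    <- median(abs(x - median(x)))  # unscaled median absolute deviation
scaled_mad <- mad(x)                      # R default: 1.4826 * raw_mad

# Robust analogue of a z-score, compared against the threshold k
score <- abs(x - median(x)) / scaled_mad
x[score > k]
```

If the raw MAD were used in the denominator instead, the same k = 3 would correspond to a tighter cut-off of about 2 sigma under normality, so the two conventions should not be mixed.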
x < Q1 - k*IQR or x > Q3 + k*IQR
Tukey IQR. Standard whisker rule: k = 1.5 for “mild” outliers, k = 3 for “extreme”. Distribution-free, well-suited to skewed data, but tends to over-flag in small samples and in long-tailed but legitimate distributions.
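The fences can also be written out directly, which makes the multiplier k adjustable. `tukey_flags` is a hypothetical helper for illustration. It uses `quantile()` with R's default quantile type, which is an assumption about how Q1 and Q3 are defined here.

```r
x <- c(2, 3, 3, 4, 4, 5, 5, 6, 7, 30)

# Hypothetical helper: Tukey fences with a configurable multiplier k
tukey_flags <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x[x < q[1] - k * iqr | x > q[2] + k * iqr]
}

tukey_flags(x, k = 1.5)  # "mild" fences
tukey_flags(x, k = 3)    # "extreme" fences
```

Note that `boxplot.stats()` builds its fences from the hinges returned by `fivenum()` rather than from `quantile()`, so the two approaches can disagree slightly on small samples.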
Caveats: When each method goes wrong
| Failure mode | What to do |
|---|---|
| Grubbs on heavy-tailed data | Over-rejects. Use Hampel or IQR instead, or apply a log transform. |
| ESD with k too low | Masks real outliers (the masking effect). Set k generously; the iterative test handles the surplus. |
| Hampel on data with zero MAD | MAD is zero when more than half the values are identical. The tool falls back to no detection and warns. |
| Tukey IQR on small n | Q1 and Q3 are unstable for n < 10. Prefer Grubbs at small sample sizes. |
| Removing without thought | An outlier is a question, not a verdict. Investigate the source before deletion. |
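The zero-MAD case is worth guarding against explicitly when running the Hampel rule in your own code, because dividing by a MAD of zero yields `Inf` or `NaN` scores. The sketch below mirrors the fall-back described above (warn and flag nothing). `hampel_flags` is a hypothetical helper, not the tool's actual implementation.

```r
# Hypothetical helper: Hampel flags with an explicit zero-MAD guard
hampel_flags <- function(x, k = 3) {
  s <- mad(x)
  if (s == 0) {
    warning("MAD is zero; no values flagged")
    return(x[0])  # empty vector of the same type as x
  }
  x[abs(x - median(x)) / s > k]
}

y <- c(5, 5, 5, 5, 5, 5, 9)        # more than half the values are identical
suppressWarnings(hampel_flags(y))  # empty result instead of Inf/NaN scores
```

Whether to fall back to no detection or to switch to another dispersion measure is a design choice; this sketch only shows the first option.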