Diagnostic Plot Interpreter
R's plot.lm() produces four diagnostic plots that tell you whether your linear model's assumptions hold. Paste residuals, fitted values, and leverage to get a per-plot verdict on heteroscedasticity, nonlinearity, normality, and influential points, with a clear recap of what each issue means and what to do about it.
New to regression diagnostics? A four-minute primer:
What diagnostic plots check. A linear model rests on four assumptions: the relationship is linear, the residuals have constant variance, they are roughly normal, and observations are independent. R's plot(lm(...)) emits four panels designed to expose the first three (independence is harder to read off a single picture). The job of those panels is not to compute a p-value; it is to let your eyes catch shapes that formal tests sometimes miss.
The four key checks. Residuals vs Fitted looks for a smooth curve away from zero (nonlinearity). Normal Q-Q looks for points that bend off the diagonal (non-normal residuals, often heavy tails). Scale-Location plots the square root of the absolute standardised residual against the fitted values; a fanning trend means variance grows with the mean. Residuals vs Leverage highlights points whose Cook's distance is large enough to dominate the fit.
How to read each plot. Boring is good. A flat band of points around zero with no shape, a Q-Q line that hugs the diagonal, a flat scale-location smoother, and no points stranded in the upper-right corner of the leverage plot mean the assumptions are plausible. Trumpet shapes, parabolas, S-shaped Q-Q curves, and isolated high-leverage points are the warning signs the human eye is good at and a single test statistic often misses.
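The four panels described above can also be pulled up one at a time with plot.lm's which argument; the panel numbers below are the ones base R uses:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

plot(fit, which = 1)  # Residuals vs Fitted: look for curvature away from zero
plot(fit, which = 2)  # Normal Q-Q: look for bends off the diagonal
plot(fit, which = 3)  # Scale-Location: look for a fanning trend
plot(fit, which = 5)  # Residuals vs Leverage: look for large Cook's distance
```

(which = 4 gives a bar plot of Cook's distance itself, if you prefer that view of influence.)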
What to do when a check fails. Heteroscedasticity calls for a log or square-root transform of y, weighted least squares, or robust (sandwich) standard errors. Nonlinearity calls for a polynomial or spline term, or an interaction you forgot. Non-normal residuals at moderate n are usually fine because of the central limit theorem; at small n consider a transform or a generalised linear model. Influential points should be inspected by hand, never auto-deleted.
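The remedies above can be sketched in base R; the transform, weight, and spline choices here are illustrative, not prescriptive:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Heteroscedasticity: transform the response, or reweight.
fit_log <- lm(log(mpg) ~ wt + hp, data = mtcars)
# WLS with weights proportional to 1/variance; 1/fitted^2 assumes the
# standard deviation grows with the mean (an assumption, check your data).
fit_wls <- lm(mpg ~ wt + hp, data = mtcars, weights = 1 / fitted(fit)^2)
# Or keep the fit and use robust (sandwich) standard errors, e.g.
# lmtest::coeftest(fit, vcov = sandwich::vcovHC(fit, type = "HC3"))

# Nonlinearity: add a polynomial or spline term.
fit_poly   <- lm(mpg ~ poly(wt, 2) + hp, data = mtcars)
fit_spline <- lm(mpg ~ splines::ns(wt, df = 3) + hp, data = mtcars)
```

Compare the candidates with AIC(fit, fit_poly, fit_spline) and rerun the diagnostic plots on whichever you keep.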
A real-world example, using the built-in mtcars data:
# Fit the model and inspect the four standard diagnostic plots
fit <- lm(mpg ~ wt + hp, data = mtcars)
par(mfrow = c(2, 2))
plot(fit)
# Formal tests behind the verdicts
library(lmtest)
bptest(fit) # Breusch-Pagan: heteroscedasticity
resettest(fit) # RESET: nonlinearity
shapiro.test(residuals(fit)) # normality of residuals
# Influence: which rows have Cook's d > 4/n ?
which(cooks.distance(fit) > 4 / nrow(mtcars))

How each verdict is computed
- Heteroscedasticity (Breusch-Pagan): the squared residuals are regressed on the fitted values; n * R^2 from the auxiliary fit is asymptotically chi-square with one degree of freedom under constant variance. p < 0.05 flags a problem; 0.05 - 0.10 is borderline.
- Nonlinearity (RESET): the model is refit with powers of the fitted values (y_hat, y_hat^2, y_hat^3); if the quadratic and cubic terms are jointly significant, the linear model is misspecified.
- Normality (Shapiro-Wilk): W close to 1 means normality; a small W with a small p-value means the tails or the skew are off. The Royston 1992 approximation supports n up to about 5,000.
- Influence (Cook's distance): if you paste hatvalues(fit), the tool computes D directly; otherwise it proxies leverage by the position of the fitted value, which is rough but useful for screening. Always confirm with cooks.distance(fit).

Caveats: when this is the wrong tool
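The Breusch-Pagan n * R^2 recipe can be reproduced by hand. A minimal sketch, assuming the auxiliary regression is on the fitted values (lmtest::bptest regresses on the model's own regressors by default, so pass a varformula to match):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary fit: squared residuals on the fitted values
aux <- lm(residuals(fit)^2 ~ fitted(fit))
bp  <- nrow(mtcars) * summary(aux)$r.squared   # n * R^2 statistic
p   <- pchisq(bp, df = 1, lower.tail = FALSE)  # 1 df: one auxiliary regressor
# Compare with lmtest::bptest(fit, ~ fitted(fit)), which uses the same recipe
```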
| If you have… | Use instead |
| --- | --- |
| A glm() with non-Gaussian family | Deviance residuals, not raw residuals(fit); pass residuals(fit, type = "deviance") and predict(fit, type = "link"). Shapiro-Wilk on deviance residuals is informational, not a formal test. |
| Time-series or autocorrelated residuals | None of these checks see autocorrelation. Run lmtest::dwtest(fit) for serial correlation, or inspect acf(residuals(fit)); switch to ARIMA / GLS if the residuals are not independent. |
| A mixed-effects model (lme4 / nlme) | Marginal residuals violate independence by construction. Use the conditional residuals from residuals(fit, type = "pearson") and inspect by group; performance::check_model handles the bookkeeping. |
| Small n (< 15) | All four verdicts have low power at small n; the Shapiro-Wilk approximation also degrades. Trust the eye on the mini Q-Q panel and bring in domain knowledge. |
| You only have the printed plot, not the data | The tool reads numbers, not pixels. The lm() output interpreter and the diagnostic-plot tutorials linked below cover the visual cues directly. |
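For the glm() case, the deviance-residual substitution can be sketched as follows; the Poisson model on a built-in dataset is purely illustrative:

```r
# Poisson fit: raw residuals are misleading, so use deviance residuals
gfit <- glm(count ~ spray, data = InsectSprays, family = poisson)
dres <- residuals(gfit, type = "deviance")  # deviance residuals
eta  <- predict(gfit, type = "link")        # linear predictor (link scale)
plot(eta, dres)  # analogue of Residuals vs Fitted, on the link scale
```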
- Linear regression in R, end-to-end - lm(), summary, diagnostics, reporting in one tutorial.
- Linear regression assumptions - linearity, normality, equal variance, independence.
- Regression diagnostics - residual plots, leverage, Cook's distance.
- Outlier treatment in R - identifying and handling extreme observations.
- Influential observations and leverage - Cook's d, DFFITS, hat values.
- lm() output interpreter - read every coefficient and the model-level statistics.
- glm() output interpreter - coefficient, deviance, and link-function read for non-Gaussian fits.
Numerical notes: Breusch-Pagan and RESET are approximate; the canonical implementations live in lmtest::bptest and lmtest::resettest. Shapiro-Wilk uses Royston's 1992 polynomial approximation; for n > 5,000 prefer Anderson-Darling. Cook's distance is exact when the third column is the true hatvalues(fit); otherwise the proxy is for screening only.