caret RMSE() in R: Root Mean Squared Error for Regression
The RMSE() function in caret computes root mean squared error: the square root of the average squared difference between predicted and observed numeric values. It returns one number in the units of the outcome, penalises large misses quadratically, and is the default selection metric for regression inside caret::train().
    RMSE(pred, obs)                               # basic call: returns one number
    RMSE(pred, obs, na.rm = TRUE)                 # drop NAs in either vector
    caret::RMSE(pred, obs)                        # namespaced when caret not attached
    sqrt(mean((pred - obs)^2))                    # equivalent base R
    postResample(pred, obs)["RMSE"]               # RMSE inside the full metric set
    RMSE(predict(fit, newdata = te), te$y)        # score a fitted regression model
    sapply(fold_preds, function(p) RMSE(p, obs))  # per-fold RMSE in resampling
Need explanation? Read on for examples and pitfalls.
What RMSE() does in one sentence
RMSE() returns the square root of the mean of squared residuals. You pass two numeric vectors of equal length and get back one number with the same units as the outcome. There is no model object, no formula, and no resampling logic; the function exists so you can score a vector of predictions in a single line and compare models on a common scale.
Internally the call is sqrt(mean((pred - obs)^2)) with an optional na.rm. caret exposes RMSE at the top level so it can also drop into defaultSummary() and train() resampling, where it is the default metric for regression.
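As a quick check of the equivalence described above, the sketch below scores the same pair of vectors with caret::RMSE() and with the base R expression. The vectors are made up for illustration.

```r
library(caret)

# Hypothetical toy vectors, invented for illustration
obs  <- c(3.0, 5.0, 2.5, 7.0)
pred <- c(2.5, 5.0, 4.0, 8.0)

rmse_caret <- RMSE(pred, obs)                # caret helper
rmse_base  <- sqrt(mean((pred - obs)^2))     # same formula written out

# The two values should agree up to floating-point tolerance
all.equal(rmse_caret, rmse_base)
```

Both results are plain length-one numerics, so either can be stored in a results table or compared directly.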
RMSE() syntax and arguments
The signature has three arguments, two of them mandatory. Both vectors must be numeric and the same length. Mismatched lengths are recycled: R warns only when the longer length is not a multiple of the shorter one, and either way the result is meaningless, so check length(pred) == length(obs) before scoring.
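The length check can be wrapped in a small guard. The helper below, checked_rmse(), is hypothetical and not part of caret; it is a minimal sketch of one way to fail fast. Note that when one length is an exact multiple of the other, as in this example, R recycles the shorter vector without any warning.

```r
library(caret)

# Hypothetical helper (not part of caret): refuse to score mismatched vectors
checked_rmse <- function(pred, obs, na.rm = FALSE) {
  stopifnot(is.numeric(pred), is.numeric(obs), length(pred) == length(obs))
  RMSE(pred, obs, na.rm = na.rm)
}

obs  <- c(10, 12, 14, 16)
pred <- c(11, 13)            # too short, and 4 is a multiple of 2

# Plain RMSE() recycles pred to c(11, 13, 11, 13) and returns a number
silent <- RMSE(pred, obs)

# The guarded version stops instead of returning that number
guarded <- tryCatch(
  checked_rmse(pred, obs),
  error = function(e) "length mismatch caught"
)
guarded
```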
The arguments are pred (predictions), obs (the truth), and na.rm (default FALSE). The order is pred first, consistent with the rest of caret but reversed from Metrics::rmse(actual, predicted); name the arguments if you switch between packages.
RMSE is bounded below by zero and unbounded above. Zero means every prediction matched exactly. There is no upper benchmark: compare RMSE to the standard deviation of the outcome, to a baseline predictor, or to a competing model on the same hold-out set.
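The bounds and the baseline comparison above can be illustrated on toy data. The vectors are invented, and the baseline here uses mean(obs) only to keep the sketch short; on a real hold-out set the baseline would normally be the training mean.

```r
library(caret)

# Hypothetical observations and predictions
obs  <- c(12, 15, 9, 20, 14)
pred <- c(11, 16, 10, 18, 15)

# Perfect predictions hit the lower bound of zero
rmse_perfect <- RMSE(obs, obs)

# Model score versus a baseline that always predicts the mean
rmse_model    <- RMSE(pred, obs)
rmse_baseline <- RMSE(rep(mean(obs), length(obs)), obs)

c(perfect = rmse_perfect, model = rmse_model, baseline = rmse_baseline)
```

A model is only worth keeping if rmse_model sits clearly below rmse_baseline.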
caret::RMSE(pred, obs) corresponds to sklearn.metrics.mean_squared_error(y_true, y_pred, squared=False) or, from sklearn 1.4, the dedicated root_mean_squared_error(y_true, y_pred), which replaces the now-deprecated squared argument. sklearn takes truth first; caret takes predictions first.
RMSE() examples by use case
Most calls fall into four patterns: a quick vector score, a hold-out test score, a per-fold cross-validation score, and side-by-side model comparison. Each example uses RMSE in the role it does best: penalising big regression misses on a common scale.
An RMSE of 2.87 on mtcars means the linear model is typically off by about 2.9 mpg, but with extra weight on the test cars where it missed worst. Always pair RMSE with a one-line sd(te$mpg) for context: an RMSE smaller than the test SD shows the model is doing better than always predicting the mean.
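A hold-out score of this kind can be sketched as follows. The seed, the 70/30 split, and the choice of predictors (wt and hp) are assumptions made for illustration, so the resulting number will not necessarily match the figure quoted above.

```r
library(caret)

set.seed(42)  # arbitrary seed, assumed for reproducibility
idx <- createDataPartition(mtcars$mpg, p = 0.7, list = FALSE)
tr  <- mtcars[idx, ]
te  <- mtcars[-idx, ]

# Hypothetical model choice: two predictors, fitted on the training rows only
fit <- lm(mpg ~ wt + hp, data = tr)

holdout_rmse <- RMSE(predict(fit, newdata = te), te$mpg)
test_sd      <- sd(te$mpg)   # context: spread of the outcome on the test rows

c(rmse = holdout_rmse, test_sd = test_sd)
```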
metric = "RMSE" is the default for regression, but writing it out makes the choice obvious to a reviewer. Lower is better, and caret minimises RMSE automatically (no maximize = FALSE needed; minimisation is the default for RMSE). The $resample element of the fitted object holds the per-fold values, so you can plot them or compute a confidence interval.
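A cross-validated fit with the metric stated explicitly might look like the sketch below. The formula, the seed, and the choice of 5 folds are assumptions for illustration.

```r
library(caret)

set.seed(42)  # arbitrary seed
ctrl <- trainControl(method = "cv", number = 5)

cv_fit <- train(
  mpg ~ wt + hp,
  data      = mtcars,
  method    = "lm",
  metric    = "RMSE",   # the regression default, stated explicitly
  trControl = ctrl
)

fold_rmse <- cv_fit$resample$RMSE   # one RMSE per fold
fold_rmse
mean(fold_rmse)                     # average RMSE across the folds
```

With only 32 rows each fold is tiny, so expect the per-fold values to vary noticeably.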
Side-by-side RMSE on the same hold-out set is the cleanest regression comparison: same units, same observations, same metric. When the smaller model wins despite having fewer predictors, that is the signature of mild overfitting in the larger one.
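One way to set up such a comparison is sketched below. The two model formulas, the seed, and the split are assumptions, and which model comes out ahead depends on the split, so no winner is claimed here.

```r
library(caret)

set.seed(42)  # arbitrary seed
idx <- createDataPartition(mtcars$mpg, p = 0.7, list = FALSE)
tr  <- mtcars[idx, ]
te  <- mtcars[-idx, ]

small <- lm(mpg ~ wt + hp, data = tr)   # two predictors
large <- lm(mpg ~ ., data = tr)         # all ten predictors

# Score both models on the same hold-out rows
rmse_table <- c(
  small = RMSE(predict(small, newdata = te), te$mpg),
  large = RMSE(predict(large, newdata = te), te$mpg)
)
sort(rmse_table)   # lowest RMSE first
```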
RMSE vs MAE vs MAPE: which to report
Pick the metric whose penalty matches how your stakeholder feels about errors. All three measure regression error but weight it differently.
| Metric | Formula (mean of...) | Units | Outlier weight | Best when |
|---|---|---|---|---|
| RMSE | squared residuals, then sqrt | outcome units | Quadratic | Large misses are disproportionately costly |
| MAE | absolute residuals | outcome units | Equal | Every miss is equally bad; stakeholder-friendly |
| MAPE | absolute percent residuals | percent | Variable | Outcome magnitudes vary widely across rows |
RMSE answers "how bad are my worst predictions on average." MAE answers "on a typical row, how far off am I." MAPE answers "what fraction of the truth do I miss." Pick RMSE when extreme errors are catastrophic; pick MAE when you need a number you can explain to a product manager in one sentence.
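To see how the three metrics react to the same errors, the sketch below scores one invented prediction vector that contains a single large miss. RMSE() and MAE() come from caret; MAPE is written out in base R here.

```r
library(caret)

# Hypothetical data with one large miss (400 predicted as 300)
obs  <- c(100, 200, 50, 400, 250)
pred <- c(110, 190, 55, 300, 260)

rmse <- RMSE(pred, obs)                        # pulled up by the large miss
mae  <- MAE(pred, obs)                         # every miss weighted equally
mape <- mean(abs((obs - pred) / obs)) * 100    # percent, computed in base R

round(c(RMSE = rmse, MAE = mae, MAPE = mape), 2)
```

MAPE divides by obs, so it is undefined when any observed value is zero.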
The ratio RMSE / MAE indicates how unevenly error is spread across rows. A ratio near 1 means the misses are all about the same size; a ratio above 1.5 means a few large misses are driving RMSE up. Showing both numbers in the same row of your report gives readers more signal than either alone, and a sudden ratio jump in a refreshed dataset is an early warning that outliers have changed.
Common pitfalls
Three mistakes show up repeatedly in RMSE workflows. Each has a one-line fix.
Without na.rm = TRUE, a single NA in either vector wipes out the score. Drop or impute missing values explicitly before scoring, or pass na.rm = TRUE; do not assume the test set is clean just because train() succeeded.
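The NA behaviour described above can be reproduced with two short invented vectors. The explicit complete.cases() filter is one optional way to make the number of scored pairs visible.

```r
library(caret)

# Hypothetical vectors with one NA on each side
obs  <- c(3, 5, NA, 7)
pred <- c(2, 5, 4, NA)

rmse_default <- RMSE(pred, obs)                 # NA: missing values propagate
rmse_narm    <- RMSE(pred, obs, na.rm = TRUE)   # scores only complete pairs

# Explicit alternative: filter complete pairs so the row count is visible
ok <- complete.cases(pred, obs)
n_scored      <- sum(ok)
rmse_filtered <- RMSE(pred[ok], obs[ok])

c(n_scored = n_scored, rmse = rmse_filtered)
```

Reporting n_scored next to the score shows how much of the test set was actually evaluated.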
Do not confuse MSE with RMSE. The two numbers agree only when the mean squared residual is exactly 0 or 1; in general MSE = RMSE^2. They have different units: MSE is in squared units of the outcome, RMSE is back in the original units. Always report RMSE for human consumption; MSE is a calculator-stage value.
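The relationship between the two quantities, including the special case where they coincide, can be checked on invented vectors:

```r
library(caret)

obs <- c(10, 20, 30, 40)   # hypothetical observations

# General case: MSE and RMSE differ
pred <- c(12, 18, 33, 37)
mse  <- mean((pred - obs)^2)
rmse <- RMSE(pred, obs)
c(MSE = mse, RMSE = rmse)

# Special case: every miss has size 1, so the mean squared residual is 1
pred_unit <- obs + c(1, -1, 1, -1)
mse_unit  <- mean((pred_unit - obs)^2)
rmse_unit <- RMSE(pred_unit, obs)
c(MSE = mse_unit, RMSE = rmse_unit)
```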
RMSE is in the outcome's units, so the same number means very different things on different targets. Normalise by the outcome's standard deviation (NRMSE = RMSE / sd(obs)) or by its range before comparing models across datasets.
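The normalisation above can be sketched with two small helpers. nrmse_sd() and nrmse_range() are hypothetical names, not caret functions, and the data are invented so that the second target is the first one rescaled by 1000.

```r
library(caret)

# Hypothetical helpers (not part of caret)
nrmse_sd    <- function(pred, obs) RMSE(pred, obs) / sd(obs)
nrmse_range <- function(pred, obs) RMSE(pred, obs) / diff(range(obs))

# Same relative errors on two very different scales
obs_small  <- c(1, 2, 3, 4, 5)
pred_small <- c(1.1, 1.9, 3.2, 3.8, 5.1)
obs_big    <- obs_small * 1000
pred_big   <- pred_small * 1000

# Raw RMSE differs by the scale factor; the normalised versions agree
c(small = RMSE(pred_small, obs_small), big = RMSE(pred_big, obs_big))
c(small = nrmse_sd(pred_small, obs_small), big = nrmse_sd(pred_big, obs_big))
c(small = nrmse_range(pred_small, obs_small), big = nrmse_range(pred_big, obs_big))
```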
Try it yourself
Try it: Compute RMSE for a linear model predicting Petal.Length from Petal.Width on the iris dataset, using a 70/30 split. Save the value to ex_rmse.
Solution:
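One possible solution is sketched below. The seed is an arbitrary assumption, so the exact value of ex_rmse will depend on the split it produces.

```r
library(caret)

set.seed(123)  # arbitrary seed, assumed for reproducibility
idx <- createDataPartition(iris$Petal.Length, p = 0.7, list = FALSE)
tr  <- iris[idx, ]
te  <- iris[-idx, ]

# Fit on the training rows, score on the held-out rows
fit     <- lm(Petal.Length ~ Petal.Width, data = tr)
ex_rmse <- RMSE(predict(fit, newdata = te), te$Petal.Length)
ex_rmse
```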
Explanation: createDataPartition() stratifies on the outcome so train and test distributions match. With a single strong predictor (Petal.Width), RMSE on iris hovers near 0.47; adding Sepal.Length typically drops it below 0.4.
FAQ
What is a good RMSE value in R?
There is no universal threshold. A good RMSE is small relative to the spread of the outcome on the test set. Compare it to sd(obs), to a baseline (always predicting mean(obs)), or to a competing model's RMSE on the same hold-out rows. Rule of thumb: RMSE under half the test SD is a usable model; RMSE at or above the SD means the model has not learned anything useful.
How is RMSE different from MAE in caret?
Both RMSE() and MAE() summarise residuals in the outcome's units, but RMSE squares residuals before averaging and then takes the square root. The squaring penalises large misses much more heavily, so RMSE is always greater than or equal to MAE for the same data. The ratio RMSE / MAE flags outliers in the residual distribution: a ratio near 1 means errors are uniform, a ratio above 1.5 signals a few large misses driving RMSE up.
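The ratio can be illustrated with two invented prediction vectors, one with evenly sized misses and one dominated by a single large miss. The helper error_ratio() is a hypothetical name, not a caret function.

```r
library(caret)

# Hypothetical helper (not part of caret)
error_ratio <- function(pred, obs) RMSE(pred, obs) / MAE(pred, obs)

obs <- rep(10, 6)

pred_even  <- obs + c(1, -1, 1, -1, 1, -1)            # every miss has size 1
pred_spiky <- obs + c(0.2, -0.2, 0.2, -0.2, 0.2, -5)  # one large miss

ratio_even  <- error_ratio(pred_even, obs)    # equal-sized misses give a ratio of 1
ratio_spiky <- error_ratio(pred_spiky, obs)   # the single large miss pushes it above 1.5

c(even = ratio_even, spiky = ratio_spiky)
```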
Does caret RMSE handle NA values?
Yes, but only when you ask. The default is na.rm = FALSE, so any NA in pred or obs propagates and the result is NA. Pass na.rm = TRUE to drop pairs where either side is missing, or impute upstream so the test set has no gaps.
Why does caret pick RMSE as the default metric?
caret::train() calls defaultSummary() per fold, which returns RMSE, R-squared, and MAE for regression. RMSE drives tuning by default; it is a natural choice because it is in the outcome's units and penalises large misses. Override with metric = "MAE" or metric = "Rsquared" if RMSE is the wrong target for your use case.
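Overriding the selection metric might look like the sketch below. The k-nearest-neighbours method, the formula, the seed, and tuneLength = 3 are assumptions chosen so that there is a tuning parameter for the metric to select.

```r
library(caret)

set.seed(42)  # arbitrary seed
mae_fit <- train(
  mpg ~ wt + hp,
  data       = mtcars,
  method     = "knn",    # has a tuning parameter k, unlike plain lm
  metric     = "MAE",    # select k by mean absolute error instead of RMSE
  tuneLength = 3,
  trControl  = trainControl(method = "cv", number = 5)
)

mae_fit$metric                                      # the metric used for selection
mae_fit$results[, c("k", "RMSE", "Rsquared", "MAE")]
mae_fit$bestTune                                    # k with the lowest resampled MAE
```

All three regression metrics are still reported in the results table; the metric argument only changes which column picks the winning k.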
Can RMSE be used for classification in caret?
No. RMSE is a regression metric; it requires numeric inputs. For factor outcomes, use caret::confusionMatrix(pred, obs) for accuracy and kappa, or postResample(pred, obs), which switches metric sets based on input type. Calling RMSE() on factors does not produce a usable score: subtracting factors returns NA with a warning that the operation is not meaningful for factors, and character vectors raise an error.
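The way postResample() switches metric sets can be seen by calling it once with numeric vectors and once with factors. Both sets of inputs are invented for illustration.

```r
library(caret)

# Numeric inputs: regression metrics
num_obs  <- c(3, 5, 2.5, 7)
num_pred <- c(2.5, 5, 4, 8)
reg_stats <- postResample(num_pred, num_obs)
names(reg_stats)

# Factor inputs with matching levels: classification metrics
lv <- c("no", "yes")
fac_obs  <- factor(c("yes", "no", "yes", "no"), levels = lv)
fac_pred <- factor(c("yes", "no", "no",  "no"), levels = lv)
cls_stats <- postResample(fac_pred, fac_obs)
names(cls_stats)
```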
Related caret functions
- caret::MAE(): absolute-error sibling, equal weight per miss
- caret::R2(): proportion of variance explained
- caret::postResample(): RMSE, R-squared, and MAE in one call
- caret::defaultSummary(): the default summary function used inside train()
- caret::train(): fit and tune regression models with RMSE as the selection metric
For the official function reference, see the caret package documentation on CRAN.