yardstick huber_loss() in R: Robust Regression Loss
The yardstick huber_loss() function in R scores a regression model with a loss that is squared for small residuals and linear for large ones, giving you the gradient behaviour of RMSE near the centre and the outlier tolerance of MAE in the tail.
huber_loss(df, truth, estimate)                        # default delta = 1
huber_loss(df, truth = obs, estimate = pred)           # named arguments
huber_loss(df, solubility, prediction, delta = 0.5)    # tighter quadratic zone
df |> group_by(fold) |> huber_loss(obs, pred)          # by resample
huber_loss(df, obs, pred, na_rm = TRUE)                # drop missing rows
huber_loss_vec(truth_vec, pred_vec, delta = 1)         # vector interface
huber_loss(df, obs, pred, case_weights = w)            # weighted Huber
Need explanation? Read on for examples and pitfalls.
What huber_loss() measures
huber_loss() blends squared and absolute error using a single threshold delta. You pass a data frame with the observed numeric outcome, the predicted values, and an optional delta (default 1). For each row the function applies 0.5 * residual^2 when the absolute residual is below delta, and delta * (abs(residual) - 0.5 * delta) above it. The mean of those per-row losses lands in a one-row tibble with .metric, .estimator, and .estimate.
The two pieces meet smoothly at the threshold, so the loss has a continuous first derivative, which is what gradient-based learners want. The linear tail means a single 50-unit outlier no longer dominates the score the way it does for RMSE. Tune delta to set how many outcome units count as "normal noise" before the linear penalty kicks in.
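The per-row rule above can be checked by hand. The sketch below is a minimal base-R version of the formula; huber_by_hand() is an illustrative helper written for this article, not a yardstick function, and the four toy values are made up so the arithmetic is easy to follow.

```r
library(yardstick)

# Illustrative helper (not part of yardstick): the piecewise rule, then the mean
huber_by_hand <- function(truth, estimate, delta = 1) {
  r <- truth - estimate
  per_row <- ifelse(
    abs(r) <= delta,
    0.5 * r^2,                        # quadratic zone
    delta * (abs(r) - 0.5 * delta)    # linear tail
  )
  mean(per_row)
}

# Made-up toy data: residuals are -0.5, 0, 1 and -5
truth    <- c(1.0, 2.0, 3.0, 4.0)
estimate <- c(1.5, 2.0, 2.0, 9.0)

# Per-row losses: 0.125, 0, 0.5, 4.5, so the mean is 1.28125
by_hand   <- huber_by_hand(truth, estimate, delta = 1)
from_pkg  <- huber_loss_vec(truth, estimate, delta = 1)
c(by_hand = by_hand, yardstick = from_pkg)
```

The residual of -5 contributes 4.5 rather than the 12.5 it would add under half squared error, which is the outlier damping in action.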
Unlike mae() or rmse(), which report errors on the outcome scale, huber_loss() returns a quantity whose interpretation depends on delta. Compare huber_loss values only across models scored with the same delta on the same target.
huber_loss() syntax and arguments
The signature follows the yardstick numeric-metric convention with one extra parameter. Once you understand delta, the rest matches rmse(), mae(), and the related regression scorers.
| Argument | Description |
|---|---|
| data | A data frame with the truth and estimate columns. |
| truth | Unquoted column name of the observed numeric outcome. |
| estimate | Unquoted column name of the predicted numeric values. |
| delta | Positive threshold where the loss switches from quadratic to linear. Default 1. |
| na_rm | If TRUE, drop rows with missing truth or estimate before scoring. |
| case_weights | Optional column of non-negative row weights for survey or importance-weighted data. |
Truth and estimate must both be numeric. Factor, character, or logical inputs error out. Use accuracy() or roc_auc() for classifiers.
Score regression models: four examples
The examples below use yardstick's built-in solubility_test data. First, load the package and inspect a few rows of the prediction set.
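A minimal setup sketch, assuming yardstick is installed. solubility_test carries the observed outcome in solubility and the model output in prediction.

```r
library(yardstick)

# Built-in prediction set: observed and predicted log-solubility
head(solubility_test)

# Column names used throughout the examples
sol_cols <- names(solubility_test)
sol_cols
```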
Example 1 calls huber_loss() with positional arguments and the default delta. Truth and estimate are picked up by position, and the tidy summary lands in three columns.
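A sketch of that call; the object name ex1 is only there for illustration.

```r
library(yardstick)

# Positional arguments: data, truth, estimate; delta stays at its default of 1
ex1 <- huber_loss(solubility_test, solubility, prediction)
ex1
```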
The .estimator column reports standard because Huber has no multiclass variant. The estimate of 0.221 is in log-solubility-loss units at delta = 1.
Example 2 varies delta to see how the loss changes shape. Small delta acts like MAE scaled by delta; large delta acts like half the MSE.
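One way to run that scan is sketched below. The grid of deltas is arbitrary, and the two extra columns are reference values added for comparison: delta times the MAE for the small-delta end, and half the MSE for the large-delta end.

```r
library(yardstick)

deltas <- c(0.1, 0.5, 1, 2, 5)   # arbitrary grid chosen for illustration

# One huber_loss_vec() call per delta, collected in a plain data frame
delta_scan <- data.frame(
  delta = deltas,
  huber = sapply(deltas, function(d) {
    huber_loss_vec(solubility_test$solubility, solubility_test$prediction, delta = d)
  })
)

# Reference values for the two limiting cases
resid <- solubility_test$solubility - solubility_test$prediction
delta_scan$delta_times_mae <- deltas * mean(abs(resid))
delta_scan$half_mse        <- 0.5 * mean(resid^2)

delta_scan
```

Reading across a row shows which limit the score is closer to at that delta.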
As delta grows, the quadratic zone widens and the score rises toward 0.5 * mean(residual^2). As delta shrinks, fewer rows sit in the quadratic zone and the score approaches delta * mean(abs(residual)).
Example 3 groups scoring by resample fold. With cross-validation predictions in one tibble, group_by() plus huber_loss returns one score per fold.
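solubility_test has no resample column, so the sketch below attaches a made-up four-level fold id purely for illustration. With real cross-validation output you would group by the id column your resampling tool produced.

```r
library(yardstick)
library(dplyr)

# Hypothetical fold assignment: random, balanced, four levels
set.seed(123)
cv_preds <- solubility_test |>
  mutate(fold = sample(rep(paste0("Fold", 1:4), length.out = n())))

# Grouped data frames are respected: one row of output per fold
fold_scores <- cv_preds |>
  group_by(fold) |>
  huber_loss(solubility, prediction)

fold_scores
```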
Example 4 uses the vector interface for scalar output. Inside map() calls or unit tests, huber_loss_vec() skips the tibble wrapping.
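A short sketch of the vector call, with a check that it agrees with the data-frame interface. The names h and h_df are illustrative.

```r
library(yardstick)

# Vector interface: two numeric vectors in, one number out
h <- huber_loss_vec(
  truth    = solubility_test$solubility,
  estimate = solubility_test$prediction,
  delta    = 1
)
h

# Same value as the .estimate column of the data-frame interface
h_df <- huber_loss(solubility_test, solubility, prediction)$.estimate
all.equal(h, h_df)
```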
Use the vector form for bare numbers; stay with the data-frame form to bind, group, or plot scores alongside the rest of the yardstick family.
metrics(df, truth, estimate) does not include huber_loss by default, but metric_set(rmse, mae, huber_loss) builds a custom scorer that returns all three in a tidy stack, ready for ggplot() or gt().
huber_loss() compared with related losses
Pick a partner from the table when huber_loss alone is not enough.
| Metric | Best use case | Limitation |
|---|---|---|
| huber_loss() | Outlier-tolerant training or scoring with a single tunable threshold | Score scale depends on delta, not directly the outcome unit |
| rmse() | Headline error in outcome units, large misses matter | Heavily penalises outliers |
| mae() | Outlier-robust error in outcome units | Penalty grows only linearly, so large misses get no extra weight |
| huber_loss_pseudo() | Twice-differentiable smooth approximation for gradient methods | Slightly different value than huber_loss for the same data |
| mape() | Percentage error reporting | Explodes when truth is near zero |
| rsq() | Unit-free goodness-of-fit | Can mask large systematic bias |
A common pattern is to train on huber_loss for robustness, then report rmse and mae for interpretation.
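The reporting half of that pattern can be sketched with metric_set(). The name reg_metrics is illustrative, and huber_loss runs at its default delta of 1 here.

```r
library(yardstick)

# Bundle three regression scorers into one callable metric function
reg_metrics <- metric_set(rmse, mae, huber_loss)

# One row per metric, same three columns as a single-metric call
scores <- reg_metrics(solubility_test, truth = solubility, estimate = prediction)
scores
```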
Choosing delta
Delta is the residual size at which the loss flips from quadratic to linear. Two rules cover most cases. The quantile rule sets delta to the 0.9 quantile of held-out absolute residuals, so roughly 90 percent of points stay quadratic. The noise-scale rule sets delta = 1.345 * sigma, where sigma is the residual standard deviation; this is the classic tuning constant that gives Huber's M-estimator about 95 percent efficiency when the errors are Gaussian. Either way, score and train with the same delta; mixing values gives numbers that cannot be compared.
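Both rules fit in a few lines. The sketch below makes two assumptions that are not from the text above: it treats solubility_test as the held-out set, and it estimates sigma with mad(), a robust scale estimate, rather than sd().

```r
library(yardstick)

resid     <- solubility_test$solubility - solubility_test$prediction
abs_resid <- abs(resid)

# Quantile rule: about 90 percent of residuals stay in the quadratic zone
delta_q <- unname(quantile(abs_resid, 0.9))

# Noise-scale rule: 1.345 * sigma, with sigma estimated by the MAD (assumption)
delta_s <- 1.345 * mad(resid)

c(quantile_rule = delta_q, noise_scale_rule = delta_s)

# Score with each candidate; the two scores are not comparable with each other
huber_loss(solubility_test, solubility, prediction, delta = delta_q)
huber_loss(solubility_test, solubility, prediction, delta = delta_s)
```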
Common pitfalls
Three small mistakes account for most huber_loss() failures. Each has a one-line fix.
The first is comparing huber_loss values across different deltas. The loss scale shifts with delta, so a tighter delta produces a smaller number even when the model is unchanged. Lock delta before any comparison.
The second is treating huber_loss as an error in outcome units. RMSE of 0.7 on log-solubility means typical residuals near 0.7; a huber_loss of 0.22 is a delta-dependent blend of squared and linear pieces. Always cite delta next to the score.
The third is using a tiny positive delta like 1e-12. The loss collapses to almost zero and hides genuine error. Pick a delta near the typical residual magnitude.
Try it yourself
Try it: Use the built-in solubility_test data. Compute Huber loss at three deltas: 0.5, 1, and 2. Stack the three results into a tibble with columns delta and .estimate. Save the stacked table to ex_huber_delta.
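One possible solution, sketched with dplyr for the stacking step. Any approach that yields the two requested columns works equally well.

```r
library(yardstick)
library(dplyr)

# Three one-row tibbles stacked, then labelled with the delta that produced them
ex_huber_delta <- bind_rows(
  huber_loss(solubility_test, solubility, prediction, delta = 0.5),
  huber_loss(solubility_test, solubility, prediction, delta = 1),
  huber_loss(solubility_test, solubility, prediction, delta = 2)
) |>
  mutate(delta = c(0.5, 1, 2)) |>
  select(delta, .estimate)

ex_huber_delta
```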
Explanation: Each call to huber_loss() returns a one-row tibble, and bind_rows() stacks the three calls into a single table. The estimate rises monotonically with delta because more residuals fall into the quadratic zone, where the squared term carries a higher cost than the linear tail.
Related yardstick metrics
huber_loss() sits among yardstick's regression scorers. Reach for these when huber_loss alone is not enough:
- huber_loss_pseudo() for a smooth, twice-differentiable cousin
- rmse() for headline squared error in outcome units
- mae() for the fully outlier-robust absolute error
- mape() for percentage error reporting
- rsq() for unit-free goodness-of-fit
- metric_set() to bundle huber_loss with rmse and mae
For the full list, see the yardstick reference index.
FAQ
What is a good huber_loss value?
There is no universal threshold. The score depends on both delta and the spread of your target, so a "good" value is one that beats a baseline model scored with the same delta. Always report huber_loss next to rmse and mae, and pin delta in the report.
How is huber_loss() different from rmse() and mae()?
RMSE squares every residual, so one big miss can dominate the score. MAE takes absolute values, so a 100-unit residual contributes 100, not 10,000. Huber sits between the two: residuals below delta are squared, residuals above delta are linear, combining RMSE-like gradients near zero with MAE-like outlier resistance in the tail.
What is the difference between huber_loss() and huber_loss_pseudo()?
huber_loss() is the piecewise function that switches from quadratic to linear at delta; its first derivative is continuous, but its second derivative jumps at the threshold. huber_loss_pseudo() uses delta^2 * (sqrt(1 + (r / delta)^2) - 1), a smooth approximation that is twice differentiable everywhere. Pseudo-Huber is safer for gradient-based optimisers that rely on second derivatives; the original Huber is fine for scoring.
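A small side-by-side makes the gap concrete. The residual grid below is made up for illustration, and both formulas are written out in base R before the yardstick calls.

```r
library(yardstick)

r     <- c(-5, -1, -0.1, 0, 0.1, 1, 5)   # made-up residuals
delta <- 1

# Piecewise Huber and its smooth pseudo-Huber approximation, per residual
huber  <- ifelse(abs(r) <= delta, 0.5 * r^2, delta * (abs(r) - 0.5 * delta))
pseudo <- delta^2 * (sqrt(1 + (r / delta)^2) - 1)

data.frame(r, huber, pseudo)

# The same two losses from yardstick on the built-in data
huber_loss(solubility_test, solubility, prediction, delta = 1)
huber_loss_pseudo(solubility_test, solubility, prediction, delta = 1)
```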
Can I weight rows when computing huber_loss?
Yes. Pass a case_weights column to huber_loss() with non-negative row weights. Common uses are survey weights and importance weights, for example a column created with hardhat::importance_weights(). The function returns the weighted mean of per-row Huber losses on the same delta-dependent scale.
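A sketch of a weighted call, assuming a yardstick version that supports case_weights. The weight column w is invented for illustration (the first half of the rows counts double), and the hand computation uses weighted.mean() to show what the weighted score represents.

```r
library(yardstick)
library(dplyr)

# Made-up weights: first half of the rows gets weight 2, the rest weight 1
weighted_df <- solubility_test |>
  mutate(w = if_else(row_number() <= n() / 2, 2, 1))

weighted_score <- huber_loss(
  weighted_df, solubility, prediction,
  case_weights = w
)$.estimate

# By hand: per-row Huber losses at delta = 1, then a weighted mean
r       <- weighted_df$solubility - weighted_df$prediction
per_row <- ifelse(abs(r) <= 1, 0.5 * r^2, abs(r) - 0.5)
by_hand <- weighted.mean(per_row, weighted_df$w)

c(yardstick = weighted_score, by_hand = by_hand)
```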
Summary
huber_loss() is the robust middle ground between rmse() and mae(). Use it when you want squared-error gradients for typical points and absolute-error tolerance for outliers. Tune delta from the residual scale, and report it next to rmse and mae so readers can interpret the value in outcome units. Pair with group_by() for per-fold scoring and metric_set() to bundle it into a tidymodels workflow.