yardstick smape() in R: Symmetric Percentage Error Metric

The yardstick smape() function in R returns the symmetric mean absolute percentage error of a regression model. It divides each absolute residual by the average of |truth| and |estimate|, so the result is bounded between 0 and 200 percent and stays stable when the outcome sits near zero.

By Selva Prabhakaran · Published May 23, 2026 · Last updated May 23, 2026

⚡ Quick Answer

smape(df, truth, estimate)                         # basic call
smape(df, truth = obs, estimate = pred)            # named arguments
smape(df, sales, forecast)                         # forecast columns
df |> group_by(series) |> smape(obs, pred)         # by group or series
smape(df, obs, pred, na_rm = TRUE)                 # drop missing rows
smape_vec(truth_vec, pred_vec)                     # vector interface
smape(df, obs, pred, case_weights = w)             # weighted SMAPE

Need explanation? Read on for examples and pitfalls.

What smape() measures

smape() averages the absolute residual scaled by the half-sum of |truth| and |estimate|. You pass a data frame with the observed numeric outcome and the predicted values, and the function returns a one-row tibble with .metric, .estimator, and .estimate. The estimate reads as a percent and is bounded between 0 and 200.

The denominator is (|truth| + |estimate|) / 2, not truth alone. That swap makes SMAPE symmetric under exchanging truth and estimate, and a truth near zero paired with a larger estimate no longer produces an unbounded percentage: the worst a single row can score is 200 percent. The trade is that the result is no longer "percent of truth", so the business reading drifts away from MAPE's.
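The formula described above can be sketched in base R without loading yardstick. This is a minimal illustration under the half-sum convention; the helper name smape_by_hand and the toy vectors are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of yardstick): SMAPE computed by hand.
smape_by_hand <- function(truth, estimate) {
  numer <- abs(estimate - truth)              # absolute residual per row
  denom <- (abs(truth) + abs(estimate)) / 2   # half-sum of magnitudes
  mean(numer / denom) * 100                   # average, scaled to percent
}

# Toy data, made up for illustration
truth    <- c(10, 20, 40)
estimate <- c(12, 18, 40)

smape_by_hand(truth, estimate)
#> [1] 9.569378
```

The three rows contribute 2/11, 2/19, and 0, so the mean is about 0.0957, or 9.57 percent.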

Key Insight

SMAPE replaces division by truth with division by the average magnitude. That single change bounds the metric at 200 percent, makes the score identical when truth and estimate swap places, and stops near-zero truths from blowing up the headline. The trade is a less intuitive scale: 40 percent SMAPE does not mean predictions are off by 40 percent of the truth.
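A one-row example makes the scale difference concrete. The numbers below are made up for illustration: a truth of 100 and an estimate of 150 miss by 50 percent of the truth, yet the same row scores 40 percent under SMAPE.

```r
# Toy pair, chosen for illustration only
truth    <- 100
estimate <- 150

# Percent of truth (the MAPE-style reading)
pct_of_truth <- abs(estimate - truth) / abs(truth) * 100                          # 50

# SMAPE contribution for the same row
smape_row <- abs(estimate - truth) / ((abs(truth) + abs(estimate)) / 2) * 100     # 40

c(pct_of_truth = pct_of_truth, smape = smape_row)
```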

smape() syntax and arguments

The signature matches the rest of yardstick's numeric metrics. Once you know the shape, the same call works for mae(), rmse(), mape(), and the rest of the regression family.

R: smape() generic signature

smape(data, truth, estimate, na_rm = TRUE, case_weights = NULL, ...)

Argument	Description
`data`	A data frame holding the truth and estimate columns.
`truth`	Unquoted column name of the observed numeric outcome.
`estimate`	Unquoted column name of the predicted numeric values.
`na_rm`	If `TRUE`, drop rows where either column is missing before scoring.
`case_weights`	Optional column of row weights for survey or importance-weighted data.

Both columns must be numeric. Output is scaled as a percent (22.4 means 22.4 percent) and can exceed 100 when one side of the pair is much larger than the other.
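As a rough sketch of how na_rm and case_weights change the result, the block below scores a small hypothetical data frame three ways. The data frame scores and its column names are invented for illustration; the calls themselves follow the signature shown above.

```r
library(yardstick)

# Hypothetical toy data: one missing truth value and a weight column.
scores <- data.frame(
  obs  = c(10, 20, NA, 40),
  pred = c(12, 18, 33, 40),
  w    = c(1, 1, 1, 2)
)

dropped  <- smape(scores, obs, pred)                    # na_rm = TRUE by default
kept     <- smape(scores, obs, pred, na_rm = FALSE)     # missing row propagates to NA
weighted <- smape(scores, obs, pred, case_weights = w)  # last row counts twice

rbind(dropped, kept, weighted)
```

With the missing row dropped, the unweighted score averages three rows; the weighted score is lower here because the perfectly predicted last row carries double weight.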

SMAPE in action: four worked examples

The examples fit a simple lm on mtcars and score in-sample predictions. Build the prediction frame first.

R: Load yardstick and build prediction frame

library(yardstick)
library(dplyr)

set.seed(42)
fit <- lm(mpg ~ wt + cyl, data = mtcars)

preds <- tibble(
  actual = mtcars$mpg,
  pred   = predict(fit, mtcars)
)

head(preds, 4)
#> # A tibble: 4 x 2
#>   actual  pred
#>    <dbl> <dbl>
#> 1   21    22.3
#> 2   21    21.5
#> 3   22.8  26.3
#> 4   21.4  20.4

Example 1 calls smape() with positional arguments. The function locates truth and estimate by position and returns the tidy summary.

R: Basic SMAPE on mtcars predictions

smape(preds, actual, pred)
#> # A tibble: 1 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 smape   standard        11.7

The .estimator is standard because SMAPE has no binary or multiclass variant. The 11.7 reads as 11.7 percent and sits close to MAPE on this dataset because truth values stay well away from zero, the regime where the two metrics agree.

Example 2 shows what happens when truth approaches zero. Adding a row where truth is 0.01 leaves SMAPE stable, where MAPE would spike.

R: SMAPE stays stable when truth is near zero

risky <- bind_rows(
  preds,
  tibble(actual = 0.01, pred = 0.50)
)

smape(risky, actual, pred)
#> # A tibble: 1 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 smape   standard        17.4

The headline moved from 11.7 to 17.4, not to roughly 160 the way MAPE would on the same row. The bounded denominator absorbed the small truth instead of dividing by it.
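To see why one row moves the two metrics so differently, it helps to compute that row's contribution under each formula. The sketch below uses base R only and the same pair as the appended row (truth 0.01, estimate 0.50); the variable names are illustrative.

```r
# The appended row from the example above
truth    <- 0.01
estimate <- 0.50

# MAPE-style contribution: residual divided by the tiny truth
mape_row <- abs(estimate - truth) / abs(truth) * 100                            # 4900

# SMAPE contribution: residual divided by the half-sum, capped at 200
smape_row <- abs(estimate - truth) / ((abs(truth) + abs(estimate)) / 2) * 100   # about 192.16

c(mape_row = mape_row, smape_row = smape_row)
```

A single row contributing 4900 percent dominates any average it enters, while a contribution just under 200 percent only nudges it.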

Example 3 groups scoring across folds or product lines. When cross-validation predictions or per-segment forecasts live in one tibble, group_by() plus smape() returns one percentage per group.

R: Per-fold SMAPE from a single prediction tibble

folded <- preds |>
  mutate(fold = rep(paste0("fold", 1:4), length.out = n()))

folded |>
  group_by(fold) |>
  smape(truth = actual, estimate = pred)
#> # A tibble: 4 x 4
#>   fold  .metric .estimator .estimate
#>   <chr> <chr>   <chr>          <dbl>
#> 1 fold1 smape   standard        11.0
#> 2 fold2 smape   standard        12.5
#> 3 fold3 smape   standard        11.2
#> 4 fold4 smape   standard        12.1

Example 4 uses the vector interface for quick checks. Inside map() calls or unit tests, smape_vec() returns a plain scalar instead of a one-row tibble.

R: Vector interface returns a numeric scalar

smape_vec(preds$actual, preds$pred)
#> [1] 11.7204

Use the vector form for scalar thresholds; otherwise stay with the data-frame form to bind, group, or plot.
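A threshold check of that kind might look like the sketch below. The vectors and the 15 percent tolerance max_smape are hypothetical values chosen for illustration, not a recommended cutoff.

```r
library(yardstick)

# Toy vectors, made up for illustration
obs  <- c(10, 20, 40)
pred <- c(12, 18, 40)

max_smape <- 15   # hypothetical tolerance, pick one that suits your project

score <- smape_vec(obs, pred)

# Fails loudly if the score is not a single number under the tolerance
stopifnot(is.numeric(score), length(score) == 1, score < max_smape)
```

Because smape_vec() returns a bare number, it drops straight into stopifnot() or a test expectation without pulling .estimate out of a tibble.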

Tip

Report SMAPE alongside MAE when truth crosses zero. SMAPE handles the small denominator gracefully, but MAE in outcome units anchors the magnitude. The pair gives reviewers a bounded percent and a unit-correct error in one glance.

When to pick smape() over its neighbors

SMAPE is the metric for percent error when truths sit near zero. The table picks the right neighbor otherwise.

Metric	Best use case	Limitation
`smape()`	Bounded percent error stable near zero	Less intuitive than MAPE; penalizes under-prediction more than over-prediction of the same size
`mape()`	Plain percent-of-truth headline, truth far from zero	Explodes near zero; asymmetric to under and over-prediction
`mae()`	Outlier-robust error in outcome units	Not comparable across targets with different scales
`rmse()`	Punishes large misses harder than small ones	Sensitive to outliers; not scale-free
`mase()`	Scale-free comparison across many time series	Requires a naive baseline forecast
`rsq()`	Unit-free 0-to-1 goodness-of-fit	Can mask large systematic bias

A common pairing is SMAPE plus MAE: SMAPE for a bounded percent that survives small truths, MAE for raw error in the outcome's own units.
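One way to produce that pairing in a single call is yardstick's metric_set(), sketched here on a small made-up data frame. The names scores and smape_mae are illustrative.

```r
library(yardstick)

# Toy data, made up for illustration
scores <- data.frame(
  obs  = c(10, 20, 40),
  pred = c(12, 18, 40)
)

# Bundle the two metrics into one scoring function
smape_mae <- metric_set(smape, mae)

paired <- smape_mae(scores, truth = obs, estimate = pred)
paired
```

The result is a two-row tibble, one row per metric, so it binds and groups the same way a single smape() call does.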

Common pitfalls

Three SMAPE mistakes show up repeatedly in forecasting reports. Each has a one-line fix.

The first is reading SMAPE as "percent of truth". It is not. A SMAPE of 40 percent does not mean predictions are 40 percent off the actual value, because the denominator is the average of truth and estimate. Report SMAPE next to MAE so readers anchor on unit-correct error.

The second is treating SMAPE as fully symmetric. Swapping truth and estimate gives the same score, but an under-prediction and an over-prediction of the same absolute size do not: the under-prediction shrinks the denominator, so it scores higher. Pair SMAPE with mean(estimate - truth) to expose direction bias.
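The asymmetry is easy to check by hand. In this base-R sketch the helper smape_row is hypothetical and scores a single pair; the truth of 100 and the misses of 10 units are made-up numbers.

```r
# Hypothetical single-pair helper (not part of yardstick)
smape_row <- function(truth, estimate) {
  abs(estimate - truth) / ((abs(truth) + abs(estimate)) / 2) * 100
}

over    <- smape_row(100, 110)   # over-predict by 10:  10 / 105, about 9.52
under   <- smape_row(100,  90)   # under-predict by 10: 10 / 95,  about 10.53
swapped <- smape_row(110, 100)   # swap roles: same score as `over`

c(over = over, under = under, swapped = swapped)
```

Both misses are 10 units, but the under-prediction scores about one point higher, while swapping the two arguments changes nothing.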

The third is comparing SMAPE across tools. Some textbooks divide by (|truth| + |estimate|) instead of half that sum, which halves the headline. yardstick uses the bounded-near-200 percent convention; confirm the formula before copying values across tools.

R: Audit direction bias when reporting SMAPE

bias <- mean(preds$pred - preds$actual)
smape_val <- smape_vec(preds$actual, preds$pred)

c(bias = bias, smape = smape_val)
#>    bias   smape
#>  0.0000 11.7204

Warning

SMAPE hits 200 percent whenever truth and estimate have opposite signs. If the outcome can be negative and the model predicts a positive (or vice versa), the residual |estimate - truth| equals |truth| + |estimate|, exactly twice the denominator, so that row scores 200 percent no matter how small the miss is in raw units. Inspect sign agreement before trusting a high SMAPE reading.
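A quick base-R check of the sign-flip behaviour, using made-up pairs in which truth and estimate always sit on opposite sides of zero. The helper smape_rows is hypothetical and returns the per-row contributions.

```r
# Hypothetical helper returning the per-row SMAPE contributions
smape_rows <- function(truth, estimate) {
  abs(estimate - truth) / ((abs(truth) + abs(estimate)) / 2) * 100
}

# Toy pairs with opposite signs, from a large miss to a tiny one
truth    <- c(-1,  2.0, -0.001)
estimate <- c( 3, -0.1,  0.001)

flipped <- smape_rows(truth, estimate)
flipped   # every row is 200 (up to floating-point rounding)

# Share of rows whose signs disagree, a simple audit before reporting
mean(sign(truth) != sign(estimate))
```

The last pair misses by only 0.002 in raw units yet still scores the maximum, which is why a sign-agreement check belongs next to any high SMAPE reading.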

Try it yourself

Try it: Use the mtcars lm fit from above. Build a small forecast tibble with one row where actual = 0.5 and pred = 0.6, append it to preds, and compute both SMAPE and MAPE. Save the comparison to ex_smape_vs_mape.

R: Your turn: compare SMAPE and MAPE near zero

library(yardstick)
library(dplyr)

# Try it: compare SMAPE and MAPE after appending a small-truth row
ex_smape_vs_mape <- # your code here

ex_smape_vs_mape
#> Expected: 2 rows, one per metric

Click to reveal solution

R: Solution

risky <- bind_rows(
  preds,
  tibble(actual = 0.5, pred = 0.6)
)

ex_smape_vs_mape <- bind_rows(
  smape(risky, actual, pred),
  mape(risky, actual, pred)
)

ex_smape_vs_mape
#> # A tibble: 2 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 smape   standard        12.0
#> 2 mape    standard        12.4

Explanation: One small-truth row barely moves SMAPE (11.7 to 12.0) while MAPE rises faster because it divides by the small truth directly.

Related yardstick metrics

smape() sits inside yardstick's regression family. Reach for these neighbors when SMAPE is not the right fit:

mape() for percent-of-truth when truths stay well away from zero
mae() for outlier-robust error in the outcome's units
rmse() to punish large misses harder than small ones
mase() for scale-free comparison across many series
rsq() for a unit-free 0-to-1 goodness-of-fit score
metrics() to compute rmse(), rsq(), and mae() in one call, or metric_set() to build a bundle that includes smape()

For the full set, see the yardstick reference index.

FAQ

What is a good SMAPE value?

Bands are looser than for MAPE because SMAPE is capped at 200 percent. As a rough rule of thumb: below 10 percent is excellent, 10 to 30 percent is good for noisy targets, 30 to 80 percent is workable, and above 100 percent usually signals sign-flips or scale misses. Anchor SMAPE with MAE for unit-correct context.

How is smape() different from mape()?

MAPE divides each absolute residual by truth, so a small truth blows up the headline and over-prediction stays unbounded. SMAPE divides by the average of |truth| and |estimate|, which bounds the metric and keeps it stable when truth approaches zero. Use SMAPE when truths cross or sit near zero; use MAPE for a cleaner percent-of-truth reading when truths stay well away from zero.

Does yardstick scale smape() to 0 to 100 or 0 to 200?

yardstick divides by the half-sum of |truth| and |estimate|, putting the upper bound near 200 percent. Some textbooks divide by the full sum and report half that value, so confirm the formula before comparing across tools.
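The two conventions differ by exactly a factor of two, which this base-R sketch shows on made-up vectors. The variable names half_sum and full_sum are illustrative labels for the two denominators.

```r
# Toy data, made up for illustration
truth    <- c(10, 20, 40, 5)
estimate <- c(12, 18, 40, 9)

resid <- abs(estimate - truth)
total <- abs(truth) + abs(estimate)

half_sum <- mean(resid / (total / 2)) * 100   # half-sum denominator, 0 to 200 scale
full_sum <- mean(resid / total) * 100         # full-sum denominator, 0 to 100 scale

c(half_sum = half_sum, full_sum = full_sum, ratio = half_sum / full_sum)
```

If a value copied from another tool is about half of what yardstick reports on the same data, the full-sum convention is the likely cause.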

Can smape() be used with negative truth values?

It runs, and the absolute values in the denominator avoid dividing by a signed near-zero quantity. But when truth and estimate sit on opposite sides of zero, the residual equals |truth| + |estimate|, so that row scores the full 200 percent regardless of how small the miss is. For sign-flipping series, prefer mae() for error in raw units.

Summary

smape() is the bounded percentage scorecard in yardstick's regression family, symmetric under swapping truth and estimate. Reach for it when truths sit near or cross zero, pair it with mae() to anchor magnitude, and remember the reading is not "percent of truth" but a stability-friendly cousin. With group_by() it gives per-segment percentages; with metric_set() it slots into a multi-metric report.
