yardstick roc_curve() in R: ROC Sweep Data for Plotting
The yardstick roc_curve() function in R returns the sensitivity and specificity at every probability threshold a classifier produces, giving you a tidy tibble that plugs straight into autoplot(), ggplot2, group_by() for resamples, and multiclass one-vs-all decomposition.
    roc_curve(df, truth, .pred_class1)                       # basic two-class call
    roc_curve(df, truth = obs, .pred_class1)                 # named truth column
    roc_curve(df, class, .pred_yes, event_level = "second")  # flip positive class
    df |> group_by(fold) |> roc_curve(class, .pred_yes)      # by resample
    roc_curve(df, class, .pred_a, .pred_b, .pred_c)          # multiclass one-vs-all
    autoplot(roc_curve(df, class, .pred_yes))                # quick ggplot
    roc_curve(df, class, .pred_yes) |> ggplot(aes(1 - specificity, sensitivity)) + geom_path()
Need explanation? Read on for examples and pitfalls.
What roc_curve() returns
roc_curve() turns predicted probabilities into a threshold sweep, not a single score. You hand it a data frame with a truth column and one or more probability columns, and it returns a tibble with .threshold, specificity, and sensitivity columns, one row per unique cutoff. Each row says "if you classify everything above this probability as positive, here is how the model performs."
That tibble is the raw material for the ROC plot. Pass it to autoplot() for a ready-made chart, or pipe it into ggplot2 for full styling control. Because the output is tidy, it composes with group_by() and dplyr verbs without extra glue code.
roc_curve() syntax and arguments
The signature mirrors the rest of the yardstick probability family. Argument shape changes between binary and multiclass: binary takes one probability column, multiclass takes one column per class.
| Argument | Description |
|---|---|
| data | Data frame with the truth column and probability columns. |
| truth | Unquoted column name of the observed class labels (must be a factor). |
| ... | Unquoted probability columns. One column for binary, one per class for multiclass. |
| na_rm | If TRUE, drop rows where any column is missing before computing. |
| event_level | "first" or "second"; for binary, names which factor level is the positive class. |
| case_weights | Optional unquoted column for weighted curve points. |
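For multiclass problems, supply one probability column per truth level, in the same order as the factor levels. Following the .pred_<level> naming convention keeps that pairing easy to verify. For binary problems, the third argument is the probability of the positive class, and event_level says which factor level that is.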
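To see what event_level does in practice, here is a minimal sketch on made-up data. The frame df, its columns, the seed, and the Beta parameters are all assumptions for illustration. The positive class "yes" is deliberately placed second in the factor levels.

```r
library(tibble)
library(yardstick)

set.seed(1)  # arbitrary seed, for reproducibility only
n <- 300

# Hypothetical data: "yes" is the SECOND factor level
truth <- factor(sample(c("no", "yes"), n, replace = TRUE), levels = c("no", "yes"))
df <- tibble(
  truth = truth,
  # Scores lean high for "yes" rows and low for "no" rows
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

levels(df$truth)  # check which level comes first before choosing event_level

# Default event_level = "first" treats "no" as the event, so the scores point the wrong way
auc_default <- roc_auc(df, truth, .pred_yes)

# Telling yardstick that the second level is the event fixes the orientation
auc_second <- roc_auc(df, truth, .pred_yes, event_level = "second")
curve_second <- roc_curve(df, truth, .pred_yes, event_level = "second")

auc_default$.estimate
auc_second$.estimate
```

With continuous scores and no ties, the two AUC values should sum to 1, which is a quick way to confirm that a below-diagonal curve is an orientation problem and not a bad model.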
Plot the curve: four worked examples
These examples build a small two-class tibble so the curve is reproducible. Start with a synthetic churn-prediction frame.
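The data-building code is not shown here, so the following is a minimal sketch of what such a frame could look like. The names churn, churned, .pred_yes, .pred_no, and fold, along with the seed and the Beta parameters, are assumptions chosen for illustration.

```r
library(tibble)

set.seed(42)  # arbitrary seed so the curve is reproducible
n <- 200

churn <- tibble(
  # "yes" is the first level, so it is the positive class under event_level = "first"
  churned = factor(
    sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
    levels = c("yes", "no")
  ),
  # Simulated churn probabilities: higher on average for customers who churned
  .pred_yes = ifelse(churned == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4)),
  .pred_no = 1 - .pred_yes,
  # Hypothetical resample label, four folds of equal size
  fold = sample(rep(paste0("fold", 1:4), length.out = n))
)

churn
```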
Call roc_curve() to get the threshold sweep. The result is a tibble you can read row by row.
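A sketch of that call follows. It rebuilds a hypothetical churn frame first so the snippet runs on its own; the column names and simulated scores are assumptions, and only the name roc_pts is taken from the text.

```r
library(tibble)
library(yardstick)

# Hypothetical churn frame (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

# One row per threshold, with the specificity and sensitivity at that cutoff
roc_pts <- roc_curve(churn, churned, .pred_yes)

head(roc_pts)
```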
Pipe the result into autoplot() for a styled ggplot with the diagonal reference line drawn.
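As a sketch, again on a hypothetical churn frame with assumed column names, the autoplot() route looks like this. ggplot2 is attached because it provides the autoplot() generic.

```r
library(tibble)
library(yardstick)
library(ggplot2)  # provides the autoplot() generic

# Hypothetical churn frame (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

roc_pts <- roc_curve(churn, churned, .pred_yes)

# autoplot() returns an ordinary ggplot object, so it can be stored and extended
p <- autoplot(roc_pts)
p
```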
autoplot(roc_pts) + labs(title = "Churn classifier") lets you brand the chart without rebuilding it from scratch.
For full control, plot the raw points with ggplot. 1 - specificity is the false-positive rate; sensitivity is the true-positive rate.
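A hand-built version might look like the sketch below. The churn frame is hypothetical, and the dashed reference line, equal axes, and axis labels are styling choices added here, not something yardstick requires.

```r
library(tibble)
library(yardstick)
library(ggplot2)

# Hypothetical churn frame (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

roc_pts <- roc_curve(churn, churned, .pred_yes)

p <- ggplot(roc_pts, aes(x = 1 - specificity, y = sensitivity)) +
  geom_path() +                        # connects points in row order, tracing the sweep
  geom_abline(linetype = "dashed") +   # slope 1, intercept 0: the chance diagonal
  coord_equal() +
  labs(
    x = "False-positive rate (1 - specificity)",
    y = "True-positive rate (sensitivity)"
  )
p
```

geom_path() is used instead of geom_line() because it draws points in the order they appear in the tibble instead of sorting by x, which preserves the threshold sweep.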
Group by a resample column to plot one curve per fold on a single chart.
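The grouped version can be sketched as follows, assuming a hypothetical fold column that labels four resamples. The frame and all column names are illustrative.

```r
library(tibble)
library(dplyr)
library(yardstick)
library(ggplot2)

# Hypothetical churn frame with a resample label (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4)),
  fold = sample(rep(paste0("fold", 1:4), length.out = n))
)

# roc_curve() respects the grouping and returns one curve per fold
fold_curves <- churn |>
  group_by(fold) |>
  roc_curve(churned, .pred_yes)

p <- ggplot(fold_curves, aes(1 - specificity, sensitivity, colour = fold)) +
  geom_path() +
  geom_abline(linetype = "dashed")
p
```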
roc_curve() vs neighbouring ROC tools
Pick the function that matches what you want to see. The table below contrasts roc_curve() with the closest yardstick siblings.
| Function | Returns | When to use |
|---|---|---|
| roc_curve() | Tibble of thresholds with sensitivity and specificity | You want to plot or pick a cutoff |
| roc_auc() | One row with the area under the curve | You want a single ranking score |
| pr_curve() | Tibble of thresholds with precision and recall | Imbalanced data where precision matters |
| gain_curve() | Tibble of percent-found vs percent-tested | Marketing and lift charts |
| conf_mat() | A confusion matrix at one cutoff | You have already picked a threshold |
If you need both the chart and the headline number, compute roc_curve() for plotting and roc_auc() for the metric. They share the same probability inputs, so call them side by side without rebuilding the data.
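A minimal sketch of that side-by-side pattern, on a hypothetical churn frame with assumed column names:

```r
library(tibble)
library(yardstick)

# Hypothetical churn frame (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

# Same data, same truth column, same probability column
roc_pts <- roc_curve(churn, churned, .pred_yes)  # many rows, for the plot
auc <- roc_auc(churn, churned, .pred_yes)        # one row, for the headline number

auc
```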
Common pitfalls
Three mistakes account for most roc_curve() surprises.
First, passing event_level = "first" when the positive class is actually the second factor level inverts the curve below the diagonal. Always check levels(df$truth) and confirm which level the probability column refers to.
Second, feeding hard class predictions instead of probabilities fails with an error saying the estimate must be a numeric vector, not a factor (the exact wording varies by yardstick version). roc_curve() needs the raw probability columns named .pred_<class>, not the .pred_class argmax column.
Third, multiclass calls require one probability column per level, supplied in the same order as the factor levels of truth. Passing only the positive-class column with three or more factor levels raises an error about the number of columns not matching the number of levels. Use tidyselect (.pred_a:.pred_c or starts_with(".pred_")) when classes share a prefix, and confirm that the selected order matches levels(df$truth).
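The multiclass case can be sketched with hpc_cv, an example data set bundled with yardstick. Note that its probability columns are named VF, F, M, and L, with no .pred_ prefix, so the tidyselect range below differs from the .pred_a:.pred_c pattern in name only.

```r
library(dplyr)
library(yardstick)

data(hpc_cv)  # example data shipped with yardstick

# The probability columns must line up with these levels, in this order
levels(hpc_cv$obs)

# A tidyselect range picks up one probability column per class
multi_pts <- roc_curve(hpc_cv, obs, VF:L)

# One one-vs-all curve per class, identified by the .level column
count(multi_pts, .level)
```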
Try it yourself
Try it: Build the ROC curve for the churn tibble above and find the threshold whose sensitivity is closest to 0.8. Save that threshold to ex_thr.
Explanation: roc_curve() returns every threshold the classifier produced, so picking a cutoff is a dplyr filter on the curve tibble. slice_min(abs(sensitivity - 0.8)) keeps the row whose sensitivity is closest to the target.
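One way to write the solution described above is sketched here. The churn frame and its column names are assumptions that stand in for the tibble from the worked examples; only ex_thr is named by the exercise.

```r
library(tibble)
library(dplyr)
library(yardstick)

# Hypothetical churn frame (assumed names and simulated scores)
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4))
)

ex_thr <- roc_curve(churn, churned, .pred_yes) |>
  # Keep the single row whose sensitivity is nearest to 0.8
  slice_min(abs(sensitivity - 0.8), n = 1, with_ties = FALSE) |>
  pull(.threshold)

ex_thr
```

with_ties = FALSE guarantees a single threshold even if two rows are equally close to the target.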
Related yardstick functions
For comparison with Python, sklearn.metrics.roc_curve(y_true, y_score) returns three arrays; yardstick returns one tidy tibble with named columns instead.
- roc_auc() for the single-number ranking score the curve summarises.
- pr_curve() and pr_auc() for the precision-recall variant on imbalanced data.
- gain_curve() and lift_curve() for marketing-style cumulative charts.
- conf_mat() and accuracy() once you have picked a threshold from the curve.
- metric_set() to bundle roc_auc() with calibration metrics like brier_class().
See the yardstick reference for roc_curve() on tidymodels.org for the full argument list.
FAQ
What is the difference between roc_curve() and roc_auc() in yardstick?
roc_curve() returns a tibble of thresholds, sensitivity, and specificity, which is the data you plot. roc_auc() reduces that curve to one number, the area under it. Both consume the same inputs (a truth factor and a probability column), so you usually call them side by side: one for the chart, one for the leaderboard metric.
How do I plot a ROC curve from yardstick output?
Call autoplot() on the roc_curve() result for a ready-made ggplot with the diagonal reference line. For custom styling, pass the tibble to ggplot2 and map 1 - specificity to x and sensitivity to y, then layer geom_path(). The tidy output composes with facet_wrap() and group_by() without extra reshaping.
Does roc_curve() handle multiclass problems?
Yes. Pass one probability column per class via tidyselect (for example .pred_setosa:.pred_virginica). yardstick computes a one-vs-all curve per level and returns a tibble with an extra .level column. Pipe it into autoplot() to get one panel per class.
Why is my ROC curve below the diagonal?
The most common cause is event_level. yardstick defaults to "first", meaning the first factor level is the positive class. If your probability column predicts the second level, set event_level = "second". Otherwise the curve flips. Check levels(df$truth) against the .pred_* column name to verify.
Can roc_curve() use case weights?
Yes. Pass an unquoted column to case_weights and yardstick weights each row when accumulating sensitivity and specificity. This is useful for survey data or stratified sampling where rows represent different population shares.
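A minimal sketch of a weighted call, assuming a reasonably recent yardstick release that supports case_weights. The churn frame, the wt column, and the uniform weights are made up for illustration.

```r
library(tibble)
library(yardstick)

# Hypothetical churn frame with a made-up weight column
set.seed(42)
n <- 200
truth <- factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.3, 0.7)),
                levels = c("yes", "no"))
churn <- tibble(
  churned = truth,
  .pred_yes = ifelse(truth == "yes", rbeta(n, 4, 2), rbeta(n, 2, 4)),
  wt = runif(n, min = 0.5, max = 2)  # illustrative weights, e.g. survey weights
)

# Each row contributes to sensitivity and specificity in proportion to wt
roc_weighted <- roc_curve(churn, churned, .pred_yes, case_weights = wt)

head(roc_weighted)
```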