parsnip set_mode() in R: Set Regression or Classification
The parsnip set_mode() function declares whether an R model predicts a number or a category. You pass a model specification and the string "regression", "classification", or "censored regression", and parsnip wires up the correct engine and prediction behavior.
linear_reg() |> set_mode("regression") # numeric outcome
logistic_reg() |> set_mode("classification") # factor outcome
rand_forest() |> set_mode("classification") # forest classifier
decision_tree() |> set_mode("regression") # tree regressor
svm_rbf() |> set_mode("classification") # SVM classifier
boost_tree() |> set_mode("regression") # boosted regressor
proportional_hazards() |> set_mode("censored regression") # survival
Need explanation? Read on for examples and pitfalls.
What set_mode() does
set_mode() declares the prediction task. A parsnip model specification stays task-agnostic until you tell it what kind of outcome it predicts. set_mode() supplies that with one of three strings: "regression" for a numeric outcome, "classification" for a factor outcome, and "censored regression" for survival times.
parsnip needs the mode because one model function can solve different problems. rand_forest() builds a regression forest or a classification forest depending on the mode. Without it, parsnip cannot choose the engine's prediction routine or validate your outcome column.
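A minimal sketch of attaching a mode, assuming parsnip is installed and R 4.1 or later is available for the native pipe. The object name rf_spec is illustrative only.

```r
library(parsnip)

# rand_forest() starts task-agnostic: its mode is "unknown"
rf_spec <- rand_forest()
rf_spec$mode

# set_mode() attaches the prediction task to the specification
rf_spec <- rf_spec |> set_mode("regression")

# Printing shows the mode in parentheses after the model type
rf_spec
```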
The printed (regression) confirms the mode is attached. The specification is still just a recipe for a model; no data has touched it yet.
set_mode() syntax and arguments
set_mode() takes a specification and a mode string. The function signature is short:
set_mode(object, mode, ...)
| Argument | Description |
|---|---|
| object | A parsnip model specification, such as rand_forest() or logistic_reg(). |
| mode | One of "regression", "classification", or "censored regression". |
| ... | Reserved for future use; leave it empty. |
set_mode() returns an updated specification, so it chains cleanly with the native pipe. Models that support only one task carry a default mode, but stating it explicitly keeps a script readable.
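As a small illustration of chaining with the native pipe, the sketch below states the mode explicitly on a single-task model. The name clf_spec is an illustrative choice.

```r
library(parsnip)

# logistic_reg() already defaults to classification;
# the explicit set_mode() call documents the intent
clf_spec <- logistic_reg() |>
  set_mode("classification")

# The stored mode can be read directly from the specification
clf_spec$mode
```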
logistic_reg() is classification-only, so this call is documentation rather than a requirement. For a dual-mode model the call is mandatory.
Setting the mode: four examples
Most models need an explicit mode. These examples cover the common cases you meet when building a tidymodels pipeline.
One model, two modes
One model function covers both tasks. rand_forest() becomes a regressor or a classifier purely through set_mode(), with every other argument unchanged.
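A sketch of the same model function used for both tasks. The trees = 500 value and the object names are illustrative assumptions.

```r
library(parsnip)

# Same function and same arguments; only the mode differs
rf_reg <- rand_forest(trees = 500) |> set_mode("regression")
rf_cls <- rand_forest(trees = 500) |> set_mode("classification")

rf_reg$mode
rf_cls$mode
```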
Mode before or after the engine
Order does not matter. parsnip stores the mode and the engine in independent slots, so both pipelines below produce an identical specification.
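One way to check this is to build the specification in both orders and compare the results. The sketch below assumes the ranger engine; setting an engine does not train anything, so the comparison runs without data.

```r
library(parsnip)

# Mode first, then engine
mode_first <- rand_forest() |>
  set_mode("regression") |>
  set_engine("ranger")

# Engine first, then mode
engine_first <- rand_forest() |>
  set_engine("ranger") |>
  set_mode("regression")

# Compare the whole objects, then the two slots individually
identical(mode_first, engine_first)
identical(mode_first$mode, engine_first$mode)
identical(mode_first$engine, engine_first$engine)
```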
A complete fit
Once the mode is set, fit() can train the model. Here a linear model predicts mpg from the mtcars dataset.
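A minimal end-to-end sketch, assuming the lm engine. The predictors wt and hp are an illustrative choice; the text above only specifies mpg as the outcome.

```r
library(parsnip)

# Specification: model type, mode, and engine; no data involved yet
lm_spec <- linear_reg() |>
  set_mode("regression") |>
  set_engine("lm")

# fit() trains the specification on mtcars
lm_fit <- lm_spec |>
  fit(mpg ~ wt + hp, data = mtcars)

# Regression mode yields numeric predictions in a .pred column
preds <- predict(lm_fit, new_data = head(mtcars, 3))
preds
```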
Censored regression for survival models
Survival models use a third mode. proportional_hazards() and survival_reg() take the "censored regression" mode. The string differs, but the set_mode() call is identical in shape to the examples above.
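A sketch of the survival case. It only builds the specifications. Fitting them needs engines from the censored extension package, which this example assumes you would load separately; without it, printing the specification may show a message pointing to that package.

```r
library(parsnip)

# Both survival model functions take the "censored regression" mode
ph_spec <- proportional_hazards() |>
  set_mode("censored regression")

sr_spec <- survival_reg() |>
  set_mode("censored regression")

ph_spec$mode
sr_spec$mode
```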
Tip: Run show_engines("rand_forest") to list every engine and mode pair a model offers. The mode column tells you exactly which strings set_mode() will accept.
When set_mode() is required vs optional
Single-task models default the mode; flexible models do not. Regression-only and classification-only models ship a built-in default, so set_mode() is optional for them. Flexible models leave the mode as "unknown" until you set it, and calling fit() on an unknown-mode specification raises an error before any computation starts.
| Model function | Modes supported | set_mode() needed? |
|---|---|---|
| linear_reg() | regression | Optional (default) |
| logistic_reg() | classification | Optional (default) |
| rand_forest() | regression, classification, censored regression | Required |
| decision_tree() | regression, classification, censored regression | Required |
| boost_tree() | regression, classification, censored regression | Required |
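The defaults can be checked directly by reading the mode slot of a fresh specification. This is a sketch; the engines listed by show_engines() depend on which extension packages are loaded.

```r
library(parsnip)

# Single-task models carry a default mode
linear_reg()$mode
logistic_reg()$mode

# Flexible models start as "unknown" until set_mode() is called
rand_forest()$mode
boost_tree()$mode

# Engine and mode pairs currently registered for rand_forest()
rf_engines <- show_engines("rand_forest")
rf_engines
```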
Note: scikit-learn users choose between separate classes, RandomForestRegressor or RandomForestClassifier. parsnip instead uses one model function and switches behavior through set_mode().
Common pitfalls
Three mistakes account for most set_mode() errors. Each one fails loudly, which makes it quick to diagnose once you know the pattern.
- Misspelling the mode string. parsnip accepts only three exact strings, so a typo such as "classifcation" triggers an immediate error.
- Choosing a mode the model cannot do. linear_reg() has no classification mode, so requesting one fails.
- Skipping set_mode() on a flexible model. Fitting a rand_forest() spec with an unknown mode stops before training begins.
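The three pitfalls above can be reproduced safely by catching each error. The helper error_message() is hypothetical, written for this sketch and not part of parsnip, and the exact wording of the messages varies between parsnip versions.

```r
library(parsnip)

# Hypothetical helper: returns the error message, or NA if no error occurs
error_message <- function(expr) {
  tryCatch(
    {
      force(expr)
      NA_character_
    },
    error = function(e) conditionMessage(e)
  )
}

# 1. Misspelled mode string
typo_msg <- error_message(rand_forest() |> set_mode("classifcation"))

# 2. A mode the model does not support
wrong_mode_msg <- error_message(linear_reg() |> set_mode("classification"))

# 3. Fitting a flexible model whose mode is still "unknown"
no_mode_msg <- error_message(fit(rand_forest(), mpg ~ ., data = mtcars))

c(typo_msg, wrong_mode_msg, no_mode_msg)
```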
Try it yourself
Try it: Build a classification decision tree specification for the iris dataset, set its mode, and confirm the mode prints correctly. Save the spec to ex_spec.
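One possible solution to the exercise, as a sketch. The specification does not touch iris yet; the data only matters once you call fit().

```r
library(parsnip)

# Build the decision tree specification and set the task
ex_spec <- decision_tree() |>
  set_mode("classification")

# Printing shows the mode in parentheses
ex_spec
```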
Explanation: decision_tree() supports more than one mode, so set_mode("classification") is required. The printed (classification) confirms parsnip will build a classifier when you fit it.
Related parsnip functions
set_mode() is one step in building a model specification. These functions complete the workflow:
- set_engine() selects the computational backend, such as ranger or glm.
- set_args() sets or updates model hyperparameters after the spec exists.
- fit() trains the specification on a data frame.
- translate() shows the exact engine call parsnip will run.
- show_engines() lists the engine and mode pairs a model supports.
See the official parsnip documentation for the full reference.
FAQ
What is the default mode in parsnip? There is no single default. Single-task models carry their own: linear_reg() defaults to regression and logistic_reg() defaults to classification. Flexible models such as rand_forest() and boost_tree() start with an "unknown" mode and have no default at all, so set_mode() is mandatory for them before fitting. Printing any specification shows its current mode in parentheses, so you can always check what is set.
Do I need set_mode() for linear_reg() and logistic_reg()? Not strictly. Both are single-task models with a built-in default mode, so they fit without an explicit set_mode() call. Many practitioners still add it for clarity, because a reader scanning the pipeline sees the prediction task immediately. It also protects the script if you later swap in a flexible model that does require the mode.
Can I change the mode after setting it? Yes. set_mode() overwrites whatever mode the specification currently holds, so calling it twice keeps the last value. Because a specification is just an object, reassigning it with a new set_mode() call costs nothing until you fit. Nothing is computed, so there is no penalty for changing your mind while building the pipeline.
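A short sketch of overwriting the mode; the object name spec is illustrative.

```r
library(parsnip)

# Start as a regression forest
spec <- rand_forest() |> set_mode("regression")

# A second set_mode() call replaces the stored mode
spec <- spec |> set_mode("classification")

spec$mode
```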
Does the order of set_mode() and set_engine() matter? No. parsnip stores the mode and the engine as independent slots on the specification, then merges them when you fit. You can call set_mode() before or after set_engine() and get an identical object, as identical() confirms. Pick whichever order reads more naturally in your pipeline.