parsnip mlp() in R: Single-Layer Neural Network Spec
The parsnip mlp() function defines a multilayer perceptron, a neural network with a single hidden layer, for classification or regression in tidymodels. It gives you one interface to a feed-forward net with a hidden layer of units, fitted by default with the nnet engine.
```r
mlp()                                 # default spec, nnet engine
mlp() |> set_mode("classification")   # classify a factor outcome
mlp() |> set_mode("regression")       # predict a numeric outcome
mlp(hidden_units = 5)                 # 5 units in the hidden layer
mlp(penalty = 0.01)                   # weight decay to curb overfitting
mlp(epochs = 100)                     # number of training iterations
fit(spec, Species ~ ., data = iris)   # train the spec on data
```
Need explanation? Read on for examples and pitfalls.
What mlp() does
mlp() is a model specification, not a fitted model. It records your intent to build a multilayer perceptron and the hyperparameters you want, but no data touches it until you call fit(). That separation lets you reuse one specification across many datasets or resampling folds.
A multilayer perceptron is a feed-forward neural network. The parsnip mlp() function builds the single-hidden-layer version: inputs feed a layer of hidden units, and those units feed the output. The hidden layer is what lets the model learn non-linear patterns that a plain linear model cannot.
The function belongs to the tidymodels framework. Because parsnip standardizes the interface, the same mlp() code drops straight into a workflow() or a tune_grid() call without rewriting.
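The split between specification and fit can be sketched as follows. This is a minimal illustration, assuming the parsnip and nnet packages are installed; the hyperparameter values, the seed, and the object names are arbitrary choices for the example.

```r
library(parsnip)

# A specification only records intent; no data is involved yet.
spec <- mlp(hidden_units = 3, epochs = 50) |>
  set_engine("nnet") |>
  set_mode("classification")

class(spec)  # a model specification, not a fitted model

# The same spec can be fitted to different data without being modified.
set.seed(1)  # illustrative seed
fit_all  <- fit(spec, Species ~ ., data = iris)
fit_half <- fit(spec, Species ~ ., data = iris[seq(1, 150, by = 2), ])

class(fit_all)  # a fitted model object
```

Because spec itself never changes, it can be handed to each resampling fold in turn.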
mlp() only describes a network; fit() turns it into a trained network. Keeping those two steps apart is what makes tidymodels workflows reproducible across resamples.
The default engine is nnet, a recommended R package that is usually already installed. Other engines such as brulee, keras, and h2o need their own packages before you can fit() with them.
mlp() syntax and arguments
mlp() takes up to six hyperparameters plus two setup verbs. The arguments shape the network, while set_engine() and set_mode() finish the specification.
The hidden_units argument sets how many neurons live in the hidden layer, where more units add capacity but risk overfitting. The penalty argument applies weight decay, a regularization term that shrinks weights toward zero. The epochs argument caps how many passes the optimizer makes over the data.
Not every argument applies to every engine. The default nnet engine uses hidden_units, penalty, and epochs, but ignores dropout and learn_rate, which belong to the brulee and keras engines. Setting an unused argument is harmless: parsnip simply drops it when it builds the engine call.
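One way to see how the arguments reach the engine is translate(), which shows the call that fit() would build. A small sketch, assuming the nnet engine and illustrative hyperparameter values:

```r
library(parsnip)

spec <- mlp(hidden_units = 5, penalty = 0.01, epochs = 100) |>
  set_engine("nnet") |>
  set_mode("classification")

# translate() prints the engine-level call template.
# For nnet, hidden_units maps to size, penalty to decay, and epochs to maxit.
translated <- translate(spec)
translated
```

Checking the translated call is a quick way to confirm that an argument you set is actually used by the engine you picked.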
Fit a neural network: four examples
Every example below uses a built-in R dataset. The iris data drives the classification examples and mtcars drives the regression example, so the code runs anywhere with no downloads.
Example 1: Classify iris with the nnet engine
Build the specification, then fit it to data. The nnet engine trains a small network to separate the three iris species.
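A minimal sketch of this fit, assuming five hidden units and 100 epochs to match the network described in this example. The penalty value, the seed, and the object names are illustrative choices.

```r
library(parsnip)

set.seed(123)  # nnet starts from random weights

nn_spec <- mlp(hidden_units = 5, penalty = 0.01, epochs = 100) |>
  set_engine("nnet") |>
  set_mode("classification")

# Train on all four measurements to predict Species.
nn_fit <- fit(nn_spec, Species ~ ., data = iris)

# Printing the fit shows the network shape and the number of weights.
nn_fit
```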
The 4-5-3 network line confirms the shape: four input predictors, five hidden units, three output classes. The 43 weights are the parameters the optimizer learned during the 100 epochs.
Example 2: Predict classes and probabilities
predict() returns a tidy tibble with one row per input row. Use type = "prob" to get per-class probabilities instead of the hard label.
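The two prediction types can be sketched like this, assuming a small nnet classifier fitted on iris. The chosen rows, the seed, and the hyperparameters are illustrative.

```r
library(parsnip)

set.seed(123)
nn_fit <- mlp(hidden_units = 5, epochs = 100) |>
  set_engine("nnet") |>
  set_mode("classification") |>
  fit(Species ~ ., data = iris)

# One row from each species block of iris.
new_rows <- iris[c(1, 51, 101), ]

# Hard class labels, returned in a .pred_class column.
labels <- predict(nn_fit, new_data = new_rows)
labels

# Per-class probabilities, one .pred_<class> column per species.
probs <- predict(nn_fit, new_data = new_rows, type = "prob")
probs
```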
The probability columns are named .pred_<class> and each row sums to one. They come from the softmax output layer, useful for ranking predictions or applying a custom decision threshold.
Example 3: Fit a neural network regression on mtcars
Switch the mode to "regression" and the same function predicts a number. Nothing else about the call structure changes.
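A minimal regression sketch on mtcars. The choice of predictors (wt, hp, disp), the hyperparameter values, and the seed are assumptions made for illustration only.

```r
library(parsnip)

set.seed(123)
reg_fit <- mlp(hidden_units = 3, penalty = 0.1, epochs = 200) |>
  set_engine("nnet") |>
  set_mode("regression") |>
  fit(mpg ~ wt + hp + disp, data = mtcars)

# Numeric predictions come back in a single .pred column.
reg_pred <- predict(reg_fit, new_data = head(mtcars, 3))
reg_pred
```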
A regression network outputs a single numeric value per row in the .pred column. With raw mtcars predictors, keep hidden_units small so the tiny dataset does not overfit.
Not sure how many hidden units to use? Set hidden_units = tune() and let tune_grid() score a range on resamples. A small grid of 1 to 10 hidden units is a sensible start.
Example 4: Mark hyperparameters for tuning
Pass tune() instead of a value to defer a hyperparameter. The specification stays valid and prints the placeholders so you can confirm what will be searched.
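A short sketch of such a specification, assuming the tune package is installed to supply tune(). The fixed epochs value is an illustrative choice.

```r
library(parsnip)
library(tune)  # provides the tune() placeholder

tune_spec <- mlp(hidden_units = tune(), penalty = tune(), epochs = 100) |>
  set_engine("nnet") |>
  set_mode("classification")

# Printing shows tune() in place of concrete values.
tune_spec
```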
This specification is not fitted yet. You hand it to tune_grid() with a resampling object such as vfold_cv(), and the framework fills in the best hidden_units and penalty from cross-validation.
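The search itself might look like the sketch below. It assumes the full tidymodels set of packages is installed; the grid values, the number of folds, the seed, and the choice of accuracy as the selection metric are all illustrative.

```r
library(tidymodels)

set.seed(42)
folds <- vfold_cv(iris, v = 5)  # 5-fold cross-validation

tune_spec <- mlp(hidden_units = tune(), penalty = tune(), epochs = 100) |>
  set_engine("nnet") |>
  set_mode("classification")

# A small hand-written grid: 3 network sizes x 2 penalties.
grid <- expand.grid(hidden_units = c(1, 5, 10), penalty = c(0.001, 0.1))

# Fit every grid combination on every fold and collect metrics.
res <- tune_grid(tune_spec, Species ~ ., resamples = folds, grid = grid)

show_best(res, metric = "accuracy")

# Pick the top combination and lock it into the specification.
best <- select_best(res, metric = "accuracy")
final_spec <- finalize_model(tune_spec, best)
final_spec
```

The finalized specification has concrete values in place of tune() and can be passed to fit() like any other spec.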
Compare mlp() engines
The engine decides which backend trains the network. All engines share the mlp() interface, so you swap them with one set_engine() call and keep the rest of the specification.
| Engine | Backend | Use when |
|---|---|---|
| nnet | Base recommended package | You want a fast, dependency-free default |
| brulee | Torch via the brulee package | You need dropout, learn_rate, and GPU support |
| keras | TensorFlow through keras | You already work in the Keras ecosystem |
| h2o | The h2o cluster engine | You train large data on an h2o backend |
The decision rule is simple. Start with nnet for a quick baseline, move to brulee when you need modern training controls like dropout, and reach for keras or h2o only when an existing stack or data size demands it.
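Swapping engines can be sketched as below. Only the specifications are built here, so the brulee package is not needed until fit() is called; the hyperparameter values and object names are illustrative.

```r
library(parsnip)

# Shared settings, with no engine chosen yet beyond the default.
base_spec <- mlp(hidden_units = 8, epochs = 50) |>
  set_mode("classification")

# One set_engine() call switches the backend.
nnet_spec   <- base_spec |> set_engine("nnet")
brulee_spec <- base_spec |> set_engine("brulee")  # fitting requires brulee and torch

nnet_spec$engine
brulee_spec$engine

# Engines currently registered for mlp() in this session.
engines <- show_engines("mlp")
engines
```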
Common pitfalls
Three mistakes catch most newcomers to mlp(). Each one below shows the problem and the fix.
The most common is forgetting to set the mode. A neural network can classify or predict a number, so parsnip cannot guess which one you want and fit() fails until you call set_mode().
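A small sketch of the problem and the fix. The error is captured with tryCatch() so the script keeps running; the object names and hyperparameter value are illustrative.

```r
library(parsnip)

no_mode <- mlp(hidden_units = 3) |> set_engine("nnet")

# Without a mode, fit() stops with an error; capture it to inspect the message.
res <- tryCatch(
  fit(no_mode, Species ~ ., data = iris),
  error = function(e) e
)
inherits(res, "error")
conditionMessage(res)

# The fix: declare the mode before fitting.
set.seed(1)
ok_fit <- no_mode |>
  set_mode("classification") |>
  fit(Species ~ ., data = iris)
```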
The second pitfall is leaving predictors on different scales. A neural network is sensitive to input magnitude, so a wide-range column dominates training. In a workflow(), add step_normalize() to a recipe so every predictor is centered and scaled. The third is asking the nnet engine for too many weights: nnet caps the network size (1,000 weights by default) and stops with an error of the form "too many (N) weights" when hidden_units is large. Lower hidden_units, or raise the cap by passing MaxNWts through set_engine().
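Both fixes can be combined in one workflow, sketched here under the assumption that the tidymodels packages are installed. The MaxNWts value, the hyperparameters, and the seed are illustrative.

```r
library(tidymodels)

# Center and scale every numeric predictor before it reaches the network.
rec <- recipe(Species ~ ., data = iris) |>
  step_normalize(all_numeric_predictors())

# MaxNWts is an nnet::nnet() argument, passed through as an engine argument.
spec <- mlp(hidden_units = 5, epochs = 100) |>
  set_engine("nnet", MaxNWts = 2000) |>
  set_mode("classification")

wf <- workflow() |>
  add_recipe(rec) |>
  add_model(spec)

set.seed(1)
wf_fit <- fit(wf, data = iris)

# The recipe is re-applied to new data automatically at prediction time.
wf_pred <- predict(wf_fit, new_data = head(iris))
wf_pred
```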
Neural networks start from random weights, so two fits of the same mlp() specification can disagree. Call set.seed() before fit() to make the network reproducible.
Try it yourself
Try it: Fit a classification network on iris with 4 hidden units and penalty = 0.05, then predict the class for the 120th row. Save the prediction to ex_pred.
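One possible solution, sketched with the nnet engine. The seed and the name ex_fit are illustrative, and the predicted label can vary with the random starting weights.

```r
library(parsnip)

set.seed(2024)  # illustrative seed
ex_fit <- mlp(hidden_units = 4, penalty = 0.05) |>
  set_engine("nnet") |>
  set_mode("classification") |>
  fit(Species ~ ., data = iris)

# Predict the class for row 120 only.
ex_pred <- predict(ex_fit, new_data = iris[120, ])
ex_pred
```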
Explanation: The hidden_units argument sizes the hidden layer at 4, and set_mode("classification") tells parsnip to predict the Species factor. Row 120 of iris is a virginica flower, so a well-fitted network should label it virginica, although the exact result depends on the random starting weights.
Related parsnip functions
mlp() works alongside the rest of the parsnip model family. These functions cover the neighboring tasks in a tidymodels project.
- rand_forest() defines a random forest ensemble of decision trees.
- boost_tree() defines a gradient-boosted tree model.
- logistic_reg() defines a logistic regression classifier baseline.
- set_engine() chooses the computational backend for any specification.
- fit() trains a specification on data and returns a model object.
FAQ
What package is mlp() in?
mlp() ships in core parsnip, so library(tidymodels) or library(parsnip) makes it available. The function only describes the network, though, and the actual fitting happens in an engine package. The default engine is nnet, a recommended R package that is usually installed already, so most users can fit an mlp() specification without installing anything extra.
What is the default engine for mlp()?
The default engine is nnet, which fits a single-hidden-layer neural network through the nnet package, a recommended package that ships with most R installations. Name it explicitly with set_engine("nnet") for clarity. To see every registered backend, run show_engines("mlp"), which lists nnet, brulee, and keras; engines that live in extension packages, such as h2o, appear once that package is loaded. Pick nnet for a fast baseline and switch only when you need a specific feature.
How many hidden layers does mlp() support?
The parsnip mlp() function builds a single hidden layer, which is what makes it a classic multilayer perceptron. For two hidden layers, parsnip provides mlp(engine = "brulee_two_layer"). Deeper architectures are outside the parsnip interface and call for keras or torch directly.
How do I tune the network size in mlp()?
Set hidden_units = tune() and penalty = tune() in the specification, then pass it to tune_grid() with a resampling object such as vfold_cv(). The framework scores each combination with cross-validation. Use select_best() to pick the winner, then finalize_workflow() to lock the values before the final fit.
Why do my mlp() predictions change every run?
A neural network starts from random weights, so each fit can land on a different solution. Call set.seed() immediately before fit() to make the result reproducible. Inside a tuning workflow, the resampling splits are drawn at random too, so set the seed once at the top of the script for consistent runs.
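The effect of the seed can be checked directly, as in this sketch with the nnet engine. The seed value and hyperparameters are arbitrary, and the comparison reads the weight vector that nnet stores on the fitted object.

```r
library(parsnip)

spec <- mlp(hidden_units = 3, epochs = 50) |>
  set_engine("nnet") |>
  set_mode("classification")

set.seed(99)
fit_a <- fit(spec, Species ~ ., data = iris)

set.seed(99)
fit_b <- fit(spec, Species ~ ., data = iris)

# Same seed, same starting weights, so the fitted weights should match.
same_weights <- isTRUE(all.equal(fit_a$fit$wts, fit_b$fit$wts))
same_weights
```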
For the full argument reference, see the parsnip mlp() documentation.