caret bagEarth() in R: Bagged MARS Ensemble Models
The bagEarth() function in caret fits a bagged ensemble of MARS models by drawing many bootstrap samples, fitting earth::earth() on each, and averaging the predictions. It stabilizes a learner that is otherwise sensitive to the exact training rows, and exposes a familiar formula and matrix interface with full predict() support.
bagEarth(mpg ~ ., data = mtcars)                      # formula interface
bagEarth(x = mtcars[, -1], y = mtcars$mpg)            # x, y interface
bagEarth(mpg ~ ., data = mtcars, B = 100)             # 100 bootstrap samples
bagEarth(mpg ~ ., data = mtcars, keepX = FALSE)       # smaller model object
predict(fit, newdata = mtcars[1:5, ])                 # averaged prediction
train(mpg ~ ., method = "bagEarth", data = mtcars)    # CV-tuned via caret
varImp(fit)                                           # importance averaged over the bag
Need explanation? Read on for examples and pitfalls.
What bagEarth() does in one sentence
bagEarth() is caret's bootstrap aggregator for the earth (MARS) model. You hand it a formula or x and y, set B (the number of bootstrap samples, default 50), and caret resamples the training rows B times, fits an earth model on each resample, and stores every fitted model inside a single bagEarth object. Calling predict() later averages predictions across the bag, which absorbs the variance MARS picks up from sensitive knot placement and typically beats a single MARS fit on noisy data.
bagEarth() syntax and arguments
Two equivalent entry points cover formula and matrix workflows. caret mirrors the lm() and glm() API, then layers the bagging parameter on top.
Formula form:
bagEarth(formula, data, B = 50, summary = mean, keepX = TRUE, ...)
Matrix form:
bagEarth(x, y, weights = NULL, B = 50, summary = mean, keepX = TRUE, ...)
- formula: a formula like mpg ~ ., with a numeric outcome for regression or a factor for two-class classification.
- data: a data frame holding the columns named in the formula.
- x, y: matrix or data frame of predictors, plus a numeric or factor outcome vector.
- B: number of bootstrap samples drawn from the training rows. Default 50.
- summary: function used to combine predictions across the B models. Default mean.
- keepX: keep the original training predictors inside the fitted object. Default TRUE; set FALSE for smaller objects.
- ...: forwarded to the underlying earth fit (degree, nprune, etc.).
bagEarth() accepts every earth() tuning knob. Pass degree = 2 to allow two-way interactions, or nprune = 15 to cap the number of basis functions per bootstrap fit. The same hyperparameters apply uniformly to every model in the bag.
bagEarth() examples by use case
1. Fit a bagged regression model on mtcars
The shortest call uses every column to predict mpg. The fitted object stores all 50 underlying earth models plus the bootstrap row indices.
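A minimal sketch of that call, assuming the caret and earth packages are installed. The object name fit and the seed value are illustrative choices, not part of the original example.

```r
library(caret)   # bagEarth() lives in caret; the earth package must also be installed

set.seed(42)     # bootstrap resampling is random, so fix the seed for repeatability

# Formula interface: every other mtcars column predicts mpg, default B = 50
fit <- bagEarth(mpg ~ ., data = mtcars)

fit$B            # number of bootstrap samples used
length(fit$fit)  # the individual earth models are stored in fit$fit
```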
Each model has its own selected basis functions, so different bootstraps may keep different predictors, and the averaged prediction inherits stability from this diversity. Saving the object with saveRDS() is enough to score new data later.
2. Score new data with predict()
predict() returns the ensemble average across all bootstrap models for regression.
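The scoring step might look like the sketch below, which refits the model so the block runs on its own. The seed and the names fit and pred are assumptions made for illustration.

```r
library(caret)

set.seed(42)
fit <- bagEarth(mpg ~ ., data = mtcars)

# Averaged prediction across the bag for the first five cars
pred <- as.numeric(predict(fit, newdata = mtcars[1:5, ]))

# Side-by-side view of predicted and observed mpg
data.frame(predicted = pred, actual = mtcars$mpg[1:5])
```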
Compare the predictions with the actual mtcars$mpg[1:5] values (21.0, 21.0, 22.8, 21.4, 18.7) to see how closely the ensemble tracks the truth. To inspect individual bootstrap predictions, look at fit$fit and apply predict() to each member.
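To make the per-member inspection concrete, here is a hedged sketch that uses the x, y interface and a smaller bag (B = 25, chosen only to keep the run short). The names new_x and member_pred are hypothetical.

```r
library(caret)

set.seed(42)
fit   <- bagEarth(x = mtcars[, -1], y = mtcars$mpg, B = 25)
new_x <- mtcars[1:5, -1]

# One column of predictions per bootstrap earth model
member_pred <- sapply(fit$fit, function(m) as.numeric(predict(m, newdata = new_x)))

dim(member_pred)           # rows = observations, columns = bag members
apply(member_pred, 1, sd)  # spread across the bag, per observation
rowMeans(member_pred)      # manual average, to set beside predict(fit, newdata = new_x)
```

The per-row standard deviation gives a rough sense of how much the individual MARS fits disagree before averaging.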
3. Increase B for a smoother ensemble
A larger bag reduces Monte Carlo noise in the averaged prediction. The marginal gain shrinks quickly: 100 to 200 is usually enough for tabular regression.
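One way to see the effect of B is to compute the training MSE for two bag sizes, as sketched below. The helper train_mse() and the chosen values of B are illustrative assumptions, and no particular numbers are implied.

```r
library(caret)

x <- mtcars[, -1]
y <- mtcars$mpg

# Hypothetical helper: training MSE of a bagged fit with b bootstrap samples
train_mse <- function(b) {
  set.seed(42)
  fit <- bagEarth(x = x, y = y, B = b)
  mean((y - as.numeric(predict(fit, newdata = x)))^2)
}

mse_by_B <- sapply(c(B25 = 25, B100 = 100), train_mse)
mse_by_B
```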
Training MSE tends to drop slightly as B grows because averaging over more bootstrap fits smooths out the quirks of any single resample. For honest evaluation always use a holdout split or train() with cross-validation, never the training set itself.
4. Hold out a test set and check RMSE
The mtcars sample is tiny, but the same pattern applies to any tabular regression task.
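A sketch of the holdout pattern, assuming a 75/25 split made with caret's createDataPartition(). The split proportion, seed, and object names are illustrative.

```r
library(caret)

set.seed(42)

# Split on the outcome so train and test cover a similar mpg range
idx      <- createDataPartition(mtcars$mpg, p = 0.75, list = FALSE)
train_df <- mtcars[idx, ]
test_df  <- mtcars[-idx, ]

fit <- bagEarth(mpg ~ ., data = train_df, B = 50)

pred      <- as.numeric(predict(fit, newdata = test_df))
test_rmse <- RMSE(pred, test_df$mpg)  # caret's root-mean-squared-error helper
test_rmse
```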
A test-set RMSE around 3 mpg on 32 rows is the noise floor. The real value of bagEarth shows on hundreds to thousands of rows where a single MARS fit becomes unstable; swap bagEarth for earth::earth on the same split to see the variance bagging absorbed.
5. Tune through caret train()
For grid search over degree and nprune, hand bagEarth to train() and let caret cross-validate.
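The search could be set up roughly as follows. The grid values, the 5-fold control, and B = 10 are assumptions picked to keep the run cheap, not recommended settings.

```r
library(caret)

set.seed(42)

# Candidate hyperparameters; these particular values are illustrative
grid <- expand.grid(degree = 1:2, nprune = c(5, 10))

ctrl <- trainControl(method = "cv", number = 5)

# B is forwarded to the bagged fit; kept small here to limit search cost
tuned <- train(mpg ~ ., data = mtcars,
               method    = "bagEarth",
               tuneGrid  = grid,
               trControl = ctrl,
               B         = 10)

tuned$bestTune  # winning degree / nprune combination
```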
train() refits the bagged ensemble once per hyperparameter combination, so total cost is nrow(grid) * folds * B earth fits. Cap B modestly during search, then refit with a larger B at the winning hyperparameters.
Settle degree and nprune first; then bag.
bagEarth() vs other ensembles
bagEarth() is the right pick when MARS is already a sensible base learner. Other ensembles target different biases.
| Function | Base learner | Resampling | When to use |
|---|---|---|---|
| bagEarth() (caret) | earth/MARS | bootstrap, B copies | smooth nonlinear regression with noisy training rows |
| earth::earth() | single MARS | none | quick interpretable fit, no ensemble cost |
| ipred::bagging() | rpart tree | bootstrap | piecewise constant fit, simple step relationships |
| randomForest() | decision tree | bootstrap plus column subsampling | tabular default, handles interactions |
| gbm::gbm() | trees | sequential boosting | small base learners, careful tuning, top accuracy |
For the full caret method registry, see the caret reference.
Common pitfalls
Pitfall 1: forgetting the earth package. caret does not install earth for you; it only loads it on first use. Run install.packages("earth") once so that bagEarth() can find it, otherwise the call stops with a missing-package error.
Pitfall 2: B too low. With B = 10, predictions still wobble because the bootstrap average has not stabilized. The default is 50; scale to 100-200 when time allows.
Pitfall 3: ignoring the size of the model object. Each bootstrap stores its own earth fit, and by default the object also keeps a copy of the training predictors. For 100 bootstraps on a 50,000-row dataset, the saved object can run into gigabytes. Set keepX = FALSE for production deployments.
Pitfall 4: bagging a misconfigured earth model. Bagging cannot fix a wrong degree. Validate a single earth::earth() fit before wrapping it.
predict() does not accept a bare numeric vector. Pass a data frame or matrix with the same column structure as the training data. A single new observation must be wrapped as a one-row data frame, not a numeric vector, or the call will silently misalign predictors.
Try it yourself
Try it: Fit a bagged MARS model on mtcars with B = 80, predict mpg for the first three rows, and compute the residuals. Save the predictions to ex_pred and residuals to ex_resid.
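One possible solution is sketched here, assuming caret and earth are installed. The seed and the intermediate name ex_fit are illustrative; only ex_pred and ex_resid are required by the exercise.

```r
library(caret)

set.seed(42)

# Bagged MARS model with 80 bootstrap samples
ex_fit <- bagEarth(mpg ~ ., data = mtcars, B = 80)

# Averaged predictions for the first three rows
ex_pred  <- as.numeric(predict(ex_fit, newdata = mtcars[1:3, ]))

# Residuals: observed minus predicted
ex_resid <- mtcars$mpg[1:3] - ex_pred

ex_pred
ex_resid
```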
Explanation: predict() averages over the 80 bootstrap earth fits. Subtracting the predictions from the true mpg values gives per-row residuals; small residuals confirm the ensemble fits the training rows well.
Related caret functions
These complete a typical bagEarth workflow:
- train() with method = "bagEarth": cross-validated hyperparameter search
- varImp(): variable importance averaged across the bagged earth models
- createDataPartition(): stratified train and test split before fitting
- preProcess(): center, scale, or impute predictors prior to bagging
- bagFDA(): caret's bagged flexible discriminant analysis cousin
FAQ
What is bagEarth in caret used for?
bagEarth() fits a bagged ensemble of MARS models for regression. It resamples the training rows B times, fits an earth model on each bootstrap, and averages the predictions. The averaging stabilizes the high-variance MARS estimator, which is sensitive to knot selection. Use it when a single earth fit looks promising but predictions shift noticeably across reruns or holdout splits.
How many bootstrap samples should I use for bagEarth?
Start with the default B = 50, then scale to 100 or 200 if predictions are still noisy across reruns. Marginal gain decays after about 100 bootstraps for most tabular regression problems, while model size and prediction time scale linearly with B. Sweep B = 25, 50, 100, 200 once and pick the smallest value where validation error plateaus.
How is bagEarth different from random forest?
Both are bootstrap ensembles, but the base learner differs. bagEarth() bags MARS models, which fit smooth piecewise-linear functions with selected knots. randomForest() bags decision trees with random column subsampling at each split. Forests handle high-dimensional inputs and categorical splits naturally; bagEarth is better for smooth nonlinear regression on continuous predictors.
Can bagEarth do variable importance?
Yes. Call varImp(fit) on a fitted bagEarth object and caret averages the earth variable importance across all B bootstraps. The output is a data frame with an Overall score and one row per predictor; sort it by that column for a ranking. Variables that appear in many bootstrap fits and contribute large reductions in GCV rank highest.
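A short sketch of that call, refitting a small bag so the block runs on its own. B = 25 and the object names are assumptions, and the exact layout of the returned object may vary across caret versions.

```r
library(caret)

set.seed(42)
fit <- bagEarth(mpg ~ ., data = mtcars, B = 25)

# Importance averaged across the bagged earth models
imp <- varImp(fit)
print(imp)
```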
Why is my bagEarth model object so large?
Each bootstrap stores its fitted earth model, and by default the object also keeps the training predictors. For 100 bootstraps on tens of thousands of rows, that adds up fast. Set keepX = FALSE at fit time to drop the stored predictors, and use saveRDS(fit, "fit.rds", compress = "xz") to compress the saved object. A smaller nprune also shrinks each earth fit.
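To check the footprint on your own data, a comparison along these lines may help. It is a sketch on mtcars, where the difference is necessarily small; the object names and the temporary file path are illustrative.

```r
library(caret)

x <- mtcars[, -1]
y <- mtcars$mpg

set.seed(42)
fit_keep <- bagEarth(x = x, y = y, B = 50, keepX = TRUE)
set.seed(42)
fit_drop <- bagEarth(x = x, y = y, B = 50, keepX = FALSE)

# In-memory footprint of each object
format(object.size(fit_keep), units = "Kb")
format(object.size(fit_drop), units = "Kb")

# Compressed on-disk copy written to a temporary file
path <- tempfile(fileext = ".rds")
saveRDS(fit_drop, path, compress = "xz")
file.size(path)  # size in bytes
```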