caret spls() in R: Sparse Partial Least Squares
The caret spls method fits a sparse partial least squares regression model via train(method = "spls"). It wraps the spls package, selects predictors by shrinking small loadings to zero on each latent component, and tunes three dials (K, eta, kappa) by resampling, which makes it ideal for wide regression problems where predictors are correlated or outnumber rows.
train(x, y, method = "spls")                                        # basic fit, defaults tuned
train(x, y, method = "spls", tuneLength = 5)                        # 5x5x1 default grid
train(x, y, method = "spls", tuneGrid = grid)                       # custom K, eta, kappa
train(x, y, method = "spls", preProcess = c("center","scale"))      # mandatory scaling
predict(fit, newdata = x_new)                                       # numeric predictions
varImp(fit)                                                         # variable importance, zeros excluded
fit$finalModel$betahat                                              # sparse coefficient matrix
Need explanation? Read on for examples and pitfalls.
What caret spls does in one sentence
The spls method fits a regression that combines PLS dimension reduction with L1-style variable selection. You pass a numeric predictor matrix and a numeric response, caret hands the call to spls::spls(), and the fit returns latent components built only from the predictors that survived the sparsity threshold.
Ordinary PLS keeps every predictor in every component, which makes coefficients hard to interpret. Sparse PLS zeros small loadings via an L1-style soft threshold controlled by eta, so each component selects its own subset. The result is a low-rank, interpretable regression that copes with p > n and collinearity while limiting the overfitting that a dense fit would invite.
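To make the thresholding idea concrete, here is a minimal base R sketch of soft-thresholding a vector of predictor-response scores. It is a simplified illustration for a single response, not the spls package's internal code; the helper name soft_threshold and the eta values are assumptions chosen for the example.

```r
# Illustrative soft-thresholding of covariance-like scores (base R only).
# soft_threshold() is a hypothetical helper, not part of caret or spls.
soft_threshold <- function(z, eta) {
  stopifnot(eta >= 0, eta < 1)
  cutoff <- eta * max(abs(z))          # threshold scales with the largest score
  sign(z) * pmax(abs(z) - cutoff, 0)   # shrink toward zero, clip at zero
}

x <- scale(as.matrix(mtcars[, -1]))    # centred, scaled predictors
y <- mtcars$mpg - mean(mtcars$mpg)     # centred response
z <- drop(crossprod(x, y))             # one score per predictor

w_dense  <- soft_threshold(z, eta = 0)    # nothing is removed
w_sparse <- soft_threshold(z, eta = 0.8)  # only the strongest scores survive

names(w_sparse)[w_sparse != 0]         # predictors kept at eta = 0.8
```

Raising eta moves the cutoff closer to the largest score, so fewer predictors keep a non-zero weight; eta = 0 leaves the scores untouched.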
caret spls syntax and tuning grid
The caret call shape is the standard train() interface with method = "spls" plus a three-column tuning grid. Everything else has reasonable defaults.
The tuning grid has three columns:
- K: number of latent components, integer. Cap at min(nrow(x) - 1, ncol(x)). Typical search: 1:5.
- eta: sparsity parameter in [0, 1). eta = 0 keeps all predictors (ordinary PLS); eta = 0.9 zeros out almost everything. Typical search: seq(0.1, 0.9, by = 0.1).
- kappa: secondary shrinkage in [0, 0.5]. Almost always left at the default 0.5; matters only for multi-response models.
train() adds the standard arguments: trControl for the resampling scheme, preProcess for scaling, and tuneLength to skip writing the grid by hand.
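As a rough sketch of how these pieces fit together, the snippet below builds a three-column grid with expand.grid() and a resampling specification with trainControl(). The grid ranges follow the typical searches listed above; the repeated 5-fold scheme is an assumption made for illustration, not a recommendation from the method itself.

```r
library(caret)

# Ask caret which tuning parameters method = "spls" expects.
modelLookup("spls")

# Every combination of K and eta at a fixed kappa: 5 * 9 * 1 = 45 rows.
grid <- expand.grid(
  K     = 1:5,
  eta   = seq(0.1, 0.9, by = 0.1),
  kappa = 0.5
)
dim(grid)

# Assumed resampling scheme: 5-fold cross-validation repeated 3 times.
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)
```

Both objects are passed to train() through the tuneGrid and trControl arguments.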
preProcess = c("center", "scale") is effectively mandatory for spls. Sparse PLS picks the predictors with the largest covariance to the response; an unscaled variable on a wider numeric range will dominate the selection regardless of its true signal. Always scale unless every column is already on the same unit.

caret spls examples by use case
1. Fit a basic spls model on mtcars
The shortest useful call lets caret search a default tuneLength = 3 grid via bootstrap resampling. With ten predictors and 32 rows, mtcars is a textbook small-n regression problem.
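A minimal version of that call might look like the following. The seed is an assumption, and because bootstrap resampling is random the exact metrics and winning row can differ from run to run.

```r
library(caret)   # train() loads the spls package on demand; it must be installed

x <- as.matrix(mtcars[, -1])   # ten numeric predictors
y <- mtcars$mpg                # numeric response

set.seed(42)                   # assumed seed, only for reproducibility
fit <- train(
  x, y,
  method     = "spls",
  preProcess = c("center", "scale")
)

fit$results    # resampled RMSE, Rsquared, MAE for every grid row
fit$bestTune   # the winning K, eta, kappa
```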
caret reports RMSE, R-squared, and MAE for every grid row and stores the winning combo in fit$bestTune. In this run, two components and eta = 0.5 are the sweet spot: enough latent structure to capture the weight and displacement signal, enough sparsity to drop the noisier columns.
2. Predict on new rows
predict.train() returns numeric predictions on the response scale. There is no type = "prob" branch because spls is a regression method.
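A sketch of the prediction step follows. The model is refit so the block runs on its own, and the choice of rows 1, 15, and 30 is an assumption for illustration; only row 15 (Cadillac Fleetwood) is named in the text.

```r
library(caret)

x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

set.seed(42)   # assumed seed
fit <- train(x, y, method = "spls", preProcess = c("center", "scale"))

rows     <- c(1, 15, 30)               # assumed rows; 15 is the Cadillac Fleetwood
new_rows <- x[rows, , drop = FALSE]

# predict() applies the stored centering and scaling before calling the model.
preds <- predict(fit, newdata = new_rows)

data.frame(
  car       = rownames(new_rows),
  actual    = y[rows],
  predicted = as.numeric(preds)
)
```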
Two of three predictions land near the true mpg; row 15 (Cadillac Fleetwood) misses by about six because of a heavy-tailed outlier on a small sample. Use RMSE(predict(fit, x), y) for an in-sample sanity check.
3. Custom tuning grid
For a serious search, define your own grid so you can probe the K-eta surface at finer resolution.
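One way to set up such a search is sketched below. The grid bounds, the repeated cross-validation scheme, and the seed are assumptions; adjust them to the size of your data.

```r
library(caret)

x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

# Finer eta steps than the default grid; kappa stays fixed for a single response.
fine_grid <- expand.grid(
  K     = 1:4,
  eta   = seq(0.1, 0.9, by = 0.1),
  kappa = 0.5
)

ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)

set.seed(2024)   # assumed seed
fit_grid <- train(
  x, y,
  method     = "spls",
  tuneGrid   = fine_grid,
  trControl  = ctrl,
  preProcess = c("center", "scale")
)

# Rank grid rows by resampled RMSE and look at the leaders.
ranked <- fit_grid$results[order(fit_grid$results$RMSE),
                           c("K", "eta", "RMSE", "Rsquared", "MAE")]
head(ranked, 3)
```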
Bigger grids matter most when the response is hard to predict. On mtcars the top three rows are nearly tied because the signal already concentrates in a few obvious predictors.
4. Inspect selected predictors
The value of spls over plain PLS is that you can see which predictors the model kept. The A element of the final model holds the indices of the selected predictors, and the non-zero rows of betahat identify the same set.
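The sketch below recovers the selected predictor names from the non-zero rows of the coefficient matrix, which avoids relying on any particular index slot. The seed is an assumption, and the set of survivors depends on the tuning values that win in a given run.

```r
library(caret)

x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

set.seed(42)   # assumed seed
fit <- train(x, y, method = "spls", preProcess = c("center", "scale"))

# betahat has one row per predictor and one column per response.
beta     <- fit$finalModel$betahat
selected <- colnames(x)[beta[, 1] != 0]

selected                          # names of the predictors the final model kept
setdiff(colnames(x), selected)    # predictors that were zeroed out
```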
Four of ten predictors survived: displacement, horsepower, rear-axle ratio, and weight. The other six got zeroed because their covariance with mpg, after accounting for the survivors, fell below the eta threshold. This is the interpretability win.
5. Compare spls with non-sparse pls
When sparsity does not help, ordinary PLS matches or beats spls. A side-by-side fit settles the question.
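A possible way to run that comparison is to train both methods on identical resampling folds and summarise them with resamples(). The fold count, tuneLength, and seed below are assumptions; the size of the gap will vary with them.

```r
library(caret)

x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

# Fix the folds once so both models are scored on exactly the same splits.
set.seed(7)   # assumed seed
folds <- createFolds(y, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds)

fit_spls <- train(x, y, method = "spls", tuneLength = 3,
                  trControl = ctrl, preProcess = c("center", "scale"))
fit_pls  <- train(x, y, method = "pls", tuneLength = 3,
                  trControl = ctrl, preProcess = c("center", "scale"))

cmp <- resamples(list(spls = fit_spls, pls = fit_pls))
summary(cmp)   # fold-by-fold RMSE, Rsquared, MAE for each model
```

Sharing the index across both train() calls keeps the comparison paired, so a difference in RMSE reflects the models rather than the luck of the split.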
spls edges out pls by about 0.06 RMSE. The win is small because mtcars has only ten predictors and most carry signal; sparsity pays off most when hundreds or thousands of predictors are mostly noise.
caret spls vs pls vs splsda
spls is sparse PLS regression; splsda is sparse PLS classification; pls is non-sparse PLS regression. Same engine family, different jobs.
| Method | Outcome type | Feature selection | When to pick it |
|---|---|---|---|
| method = "spls" | numeric | Yes, via eta | Wide regression, want interpretable variable subset |
| method = "pls" | numeric | No | Many predictors all carry signal; you only want dimension reduction |
| method = "splsda" | factor | Yes, via eta | Sparse PLS for classification (e.g. gene-expression class prediction) |
| method = "glmnet" | numeric or factor | Yes, via lambda | L1 or elastic-net penalty without latent components |
The natural alternative to spls is glmnet with alpha = 1 (lasso): both select variables, but spls also produces orthogonal latent components, which is useful when predictors are highly collinear. See the caret available models reference for the full method catalog and tuning-parameter columns.
Common pitfalls
Pitfall 1: forgetting to scale. spls picks variables by raw covariance with the response, so an unscaled column on a wider range dominates the selection. Always pass preProcess = c("center", "scale") unless every column is already on the same unit.
Pitfall 2: passing a data frame with factors. train(method = "spls") calls as.matrix(); factor columns get coerced silently. Expand factors first with caret::dummyVars() or model.matrix(~ . - 1, data = df).
Pitfall 3: searching kappa. kappa matters only for multivariate outcomes. For a single numeric y it is inert, so leave it at 0.5. Wasting grid rows on a kappa sweep just slows the search.
Pitfall 4: confusing K with ncomp. Other caret methods use ncomp; spls uses K. A tuneGrid with the wrong column names does not fall back to defaults: train() stops with an error saying the grid should have columns K, eta, kappa. Run modelLookup("spls") if you forget the argument names.
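Two of these pitfalls can be checked before any model is fit. The sketch below expands a factor column into numeric dummy columns with dummyVars() and confirms the grid column names with modelLookup(). Turning cyl into a factor is an assumption made purely to have a categorical predictor to work with.

```r
library(caret)

df <- mtcars
df$cyl <- factor(df$cyl)   # assumed: treat cylinder count as categorical

# Pitfall 2: expand factors into numeric dummy columns before train().
dv    <- dummyVars(mpg ~ ., data = df)
x_num <- predict(dv, newdata = df)   # numeric matrix, one column per factor level
y     <- df$mpg

colnames(x_num)

# Pitfall 4: confirm the tuning parameter names before writing a grid.
params <- as.character(modelLookup("spls")$parameter)
grid   <- expand.grid(K = 1:3, eta = c(0.3, 0.6), kappa = 0.5)
stopifnot(setequal(names(grid), params))
```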
Note: the spls package is not loaded automatically by library(caret). caret only loads it lazily when train(method = "spls") is called. Install it with install.packages("spls") before the first run; otherwise train() cannot fit the model and stops with a missing-package prompt or a there is no package called 'spls' error.

Try it yourself
Try it: Fit a caret spls model on mtcars predicting mpg with a custom grid of K = 1:3 and eta = c(0.3, 0.6). Save the best tuning combination to ex_best and the names of selected predictors to ex_selected.
Explanation: The grid sweeps three component counts and two sparsity levels at the default kappa. bestTune reports the winning row; the non-zero rows of finalModel$betahat (the same set indexed by finalModel$A) identify the selected predictors so you can map back to column names.
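One possible solution to the exercise is sketched here. The seed and the use of the default bootstrap resampling are assumptions, so the winning row and the selected predictors may differ between runs.

```r
library(caret)

x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

# 3 component counts x 2 sparsity levels x 1 kappa = 6 grid rows.
ex_grid <- expand.grid(K = 1:3, eta = c(0.3, 0.6), kappa = 0.5)

set.seed(123)   # assumed seed
ex_fit <- train(
  x, y,
  method     = "spls",
  tuneGrid   = ex_grid,
  preProcess = c("center", "scale")
)

ex_best     <- ex_fit$bestTune
ex_selected <- colnames(x)[ex_fit$finalModel$betahat[, 1] != 0]

ex_best
ex_selected
```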
Related caret functions
These pair naturally with a sparse PLS workflow:
- train() with method = "pls": non-sparse counterpart for dimension reduction without feature selection
- train() with method = "splsda": sparse PLS for factor outcomes
- preProcess(): center, scale, or impute the predictor matrix before fitting
- varImp(): relative variable importance computed from the loading magnitudes
- resamples(): compare spls against pls or glmnet on a common resampling scheme
FAQ
What is the difference between spls and pls in caret?
pls keeps every predictor in every latent component; spls zeros out small loadings via an L1 penalty controlled by eta. That gives spls a feature-selection story plain pls lacks, at the cost of one extra tuning dial. Pick pls when every predictor is informative and you just need dimension reduction. Pick spls when most predictors are noise and you want the model to commit to a subset.
How do I choose K and eta for caret spls?
Cross-validate. A 5x5 grid of K = 1:5 and eta = seq(0.1, 0.9, by = 0.2) covers the meaningful range on most regression problems; bump K higher only if your predictor count exceeds 20 and there is no plateau in resampled RMSE. Always leave kappa = 0.5 unless you have a true multi-response outcome.
Does caret spls work for classification?
No. method = "spls" is regression only. For a factor outcome use method = "splsda", which applies the same sparse PLS engine to dummy-coded classes and returns class predictions plus probabilities. The tuning grid is identical (K, eta, kappa) so any spls workflow can be ported to splsda by changing the method string and the response.