caret createResample() in R: Bootstrap Sample Indexes
The createResample() function in caret draws bootstrap samples with replacement, returning a list of training-row indexes where each resample is the same size as the input vector. Roughly 63.2% of the original rows appear in each resample and the rest are out-of-bag. Sampling is simple random sampling over row positions, so factor outcomes are not stratified.
createResample(y, times = 25)                     # 25 bootstrap resamples (list)
createResample(y, times = 100)                    # 100 resamples for tighter CIs
createResample(y, times = 25, list = FALSE)       # matrix, one column per resample
createResample(iris$Species, times = 10)          # factor outcome, plain sampling (not stratified)
createResample(mtcars$mpg, times = 10)            # numeric outcome, plain sampling
resamples <- createResample(y, times = 25)        # keep the list for the two lines below
length(unique(resamples$Resample01)) / length(y)  # ~0.632 unique fraction
setdiff(seq_along(y), resamples$Resample01)       # out-of-bag rows for resample 1
Need explanation? Read on for examples and pitfalls.
What createResample() does in one sentence
createResample() is caret's bootstrap sampler. You give it an outcome vector and a count, and it returns a named list of integer vectors, each one a bootstrap sample drawn with replacement from the row positions of the outcome.
The function is a convenience wrapper around sample(seq_along(y), replace = TRUE): it repeats the draw times times, sorts each draw, and returns the results under names such as Resample01. It uses simple random sampling and does not stratify, so on a 50/50 binary target the class mix of each resample varies around 50/50 rather than being held fixed. The output shape matches what caret::train() uses for trainControl(method = "boot"), so the same indexes that drive a manual bootstrap loop can be plugged straight into a train() call.
createResample() syntax and arguments
createResample() needs only an outcome vector; the rest is shape and count. Three arguments cover every option the function exposes.
The signature is short: createResample(y, times = 10, list = TRUE).
y: the outcome vector. Only its length matters for the draw: row positions are sampled with replacement across the whole vector, for factors and numerics alike.
times: number of bootstrap resamples. The function default is 10; 25 is the default number for trainControl(method = "boot"), and 100 or more is common for stable bootstrap confidence intervals.
list: if TRUE (the default), return a named list of integer vectors, one per resample. Set to FALSE to get an integer matrix where each column is one resample.
Every element of the returned list has the same length as y, because bootstrap resamples are full-size draws with replacement, not partitions. That is the first difference from createFolds(), where fold sizes sum to the input length rather than each matching it.
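The size difference is easy to check directly. The following is a minimal sketch, assuming caret is installed; the object names (boots, folds, boot_sizes, fold_sizes) are illustrative only.

```r
library(caret)

set.seed(1)
y <- iris$Species

boots <- createResample(y, times = 3)  # bootstrap: training indexes
folds <- createFolds(y, k = 3)         # k-fold: held-out indexes by default

boot_sizes <- lengths(boots)  # each element is as long as y
fold_sizes <- lengths(folds)  # the folds together cover y exactly once

boot_sizes
sum(fold_sizes)
```

Each bootstrap element has length(y) entries, while the fold sizes only add up to length(y) in total.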
The closest Python analogue is resample(X, y, n_samples = len(y)) from sklearn.utils. Both draw with replacement. sklearn's resample() can additionally stratify via stratify = y, which createResample() does not do; createResample() returns indexes rather than resampled data and lets you generate many resamples in a single call via times.
createResample() examples by use case
Most bootstrap workflows grab the list, iterate over it, and refit per resample. The examples below build from the basic list output up to a hand-rolled bootstrap loop with out-of-bag prediction.
A 10-resample bootstrap on the iris species column:
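A minimal sketch of such a call, assuming caret is installed; the names boots and class_counts are illustrative, and the class table is included only so the class mix of each resample can be inspected.

```r
library(caret)

set.seed(42)  # seed first so the resamples are reproducible
boots <- createResample(iris$Species, times = 10)

names(boots)    # one name per resample
lengths(boots)  # every element has as many entries as iris has rows

# Class counts inside each resample (one column per resample)
class_counts <- sapply(boots, function(idx) table(iris$Species[idx]))
class_counts
```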
Each resample is 150 rows (same as the input). The class counts are not fixed: createResample() samples row positions with simple random sampling, so the number of rows per Species level varies from resample to resample. Loop over the list to fit one model per resample.
The unique-row fraction is the bootstrap fingerprint:
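One way to compute it, as a sketch assuming caret is installed (unique_frac is an illustrative name):

```r
library(caret)

set.seed(42)
y <- iris$Species
boots <- createResample(y, times = 10)

# Share of distinct original rows that appear in each resample
unique_frac <- sapply(boots, function(idx) length(unique(idx)) / length(y))

round(unique_frac, 3)
mean(unique_frac)

# Expected share for this n, for comparison
1 - (1 - 1 / length(y))^length(y)
```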
The average sits near 0.632, which is the limit of 1 - (1 - 1/n)^n as n grows. The rows missing from each resample are the out-of-bag set used for honest error estimates in the .632 bootstrap.
A similar call with list = FALSE and times = 5 returns a matrix:
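A sketch of the matrix form, assuming caret is installed; times = 5 is chosen here to match the five columns discussed next, and boot_mat and unique_per_col are illustrative names.

```r
library(caret)

set.seed(42)
boot_mat <- createResample(iris$Species, times = 5, list = FALSE)

dim(boot_mat)       # rows = length of the outcome, columns = resamples
colnames(boot_mat)  # one name per resample
head(boot_mat, 3)   # first few sampled row positions

# Column-wise processing: number of distinct rows in each resample
unique_per_col <- apply(boot_mat, 2, function(idx) length(unique(idx)))
unique_per_col
```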
The matrix has 150 rows (one per training-position slot) and 5 columns (one per resample). apply(boot_mat, 2, your_fn) iterates over the columns, one resample at a time, as an alternative to looping over the list.
A hand-rolled bootstrap loop that fits a model per resample and averages out-of-bag RMSE:
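A minimal sketch of such a loop, assuming caret is installed. The dataset (mtcars), the formula mpg ~ wt + hp, and the name oob_rmse are illustrative choices, not taken from the original listing.

```r
library(caret)

set.seed(123)
boots <- createResample(mtcars$mpg, times = 25)

oob_rmse <- sapply(boots, function(train_idx) {
  # Rows never drawn into this resample form the out-of-bag set
  oob_idx <- setdiff(seq_len(nrow(mtcars)), train_idx)

  # Fit on the bootstrap sample (duplicated rows included)
  fit <- lm(mpg ~ wt + hp, data = mtcars[train_idx, ])

  # Score on the held-out rows only
  pred <- predict(fit, newdata = mtcars[oob_idx, ])
  sqrt(mean((mtcars$mpg[oob_idx] - pred)^2))
})

mean(oob_rmse)  # average out-of-bag RMSE across resamples
```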
The loop uses setdiff() to find rows missing from the resample, fits on the bootstrap sample, and scores on the held-out rows. Swap lm() for any modelling function; the structure is unchanged.
createResample() reads the active RNG state on every internal sample, so seeding before the function and using its output gives you the same 25 resamples every run. Seeding inside the loop after the call has no effect on the indexes; it only affects model fitting.
createResample() vs createFolds() and trainControl()
createResample() draws bootstrap samples; createFolds() partitions for cross-validation; trainControl(method = "boot") runs the bootstrap for you. Pick by what kind of resampling variance you need.
| Function | Sampling | Each element | Used when |
|---|---|---|---|
| createResample(y, times = 25) | with replacement, size n | training indexes | bootstrap CI, .632 estimators |
| createFolds(y, k = 10) | without replacement, partition | test indexes | k-fold cross-validation |
| createMultiFolds(y, k = 5, times = 3) | without replacement, partition | training indexes | repeated k-fold CV |
| createDataPartition(y, p = 0.7) | without replacement, one split | training indexes | initial holdout |
| trainControl(method = "boot", number = 25) | with replacement, size n | a control object | inside caret::train() |
If you are calling train(), pass trainControl(method = "boot", number = 25) and caret runs createResample() internally. Call createResample() directly when you need a bootstrap loop outside train(), such as bootstrapping a custom statistic (a median, a quantile, a feature-importance score) where you control the per-resample computation.
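As an illustration of the custom-statistic case, the sketch below bootstraps the median of mtcars$mpg and reads off a percentile interval. It assumes caret is installed; the choice of statistic, the 1000 resamples, and the names boot_medians and ci are illustrative.

```r
library(caret)

set.seed(2024)
y <- mtcars$mpg
boots <- createResample(y, times = 1000)

# Recompute the statistic on every bootstrap sample
boot_medians <- sapply(boots, function(idx) median(y[idx]))

sd(boot_medians)                                # bootstrap standard error
ci <- quantile(boot_medians, c(0.025, 0.975))   # percentile interval
ci
```

Swapping median() for another function of y[idx] changes the statistic without touching the resampling step.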
Common pitfalls
Three mistakes cause most bootstrap bugs. Each one shows up in the resample diagnostics before model fitting.
The first is treating the list elements as test indexes. createResample() returns TRAINING indexes; the out-of-bag rows are setdiff(seq_along(y), boots[[i]]). Indexing mtcars[boots[[1]], ] gives you the training set, not the holdout. This is the opposite of createFolds(), where the default list holds test indexes.
The second is forgetting that duplicated rows inflate row-weighted losses. A row appearing 4 times in a resample contributes 4x to the loss. That is correct bootstrap behaviour for trees and regression, but weighted estimators may need a frequency vector instead.
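For estimators that take case weights, one option is to turn a resample into a frequency vector. The sketch below assumes caret is installed and uses lm() only as a convenient example of a weighted fit; idx, freq, and the data copy d are illustrative names.

```r
library(caret)

set.seed(7)
boots <- createResample(mtcars$mpg, times = 1)
idx <- boots[[1]]

# How many times each original row was drawn (0 means out-of-bag)
freq <- tabulate(idx, nbins = nrow(mtcars))

# Fit 1: duplicated rows, as the resample delivers them
fit_dup <- lm(mpg ~ wt, data = mtcars[idx, ])

# Fit 2: original rows, weighted by how often each was drawn
d <- mtcars
d$freq <- freq
fit_wts <- lm(mpg ~ wt, data = d, weights = freq)

all.equal(coef(fit_dup), coef(fit_wts))
```

For least squares the two fits give the same coefficients, which shows that duplication and frequency weighting are two encodings of the same resample.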
createResample() corrects in-sample optimism, but it does not replace a true holdout when comparing models across many tuning configurations. Keep a createDataPartition() cut on top of any bootstrap workflow you plan to publish.
The third is comparing models with different set.seed() values upstream of createResample(). Two different seeds give two different sets of 25 resamples, so any RMSE difference between models is partly resample noise rather than a real model effect. Seed once before the resample call and freeze the resample list for every model you compare.
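The effect of the seed, and the idea of reusing one frozen list for every model, can be sketched as follows. This assumes caret is installed; the helper oob_rmse() and the two formulas are hypothetical and exist only for this illustration.

```r
library(caret)

set.seed(99)
boots_a <- createResample(mtcars$mpg, times = 25)
set.seed(99)
boots_b <- createResample(mtcars$mpg, times = 25)  # same seed
set.seed(100)
boots_c <- createResample(mtcars$mpg, times = 25)  # different seed

identical(boots_a, boots_b)  # same seed, same resamples
identical(boots_a, boots_c)  # different seed, different resamples

# Hypothetical helper: out-of-bag RMSE of an lm() fit on a fixed resample list
oob_rmse <- function(formula, boots, data) {
  resp <- all.vars(formula)[1]
  sapply(boots, function(train_idx) {
    oob_idx <- setdiff(seq_len(nrow(data)), train_idx)
    fit <- lm(formula, data = data[train_idx, ])
    pred <- predict(fit, newdata = data[oob_idx, ])
    sqrt(mean((data[[resp]][oob_idx] - pred)^2))
  })
}

# Both models are scored on the same frozen list, so differences are paired
paired_diff <- oob_rmse(mpg ~ wt, boots_a, mtcars) -
  oob_rmse(mpg ~ wt + hp, boots_a, mtcars)
mean(paired_diff)
```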
Try it yourself
Try it: Build a 50-resample bootstrap on iris$Species and confirm the average unique-row fraction sits near 0.632.
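One possible solution, as a sketch assuming caret is installed (the seed value and the names boots50 and frac are arbitrary):

```r
library(caret)

set.seed(1)
y <- iris$Species
boots50 <- createResample(y, times = 50)

# Unique-row fraction per resample, then the average across resamples
frac <- sapply(boots50, function(idx) length(unique(idx)) / length(y))
mean(frac)
```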
Explanation: The unique-row fraction converges to 1 - 1/e (about 0.632) as n grows. With 50 resamples on a 150-row vector, the empirical mean lands close to that limit, confirming bootstrap behaviour.
Related caret functions
Caret ships a small family of resampler functions. createResample() is the one for bootstrap; the others handle non-bootstrap resampling strategies.
createFolds(y, k = 10): k-fold cross-validation indexes (partition, no replacement).
createMultiFolds(y, k = 5, times = 3): repeated k-fold for method = "repeatedcv".
createDataPartition(y, p = 0.7): one stratified train/test split before any resampling.
createTimeSlices(y, initialWindow, horizon): rolling-origin folds for time-series outcomes.
trainControl(method = "boot", number = 25): the wrapper that calls createResample() internally inside train().
For an authoritative argument reference see the caret documentation on data splitting. The usual pipeline splits once with createDataPartition(), then resamples the training partition via trainControl(method = "boot") or "cv"; call createResample() directly only for manual bootstrap loops.
FAQ
What is the difference between createResample() and createFolds()?
createResample() draws bootstrap samples with replacement, so each resample is the same size as the input vector and some rows appear multiple times. createFolds() partitions the data without replacement into k disjoint groups so each row lands in exactly one fold. Use createResample() for bootstrap variance estimates and .632-style error correction; use createFolds() for k-fold cross-validation where every row should be held out exactly once.
Why does createResample() return training indexes by default?
Most bootstrap workflows refit a model on each resample, so caret defaulted the list to training indexes for convenience. The convention is the opposite of createFolds(), which defaults to test indexes because CV loops typically iterate over the held-out fold. Compute the out-of-bag set with setdiff(seq_along(y), boots[[i]]) whenever you need the rows not drawn into a resample.
How many resamples should I pass to times?
25 is the default for trainControl(method = "boot") (createResample() itself defaults to times = 10) and gives a stable mean estimate, but bootstrap confidence intervals usually need 100 or more resamples to settle. Push to 500 or 1000 whenever you report a CI; the cost is linear in times. For very expensive fits, 25 is a reasonable compromise.
Does createResample() stratify numeric outcomes?
No, and it does not stratify factor outcomes either. createResample() uses simple random sampling with replacement over row positions, whatever the type of y. For a quantile-balanced bootstrap on a numeric target, bin the outcome first with cut(y, breaks = quantile(y), include.lowest = TRUE) and sample with replacement within each bin yourself, or use rsample::bootstraps(), which accepts a strata argument.
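A hand-rolled stratified bootstrap is short to write in base R. The function below, stratified_boot(), is a hypothetical helper written for this illustration and is not part of caret; it samples with replacement inside each level of a grouping factor and returns a list shaped like createResample() output.

```r
# Hypothetical helper (not a caret function): bootstrap within each group
stratified_boot <- function(groups, times = 10) {
  by_group <- split(seq_along(groups), groups)  # row positions per level
  out <- lapply(seq_len(times), function(i) {
    drawn <- lapply(by_group, function(idx) {
      # sample.int() avoids the length-one pitfall of sample()
      idx[sample.int(length(idx), replace = TRUE)]
    })
    sort(unlist(drawn, use.names = FALSE))
  })
  names(out) <- sprintf("Resample%02d", seq_len(times))
  out
}

set.seed(5)

# Factor outcome: class counts are now fixed in every resample
sboots <- stratified_boot(iris$Species, times = 10)
class_counts <- sapply(sboots, function(idx) table(iris$Species[idx]))
class_counts

# Numeric outcome: bin into quartiles first, then stratify on the bins
bins <- cut(mtcars$mpg, breaks = quantile(mtcars$mpg), include.lowest = TRUE)
nboots <- stratified_boot(bins, times = 10)
lengths(nboots)
```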
Can I pass createResample() output directly to trainControl()?
Yes, via trainControl(index = my_boots) where my_boots is the list returned by createResample(y, times = 25). This reuses the same resamples across multiple train() calls so model comparisons are not contaminated by resample randomness.
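A sketch of that reuse, assuming caret is installed. The two lm formulas and the names ctrl, fit_small, and fit_large are illustrative; method = "lm" is used only because it needs no extra packages.

```r
library(caret)

set.seed(10)
my_boots <- createResample(mtcars$mpg, times = 25)

# One control object carrying the frozen resamples
ctrl <- trainControl(method = "boot",
                     number = length(my_boots),
                     index = my_boots)

# Both models are evaluated on exactly the same bootstrap samples
fit_small <- train(mpg ~ wt, data = mtcars, method = "lm", trControl = ctrl)
fit_large <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl)

fit_small$results$RMSE
fit_large$results$RMSE
```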