rsample rolling_origin() in R: Time-Series Resampling
The rsample rolling_origin() function in R builds time-ordered resamples for forecasting, so each held-out window comes strictly after its training window in time and your backtest never leaks future information into the model.

    rolling_origin(df, initial = 100)                         # default expanding window
    rolling_origin(df, initial = 100, assess = 10)            # 10-row forecast horizon
    rolling_origin(df, initial = 100, assess = 10, skip = 9)  # non-overlapping assessment windows
    rolling_origin(df, initial = 100, cumulative = FALSE)     # sliding window
    rolling_origin(df, initial = 100, lag = 5)                # assessment starts 5 rows early, for lagged features
    analysis(splits$splits[[1]])                              # training rows of split 1
    assessment(splits$splits[[1]])                            # held-out rows of split 1

Need explanation? Read on for examples and pitfalls.
What rolling_origin() does
rolling_origin() resamples a time-ordered data frame into a sequence of train and test windows. It belongs to the rsample package, the resampling engine of the tidymodels ecosystem. Unlike vfold_cv(), which shuffles rows before partitioning, rolling_origin() preserves row order. Each split's assessment rows sit immediately after its analysis rows in the original data, mimicking how a real forecast is produced one step at a time.
The function returns a tibble of rsplit objects in a splits list-column. The first split trains on rows 1 through initial, then assesses on the next assess rows. The next split extends or slides the training window by skip + 1 rows, and so on until the data runs out. This pattern is the backtesting workflow for any forecasting model: ARIMA, prophet, exponential smoothing, or an ML model with lagged features.
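The backtesting loop described above can be sketched by hand. This is a minimal illustration, not a prescribed workflow: the 30-row data frame df, its columns t and y, and the choice of lm() as the model are all assumptions made for the example.

```r
library(rsample)

# Hypothetical series: a smooth seasonal signal plus a trend, indexed by t.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# Expanding-window resamples: 20 training rows to start, 5-row horizon.
splits <- rolling_origin(df, initial = 20, assess = 5)

# For each split: fit on the analysis rows, forecast the assessment rows,
# and score the forecast with RMSE.
backtest_rmse <- vapply(splits$splits, function(s) {
  fit  <- lm(y ~ t, data = analysis(s))
  test <- assessment(s)
  pred <- predict(fit, newdata = test)
  sqrt(mean((test$y - pred)^2))
}, numeric(1))

backtest_rmse
```

Any model with a fit and predict step can replace lm() here; the resampling code stays the same.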
Syntax and arguments
The signature has one required argument and five tuning knobs that control window shape.
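One way to check the signature and defaults against your installed rsample version is to inspect the function directly. The defaults noted in the comments are those documented for rsample at the time of writing and may differ in other releases.

```r
library(rsample)

# Print the full argument list as defined by the installed version.
args(rolling_origin)

# Pull the defaults programmatically.
defaults <- formals(rolling_origin)
defaults$initial     # documented default: 5
defaults$assess      # documented default: 1
defaults$cumulative  # documented default: TRUE
defaults$skip        # documented default: 0
defaults$lag         # documented default: 0
```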
The arguments that matter in practice:
- data: a tibble or data frame sorted chronologically. The function does not check ordering, so sort before calling.
- initial: the size of the first training window. Pick a value large enough to fit your model reliably.
- assess: the forecast horizon in rows. 1 for one-step-ahead, 7 for a week of daily data, 12 for a year of monthly data.
- cumulative: TRUE (default) grows the training window each split; FALSE slides a fixed-size window forward.
- skip: rows skipped between origins. Use skip = assess - 1 to get non-overlapping assessment windows.
- lag: number of rows by which each assessment set starts earlier, overlapping the end of the analysis set. It does not insert a gap; it exists so that lagged predictors can be computed for the first assessment rows. Set it to your longest feature lag.
rolling_origin() examples
Basic expanding window backtest
Call rolling_origin() on a time-ordered data frame and the result is a tibble of splits. The default cumulative = TRUE grows the training window by one row each split.
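A minimal sketch of such a call, assuming a hypothetical 30-row data frame df with a time index t and a value y, a 20-row initial window, and a 5-row horizon:

```r
library(rsample)

# Hypothetical 30-row series; t is the time index, y an arbitrary signal.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# Expanding window: the first split trains on rows 1-20 and assesses rows 21-25.
splits <- rolling_origin(df, initial = 20, assess = 5)
splits

# Row counts per split, to confirm how the windows evolve.
train_sizes <- vapply(splits$splits, function(s) nrow(analysis(s)), integer(1))
test_sizes  <- vapply(splits$splits, function(s) nrow(assessment(s)), integer(1))
train_sizes
test_sizes
```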
Each <split [n/5]> grows the training portion by one row as the origin advances. The assessment window stays at 5 rows.
Extract analysis and assessment sets
Use analysis() and assessment() to materialize one split. Both return ordinary data frames you can feed to any modeling function.
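As an illustration, the sketch below pulls both sets out of the first split. The data frame df (30 rows, columns t and y) and the window sizes are assumptions chosen for the example.

```r
library(rsample)

# Hypothetical 30-row series; t is the time index.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)
splits <- rolling_origin(df, initial = 20, assess = 5)

# Materialize the first split as two ordinary data frames.
first_split <- splits$splits[[1]]
train <- analysis(first_split)
test  <- assessment(first_split)

# Inspect where each window starts and ends on the time index.
range(train$t)
range(test$t)
```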
The training rows stop at t = 20 and the assessment rows pick up at t = 21. Chronology is preserved.
Sliding window instead of expanding
Set cumulative = FALSE to slide a fixed-size training window forward. Useful when older data is no longer representative, such as after a regime change.
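A short sketch of the sliding variant, again assuming a hypothetical 30-row data frame df with a 20-row window and a 5-row horizon:

```r
library(rsample)

# Hypothetical 30-row series; t is the time index.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# Sliding window: the training window keeps its size and moves forward.
sliding <- rolling_origin(df, initial = 20, assess = 5, cumulative = FALSE)

# Training-set size and first training row for every split.
train_sizes  <- vapply(sliding$splits, function(s) nrow(analysis(s)), integer(1))
train_starts <- vapply(sliding$splits, function(s) min(analysis(s)$t), numeric(1))
train_sizes
train_starts
```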
Every training window holds exactly 20 rows. With cumulative = TRUE the same call would produce windows of 20, 21, 22, and so on.
Non-overlapping assessment windows with skip
Use skip = assess - 1 to make assessment windows non-overlapping. This is the standard setup when reporting one metric per disjoint forecast horizon.
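The sketch below shows the effect on a hypothetical 30-row data frame df, where a 5-row horizon calls for skip = 4:

```r
library(rsample)

# Hypothetical 30-row series; t is the time index.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# skip = assess - 1 moves each origin forward by a full horizon.
tiled <- rolling_origin(df, initial = 20, assess = 5, skip = 4)

# Time index covered by each assessment window.
assess_t <- lapply(tiled$splits, function(s) assessment(s)$t)
assess_t
```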
The two assessment windows cover t = 21:25 and t = 26:30 with no overlap, exactly what backtesting protocols recommend.
rolling_origin() vs other resampling functions
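rolling_origin() preserves time order; most other rsample resamplers pick rows at random, with or without replacement. The exception in the table below is sliding_period(), which is also time-aware. Choose based on whether the data has a meaningful sequence.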
| Function | Produces | Use when |
|---|---|---|
| rolling_origin() | Time-ordered analysis/assessment splits | Forecasting, backtesting, sequential data |
| vfold_cv() | v shuffled folds, every row held out once | Cross-sectional data, hyperparameter tuning |
| initial_split() | One train/test split | Final hold-out, not iterative backtest |
| bootstraps() | Resamples with replacement | Small data, variance estimates |
| mc_cv() | Random train/test splits | Many resamples without v-fold structure |
| sliding_period() | Calendar-aware sliding windows | Splits keyed to dates, not row counts |
A typical time-series tidymodels workflow combines rolling_origin() with fit_resamples() to produce a backtest report across many horizons. For a final hold-out, take the latest assess rows out with initial_time_split().
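A sketch of that combination follows. It assumes the parsnip, workflows, tune, and yardstick packages are installed alongside rsample; the data frame df, the formula y ~ t, and the linear model are placeholders for illustration, not a recommended forecasting model.

```r
library(rsample)
library(parsnip)
library(workflows)
library(tune)
library(yardstick)

# Hypothetical 30-row series; t is the time index.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# Two non-overlapping 5-row assessment windows.
splits <- rolling_origin(df, initial = 20, assess = 5, skip = 4)

# A placeholder workflow: linear regression of y on the time index.
wf <- workflow() |>
  add_formula(y ~ t) |>
  add_model(linear_reg())

# Fit on every analysis set and score every assessment set with RMSE.
backtest <- fit_resamples(wf, resamples = splits, metrics = metric_set(rmse))
backtest_metrics <- collect_metrics(backtest)
backtest_metrics
```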
rolling_origin() counts rows, not calendar time: initial = 30 means 30 rows, which is 30 days of history only if the series is daily with no gaps. Pre-aggregate or use sliding_period(), also in rsample, when the calendar matters.
Common pitfalls
Three mistakes account for most rolling_origin() bugs.
- Forgetting to sort by time. rolling_origin() trusts the data's row order. If rows arrive unsorted or scrambled, the analysis window mixes past and future and the backtest is meaningless. Always arrange(date) before calling the function.
- Setting assess too small relative to the model's forecast goal. A model that will forecast 12 months ahead in production should be assessed on 12-row windows, not 1-row windows. The metric you publish should match the horizon you serve.
- Ignoring the lag argument when features include lags. If lagged features such as lag_7 of the target are computed inside each split, the first 7 assessment rows have no history to draw on when the test set starts at row 101. Set lag = 7 so each assessment set starts 7 rows earlier and overlaps the end of the analysis set; those extra rows supply the history for the lagged features. Note that lag does not insert a gap between train and test, so drop the overlapping rows before scoring.
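To see what lag does to the row indices, the sketch below compares the first split's windows on a hypothetical 30-row data frame df. The window sizes are assumptions for the example.

```r
library(rsample)

# Hypothetical 30-row series; t is the time index.
df <- data.frame(t = 1:30, y = sin(1:30 / 3) + (1:30) / 10)

# lag = 7 extends each assessment set 7 rows back into the analysis rows.
lagged <- rolling_origin(df, initial = 20, assess = 5, lag = 7)
first  <- lagged$splits[[1]]

train_t <- analysis(first)$t
test_t  <- assessment(first)$t

range(train_t)  # analysis window is unchanged by lag
range(test_t)   # assessment window now begins before the analysis window ends

# Rows present in both sets: history for lagged features, not rows to score.
overlap <- intersect(train_t, test_t)
overlap
```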
rolling_origin() is controlled by five arguments: initial, assess, cumulative, skip, and lag. The newer sliding_period() function, which is part of rsample and built on slider, handles calendar-aware splits and is recommended for daily, weekly, or monthly data with irregular gaps.
Try it yourself
Try it: Build a rolling-origin resample of AirPassengers (the classic monthly time series) with a 60-row initial training window, a 12-row forecast horizon, non-overlapping assessment windows, and an expanding training window. Save it to ex_splits.
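One possible solution is sketched below. AirPassengers ships as a ts object, so the sketch first converts it to a data frame; the column names month and passengers are arbitrary choices for this example.

```r
library(rsample)

# AirPassengers is a monthly ts object; rolling_origin() needs a data frame.
air <- data.frame(
  month      = as.numeric(time(AirPassengers)),
  passengers = as.numeric(AirPassengers)
)

# 60-row initial window, 12-row horizon, expanding window,
# and skip = 12 - 1 so the assessment windows do not overlap.
ex_splits <- rolling_origin(
  air,
  initial    = 60,
  assess     = 12,
  cumulative = TRUE,
  skip       = 11
)
ex_splits
```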
Explanation: initial = 60 reserves 5 years of monthly history for the first training window. assess = 12 evaluates a one-year forecast. skip = 11 makes the next origin start 12 rows later, so the seven assessment windows tile the remaining data without overlap.
Related rsample functions
rolling_origin() is the time-series workhorse; these functions extend it.
- analysis() and assessment(): extract the two data frames from a split object.
- initial_time_split(): carve off the most recent rows as a final hold-out before backtesting.
- sliding_period(): calendar-aware splits keyed to dates (days, weeks, months) instead of row counts.
- group_vfold_cv(): keep all rows of a group together; useful for panel forecasting.
- int_pctl(): build percentile confidence intervals from resampled metrics.
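As an illustration of the calendar-aware alternative, the sketch below applies sliding_period() to a hypothetical daily data frame covering January through April 2020. The data frame daily and its columns are assumptions for the example.

```r
library(rsample)

# Hypothetical daily series covering January through April 2020.
daily <- data.frame(
  date = seq(as.Date("2020-01-01"), as.Date("2020-04-30"), by = "day")
)
daily$y <- seq_len(nrow(daily))

# Train on one calendar month (lookback = 0), assess on the following month.
monthly_splits <- sliding_period(
  daily,
  index       = date,
  period      = "month",
  lookback    = 0,
  assess_stop = 1
)

first <- monthly_splits$splits[[1]]
range(analysis(first)$date)
range(assessment(first)$date)
```

Because the windows follow month boundaries, the row counts differ from split to split, which row-based rolling_origin() cannot express.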
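rolling_origin(df, initial = n, assess = m, cumulative = TRUE) is the closest tidymodels counterpart of scikit-learn's TimeSeriesSplit(n_splits=k). Setting cumulative = FALSE plays the role of the max_train_size argument. The lag argument is not the same as gap: gap drops rows between train and test, whereas lag makes the assessment set start earlier so that it overlaps the end of the training rows.
FAQ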
What is the difference between rolling_origin() and vfold_cv()?
vfold_cv() shuffles rows before partitioning, which destroys time order. Every row appears in the held-out set once across the folds, but the training and test rows are randomly mixed. rolling_origin() keeps rows in their original order so the assessment rows of each split come strictly after the analysis rows. Use vfold_cv() for cross-sectional data and rolling_origin() for any series with a meaningful temporal ordering.
Does rolling_origin() handle multiple time series?
Not directly. The function treats the data frame as one ordered sequence and does not segment by group. For panel data with several series in long format, group the data first with nest() and apply rolling_origin() within each group; the same approach works with sliding_period(), which is also an rsample function. For tidymodels-compatible panel resamples, group_vfold_cv() is the simpler alternative when strict time order is not required within each group.
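A minimal sketch of the per-group approach is shown below. It uses base R split() and lapply() in place of nest() to keep the example dependency-free, and the two-series data frame panel is hypothetical.

```r
library(rsample)

# Hypothetical long-format panel: two series, 30 time steps each.
panel <- data.frame(
  series = rep(c("a", "b"), each = 30),
  t      = rep(1:30, times = 2),
  y      = c(sin(1:30 / 3), cos(1:30 / 3))
)

# One rolling-origin resample per series, so no split mixes series.
# Each group is sorted by t first, because rolling_origin() trusts row order.
per_series <- lapply(
  split(panel, panel$series),
  function(d) rolling_origin(d[order(d$t), ], initial = 20, assess = 5, skip = 4)
)

names(per_series)
per_series$a
```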
How do I make rolling_origin() reproducible?
rolling_origin() is deterministic. It does not shuffle, so the same data and arguments always produce the same splits and no set.seed() call is needed. The only randomness in a backtest workflow comes from the model itself (random forests, neural networks) or from companion resamples like bootstraps(). Set seeds before those steps, not before rolling_origin().
What value should I choose for initial?
Pick the smallest training window where your model fits stably and produces sensible coefficients. For ARIMA and exponential smoothing models on daily data, 60 to 90 rows is a common floor. For ML models with engineered features, you need enough rows to cover at least one full seasonal cycle plus burn-in for lagged features. If initial is too small the first few backtest scores will be unreliable; if too large you waste data that could have been assessed.
Can rolling_origin() produce overlapping assessment windows?
Yes, the default skip = 0 does exactly that. Each origin advances by one row, so consecutive assessment windows overlap by assess - 1 rows. Overlap is fine when averaging metrics across many origins. For one independent score per horizon, set skip = assess - 1 so the assessment windows tile the data without overlap.