tidymodels Exercises in R: 25 Practice Problems
Twenty-five practice problems on the tidymodels stack: rsample, recipes, parsnip, workflows, tune, yardstick. Hidden solutions.
By Selva Prabhakaran · Published May 11, 2026 · Last updated May 11, 2026
library(tidymodels)
library(dplyr)
library(yardstick)
Exercise 1: initial_split
Difficulty: Beginner.
Show solution
set.seed(1)
split <- initial_split(mtcars, prop = 0.7)
list(train = nrow(training(split)), test = nrow(testing(split)))
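A common extension of the solution above: `initial_split()` also accepts a `strata` argument so the train/test sets have similar outcome distributions. A minimal sketch stratifying on `mpg` (numeric strata are binned into quartiles by default; rsample may warn on a dataset this small):

```r
library(tidymodels)

set.seed(1)
# Stratify on the outcome so both partitions see a similar mpg range
split <- initial_split(mtcars, prop = 0.7, strata = mpg)
list(train = nrow(training(split)), test = nrow(testing(split)))
```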
Exercise 2: vfold_cv
Difficulty: Beginner.
Show solution
set.seed(1)
vfold_cv(mtcars, v = 5)
Exercise 3: recipe
Difficulty: Intermediate.
Show solution
recipe(mpg ~ ., data = mtcars)
Exercise 4: step_normalize
Difficulty: Intermediate.
Show solution
recipe(mpg ~ ., data = mtcars) |>
step_normalize(all_numeric_predictors())
Exercise 5: prep + bake
Difficulty: Intermediate.
Show solution
rec <- recipe(mpg ~ ., data = mtcars) |>
step_normalize(all_numeric_predictors())
prep(rec) |> bake(new_data = mtcars) |> head()
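Baking the raw training data back through the recipe, as above, re-applies every step. A sketch of the usual shortcut: `bake(prepped, new_data = NULL)` returns the already-processed training set that `prep()` cached:

```r
library(tidymodels)

rec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())
prepped <- prep(rec)
# new_data = NULL retrieves the processed training data
# without re-running the steps
bake(prepped, new_data = NULL) |> head()
```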
Exercise 6: linear_reg model
Difficulty: Beginner.
Show solution
linear_reg() |> set_engine("lm")
Exercise 7: rand_forest model
Difficulty: Intermediate.
Show solution
rand_forest(trees = 100) |> set_mode("regression") |> set_engine("ranger")
Exercise 8: boost_tree model
Difficulty: Intermediate.
Show solution
boost_tree(trees = 100) |> set_mode("regression") |> set_engine("xgboost")
Exercise 9: workflow
Difficulty: Intermediate.
Show solution
wf <- workflow() |>
add_recipe(recipe(mpg ~ ., data = mtcars)) |>
add_model(linear_reg() |> set_engine("lm"))
wf
Exercise 10: fit workflow
Difficulty: Intermediate.
Show solution
wf <- workflow() |> add_recipe(recipe(mpg ~ ., data = mtcars)) |>
add_model(linear_reg() |> set_engine("lm"))
fit(wf, mtcars)
Exercise 11: Predict from workflow
Difficulty: Intermediate.
Show solution
wf <- workflow() |> add_recipe(recipe(mpg ~ ., data = mtcars)) |>
add_model(linear_reg() |> set_engine("lm"))
fitted <- fit(wf, mtcars)
predict(fitted, new_data = mtcars[1:3, ])
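When you want predictions side by side with the observed outcome, `augment()` on a fitted workflow binds the `.pred` column onto the new data. A sketch building on the solution above:

```r
library(tidymodels)

wf <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(linear_reg() |> set_engine("lm"))
fitted <- fit(wf, mtcars)
# augment() appends .pred to the input rows for easy comparison
augment(fitted, new_data = mtcars) |>
  dplyr::select(mpg, .pred) |>
  head()
```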
Exercise 12: fit_resamples
Difficulty: Advanced.
Show solution
set.seed(1)
folds <- vfold_cv(mtcars, v = 5)
wf <- workflow() |> add_recipe(recipe(mpg ~ ., data = mtcars)) |>
add_model(linear_reg() |> set_engine("lm"))
fit_resamples(wf, folds, metrics = metric_set(rmse, rsq))
Exercise 13: collect_metrics
Difficulty: Advanced.
Show solution
set.seed(1)
folds <- vfold_cv(mtcars, v = 5)
wf <- workflow() |> add_formula(mpg ~ .) |> add_model(linear_reg() |> set_engine("lm"))
fit_resamples(wf, folds) |> collect_metrics()
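By default `fit_resamples()` keeps only summary metrics. To inspect the per-row holdout predictions as well, pass a control object with `save_pred = TRUE`; a sketch:

```r
library(tidymodels)

set.seed(1)
folds <- vfold_cv(mtcars, v = 5)
wf <- workflow() |> add_formula(mpg ~ .) |>
  add_model(linear_reg() |> set_engine("lm"))
# save_pred = TRUE retains out-of-fold predictions
res <- fit_resamples(wf, folds,
                     control = control_resamples(save_pred = TRUE))
collect_predictions(res) |> head()
```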
Exercise 14: Tune hyperparameter
Difficulty: Advanced.
Show solution
set.seed(1)
folds <- vfold_cv(mtcars, v = 5)
rf <- rand_forest(mtry = tune(), trees = 100) |> set_mode("regression") |> set_engine("ranger")
wf <- workflow() |> add_formula(mpg ~ .) |> add_model(rf)
grid <- expand.grid(mtry = c(2, 4, 6))
tune_grid(wf, folds, grid = grid) |> collect_metrics()
Exercise 15: yardstick metrics
Difficulty: Intermediate.
Show solution
truth <- c(1, 2, 3, 4); pred <- c(1.1, 1.9, 3.2, 3.8)
data.frame(truth, pred) |> yardstick::rmse(truth, pred)
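A single metric call, as above, is the simplest case; in practice several metrics are usually bundled with `metric_set()` into one callable. A sketch on the same toy data:

```r
library(tidymodels)

preds <- data.frame(truth = c(1, 2, 3, 4),
                    pred  = c(1.1, 1.9, 3.2, 3.8))
# One function that computes rmse, rsq, and mae together
reg_metrics <- metric_set(rmse, rsq, mae)
reg_metrics(preds, truth = truth, estimate = pred)
```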
Exercise 16: Classification: logistic
Difficulty: Intermediate.
Show solution
binary <- iris |> dplyr::filter(Species != "setosa") |>
dplyr::mutate(Species = droplevels(Species))
mod <- logistic_reg() |> set_engine("glm")
fit(workflow() |> add_formula(Species ~ Sepal.Length) |> add_model(mod), binary)
Exercise 17: step_dummy
Difficulty: Intermediate.
Show solution
recipe(mpg ~ ., data = mtcars |> dplyr::mutate(cyl = factor(cyl))) |>
step_dummy(all_nominal_predictors())
Exercise 18: step_corr (remove correlated)
Difficulty: Advanced.
Show solution
recipe(mpg ~ ., data = mtcars) |>
step_corr(all_numeric_predictors(), threshold = 0.9)
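To see which predictors `step_corr()` actually removed, call `tidy()` on the prepped recipe with the step number. A sketch (on `mtcars`, `disp` and `cyl` are strongly correlated, so expect at least one of them to appear):

```r
library(tidymodels)

rec <- recipe(mpg ~ ., data = mtcars) |>
  step_corr(all_numeric_predictors(), threshold = 0.9)
prepped <- prep(rec)
# tidy() on step 1 lists the columns the filter dropped
tidy(prepped, number = 1)
```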
Exercise 19: step_pca
Difficulty: Advanced.
Show solution
recipe(mpg ~ ., data = mtcars) |>
step_normalize(all_numeric_predictors()) |>
step_pca(all_numeric_predictors(), num_comp = 3)
Exercise 20: last_fit
Difficulty: Advanced.
Show solution
set.seed(1)
split <- initial_split(mtcars, prop = 0.7)
wf <- workflow() |> add_formula(mpg ~ .) |> add_model(linear_reg() |> set_engine("lm"))
last_fit(wf, split) |> collect_metrics()
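Besides test-set metrics, a `last_fit()` result also carries the test-set predictions; `collect_predictions()` pulls them out. A sketch continuing the solution above:

```r
library(tidymodels)

set.seed(1)
split <- initial_split(mtcars, prop = 0.7)
wf <- workflow() |> add_formula(mpg ~ .) |>
  add_model(linear_reg() |> set_engine("lm"))
res <- last_fit(wf, split)
# Row-level predictions on the held-out test set
collect_predictions(res) |> head()
```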
Exercise 21: select_best after tuning
Difficulty: Advanced.
Show solution
# After a tune_grid() result `res` (as in Exercise 14):
# best <- select_best(res, metric = "rmse")  # `metric` must be named in current tune
# final_wf <- finalize_workflow(wf, best)
# fit(final_wf, mtcars)
Exercise 22: workflow_set for many models
Difficulty: Advanced.
Show solution
ws <- workflow_set(
preproc = list(rec = recipe(mpg ~ ., data = mtcars)),
models = list(lm = linear_reg() |> set_engine("lm"),
rf = rand_forest() |> set_mode("regression") |> set_engine("ranger"))
)
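Building the set is only half the story; `workflow_map()` runs the same resampling function over every workflow in it, and `rank_results()` compares them. A sketch assuming 5-fold CV on `mtcars`:

```r
library(tidymodels)

set.seed(1)
folds <- vfold_cv(mtcars, v = 5)
ws <- workflow_set(
  preproc = list(rec = recipe(mpg ~ ., data = mtcars)),
  models = list(lm = linear_reg() |> set_engine("lm"),
                rf = rand_forest() |> set_mode("regression") |>
                       set_engine("ranger"))
)
# Evaluate every workflow on the same folds, then rank by metric
ws |>
  workflow_map("fit_resamples", resamples = folds, seed = 1) |>
  rank_results()
```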
Exercise 23: parsnip translate
Difficulty: Advanced.
Show solution
linear_reg() |> set_engine("lm") |> translate()
Exercise 24: tidy a fit
Difficulty: Intermediate.
Show solution
wf <- workflow() |> add_formula(mpg ~ .) |> add_model(linear_reg() |> set_engine("lm"))
fit(wf, mtcars) |> extract_fit_parsnip() |> tidy()
Exercise 25: Save model object
Difficulty: Intermediate.
Show solution
wf <- workflow() |> add_formula(mpg ~ .) |> add_model(linear_reg() |> set_engine("lm"))
saveRDS(fit(wf, mtcars), "wf.rds")
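The saved object can be reloaded in a fresh session and used for prediction directly. A sketch:

```r
library(tidymodels)

wf <- workflow() |> add_formula(mpg ~ .) |>
  add_model(linear_reg() |> set_engine("lm"))
saveRDS(fit(wf, mtcars), "wf.rds")
# In a later session: reload and predict as usual
reloaded <- readRDS("wf.rds")
predict(reloaded, new_data = mtcars[1:3, ])
```

Note that for engines whose fitted objects hold external pointers (e.g. xgboost), plain `saveRDS()` may not round-trip cleanly; the bundle package is the recommended route there.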
What to do next
- Machine-Learning-Exercises: broader machine learning practice.
- Cross-Validation-Exercises: a deeper dive into resampling and cross-validation.