Random Forest Exercises in R: 20 Practice Problems
Twenty practice problems on Random Forest in R: classification, regression, tuning, variable importance, ranger. Hidden solutions.
By Selva Prabhakaran · Published May 11, 2026 · Last updated May 11, 2026
library(randomForest)
library(ranger)
library(caret)
library(dplyr)
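The solutions below assume all four packages are already installed. If any are missing, a one-time setup sketch:

```r
# One-time setup: install whichever of the four packages are missing
pkgs <- c("randomForest", "ranger", "caret", "dplyr")
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) install.packages(missing)
```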
Exercise 1: Classification RF on iris
Difficulty: Beginner. Fit a classification random forest that predicts Species from all other columns of iris.
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris)
Exercise 2: Regression RF on mtcars
Difficulty: Beginner. Fit a regression random forest that predicts mpg from all other columns of mtcars.
Solution:
set.seed(1)
randomForest(mpg ~ ., data = mtcars)
Exercise 3: Specify number of trees
Difficulty: Beginner. Fit a random forest with 100 trees instead of the default 500.
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris, ntree = 100)
Exercise 4: Specify mtry
Difficulty: Intermediate. Fit a random forest that considers 2 candidate variables at each split (mtry = 2).
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris, mtry = 2)
Exercise 5: OOB error
Difficulty: Intermediate. Extract the final out-of-bag (OOB) error rate from a fitted model.
Solution:
set.seed(1)
fit <- randomForest(Species ~ ., data = iris)
fit$err.rate[nrow(fit$err.rate), "OOB"]  # overall OOB error after the final tree
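The err.rate matrix has one row per tree, so it also shows how the OOB error stabilizes as trees are added. A self-contained sketch that plots the curve:

```r
library(randomForest)
set.seed(1)
fit <- randomForest(Species ~ ., data = iris)
# Column "OOB" holds the overall OOB error after 1, 2, ..., ntree trees
plot(fit$err.rate[, "OOB"], type = "l",
     xlab = "Number of trees", ylab = "OOB error rate")
```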
Exercise 6: Variable importance
Difficulty: Intermediate. Compute both importance measures (mean decrease in accuracy and in Gini) from a fitted forest.
Solution:
set.seed(1)
fit <- randomForest(Species ~ ., data = iris, importance = TRUE)
importance(fit)
Exercise 7: Variable importance plot
Difficulty: Intermediate. Plot variable importance for a fitted forest.
Solution:
set.seed(1)
fit <- randomForest(Species ~ ., data = iris)
varImpPlot(fit)
Exercise 8: Predict probabilities
Difficulty: Intermediate. Return class probabilities instead of predicted class labels.
Solution:
set.seed(1)
fit <- randomForest(Species ~ ., data = iris)
head(predict(fit, iris, type = "prob"))
Exercise 9: Confusion matrix
Difficulty: Intermediate. Inspect the OOB confusion matrix stored on the fitted model.
Solution:
set.seed(1)
fit <- randomForest(Species ~ ., data = iris)
fit$confusion
Exercise 10: ranger basic
Difficulty: Intermediate. Fit the same classification forest with ranger, a faster implementation of random forests.
Solution:
set.seed(1)
ranger(Species ~ ., data = iris)
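The speed claim can be checked with system.time. A quick sketch (absolute timings vary by machine, and iris is small, so the gap is modest here):

```r
library(randomForest)
library(ranger)
set.seed(1)
# Elapsed wall-clock seconds for 500 trees with each implementation
t_rf  <- system.time(randomForest(Species ~ ., data = iris, ntree = 500))
t_rgr <- system.time(ranger(Species ~ ., data = iris, num.trees = 500))
c(randomForest = unname(t_rf["elapsed"]), ranger = unname(t_rgr["elapsed"]))
```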
Exercise 11: ranger importance
Difficulty: Intermediate. Fit a ranger model with permutation importance and extract the importance scores.
Solution:
set.seed(1)
fit <- ranger(Species ~ ., data = iris, importance = "permutation")
importance(fit)
Exercise 12: Tune mtry with caret
Difficulty: Advanced. Use caret to tune mtry over the values 1 through 4 with 5-fold cross-validation.
Solution:
set.seed(1)
train(Species ~ ., data = iris, method = "rf",
      tuneGrid = expand.grid(mtry = c(1, 2, 3, 4)),
      trControl = trainControl(method = "cv", number = 5))
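After tuning, the caret object stores the winning mtry and the per-candidate resampling results. A sketch of how to inspect them (the object name tuned is my own):

```r
library(caret)
set.seed(1)
tuned <- train(Species ~ ., data = iris, method = "rf",
               tuneGrid = expand.grid(mtry = c(1, 2, 3, 4)),
               trControl = trainControl(method = "cv", number = 5))
tuned$bestTune   # mtry value chosen by cross-validated accuracy
tuned$results    # accuracy and kappa for every candidate mtry
```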
Exercise 13: Cross-validated RF RMSE
Difficulty: Advanced. Estimate cross-validated RMSE for a regression forest on mtcars using caret with 5 folds.
Solution:
set.seed(1)
train(mpg ~ ., data = mtcars, method = "rf",
      trControl = trainControl(method = "cv", number = 5))
Exercise 14: Train-test split + RMSE
Difficulty: Intermediate. Split mtcars into train and test sets, fit on the training set, and compute test-set RMSE.
Solution:
set.seed(1)
idx <- sample(seq_len(nrow(mtcars)), 22)
tr <- mtcars[idx, ]; te <- mtcars[-idx, ]
fit <- randomForest(mpg ~ ., data = tr)
sqrt(mean((te$mpg - predict(fit, te))^2))
Exercise 15: Class weights for imbalance
Difficulty: Advanced. Use classwt to give the virginica class twice the weight of the other classes.
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris, classwt = c(1, 1, 2))
# classwt follows the factor-level order: setosa, versicolor, virginica
Exercise 16: Stratified sampling
Difficulty: Advanced. Use strata and sampsize to draw a stratified sample of 20 observations per species for each tree.
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris, sampsize = c(20, 20, 20),
             strata = iris$Species)  # 20 cases drawn per species for each tree
Exercise 17: Partial dependence
Difficulty: Advanced. Plot the partial dependence of predicted mpg on the wt variable.
Solution:
set.seed(1)
fit <- randomForest(mpg ~ ., data = mtcars)
partialPlot(fit, mtcars, "wt")
Exercise 18: Predict on new data
Difficulty: Beginner. Use a fitted forest to predict on new rows of data.
Solution:
set.seed(1)
fit <- randomForest(mpg ~ ., data = mtcars)
predict(fit, mtcars[1:3, ])
Exercise 19: nodesize parameter
Difficulty: Intermediate. Raise the minimum terminal node size to 5 to grow shallower trees.
Solution:
set.seed(1)
randomForest(Species ~ ., data = iris, nodesize = 5)
Exercise 20: Compare to logistic regression
Difficulty: Advanced. Compare the accuracy of a random forest and a logistic regression on a binary version of iris.
Solution:
set.seed(1)
binary <- iris |> dplyr::mutate(y = as.integer(Species == "virginica"))
fit_glm <- glm(y ~ Sepal.Length + Petal.Length, data = binary, family = binomial)
fit_rf  <- randomForest(factor(y) ~ Sepal.Length + Petal.Length, data = binary)
# Note: the glm accuracy is in-sample, while predict(fit_rf) returns OOB predictions
list(glm_acc = mean(round(predict(fit_glm, type = "response")) == binary$y),
     rf_acc  = mean(predict(fit_rf) == factor(binary$y)))
What to do next
- XGBoost-Exercises (coming soon): a gradient-boosting alternative to random forests.
- Machine-Learning-Exercises (available now): broader machine learning drills.