Sampling Methods Exercises in R: 20 Practice Problems
Twenty practice problems on sampling in R: simple random, stratified, cluster, bootstrap, jackknife, permutation. Solutions hidden.
By Selva Prabhakaran · Published May 11, 2026 · Last updated May 11, 2026
library(dplyr)
library(caret)
Exercise 1: SRS without replacement
Difficulty: Beginner.
Show solution
set.seed(1); sample(1:100, 10, replace = FALSE)
Exercise 2: SRS with replacement
Difficulty: Beginner.
Show solution
set.seed(1); sample(1:100, 10, replace = TRUE)
Exercise 3: Sample rows of a data frame
Difficulty: Beginner.
Show solution
set.seed(1); mtcars |> slice_sample(n = 5)
Exercise 4: Sample proportion
Difficulty: Beginner.
Show solution
set.seed(1); mtcars |> slice_sample(prop = 0.2)
Exercise 5: Stratified sample per group
Difficulty: Intermediate.
Show solution
set.seed(1); iris |> slice_sample(n = 5, by = Species)  # the `by` argument needs dplyr >= 1.1.0
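To make the mechanics of stratified sampling explicit, here is a minimal base R sketch of the same idea: split the row indices by stratum, draw a simple random sample inside each stratum, then recombine. The names `rows_by_species`, `idx`, and `strat` are illustrative and not part of the original solution.

```r
set.seed(1)

# Row indices of iris, grouped by stratum (Species)
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)

# Draw 5 row indices without replacement inside each stratum
idx <- unlist(lapply(rows_by_species, function(rows) sample(rows, 5)))

strat <- iris[idx, ]
table(strat$Species)  # 5 rows per species
```

Each stratum contributes the same number of rows regardless of how the rows are ordered in the data frame, which is the property that `slice_sample(..., by = Species)` provides in one call.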
Exercise 6: Stratified prop sample
Difficulty: Intermediate.
Show solution
set.seed(1); iris |> slice_sample(prop = 0.2, by = Species)
Exercise 7: Weighted sample
Difficulty: Advanced. Weight by hp.
Show solution
set.seed(1); mtcars |> slice_sample(n = 10, weight_by = hp)
Exercise 8: createDataPartition (caret)
Difficulty: Intermediate.
Show solution
set.seed(1)
idx <- caret::createDataPartition(iris$Species, p = 0.7, list = FALSE)
length(idx)
Exercise 9: Bootstrap CI for the mean
Difficulty: Intermediate.
Show solution
set.seed(1)
m <- replicate(2000, mean(sample(mtcars$mpg, replace = TRUE)))
quantile(m, c(0.025, 0.975))
Exercise 10: Bootstrap CI for the median
Difficulty: Intermediate.
Show solution
set.seed(1)
m <- replicate(2000, median(sample(mtcars$mpg, replace = TRUE)))
quantile(m, c(0.025, 0.975))
Exercise 11: Bootstrap with boot package
Difficulty: Advanced.
Show solution
library(boot)
set.seed(1)
b <- boot(mtcars$mpg, function(d, i) mean(d[i]), R = 1000)
boot.ci(b, type = "bca")
Exercise 12: Jackknife
Difficulty: Advanced.
Show solution
n <- nrow(mtcars)
jack <- sapply(1:n, function(i) mean(mtcars$mpg[-i]))
mean(jack) # jackknife estimate
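For the mean, the average of the leave-one-out estimates simply reproduces the sample mean, so the jackknife is usually more useful for its bias and standard error estimates. The sketch below applies the standard jackknife formulas; the names `theta_hat`, `jack_bias`, and `jack_se` are illustrative and not from the original solution.

```r
x <- mtcars$mpg
n <- length(x)

# Leave-one-out estimates of the mean
jack <- sapply(seq_len(n), function(i) mean(x[-i]))

theta_hat <- mean(x)                              # full-sample estimate
jack_bias <- (n - 1) * (mean(jack) - theta_hat)   # jackknife bias estimate
jack_se   <- sqrt((n - 1) / n * sum((jack - mean(jack))^2))  # jackknife SE

c(estimate = theta_hat, bias = jack_bias, se = jack_se)
```

For the sample mean, the bias estimate is zero up to floating-point error and the jackknife standard error coincides with `sd(x) / sqrt(n)`, which makes this a convenient sanity check before applying the same code to a less tractable statistic.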
Exercise 13: Permutation test for two means
Difficulty: Advanced.
Show solution
set.seed(1)
obs <- diff(by(mtcars$mpg, mtcars$am, mean))
perms <- replicate(2000, {
am <- sample(mtcars$am)
diff(by(mtcars$mpg, am, mean))
})
mean(abs(perms) >= abs(obs))
Exercise 14: Resampling for SE
Difficulty: Intermediate.
Show solution
set.seed(1)
m <- replicate(1000, mean(sample(mtcars$mpg, replace = TRUE)))
sd(m)
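One way to check a bootstrap standard error is to compare it with the textbook formula for the standard error of the mean. This is a small sketch of that comparison; `boot_se` and `formula_se` are illustrative names, and the two values should be close but not identical because the bootstrap value carries Monte Carlo noise.

```r
set.seed(1)
x <- mtcars$mpg

# Bootstrap standard error: SD of the resampled means
boot_se <- sd(replicate(1000, mean(sample(x, replace = TRUE))))

# Analytical standard error of the mean, for comparison
formula_se <- sd(x) / sqrt(length(x))

c(bootstrap = boot_se, formula = formula_se)
```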
Exercise 15: Cluster sampling demo
Difficulty: Advanced.
Show solution
set.seed(1)
clusters <- unique(mtcars$cyl)
chosen <- sample(clusters, 2)
mtcars |> filter(cyl %in% chosen)
Exercise 16: K-fold split indices
Difficulty: Intermediate.
Show solution
set.seed(1)
folds <- sample(rep(1:5, length.out = nrow(mtcars)))
table(folds)
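The fold labels become useful once they drive a train/test loop. The following sketch shows one way to use them, assuming a simple linear model `mpg ~ wt` chosen purely for illustration; the model and the `rmse` name are not part of the original exercise.

```r
set.seed(1)
folds <- sample(rep(1:5, length.out = nrow(mtcars)))

# For each fold: fit on the other four folds, score on the held-out fold
rmse <- sapply(1:5, function(k) {
  train <- mtcars[folds != k, ]
  test  <- mtcars[folds == k, ]
  fit   <- lm(mpg ~ wt, data = train)  # illustrative model choice
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})

rmse        # one RMSE per fold
mean(rmse)  # cross-validated RMSE
```

Every row is held out exactly once, because each row carries exactly one fold label.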
Exercise 17: Train-test 70/30
Difficulty: Beginner.
Show solution
set.seed(1)
idx <- sample(seq_len(nrow(mtcars)), floor(0.7 * nrow(mtcars)))  # 0.7 * 32 = 22.4, so 22 training rows
list(train = nrow(mtcars[idx,]), test = nrow(mtcars[-idx,]))
Exercise 18: Repeated bootstrap
Difficulty: Advanced.
Show solution
set.seed(1)
results <- sapply(1:5, function(seed) {
set.seed(seed)
mean(replicate(500, mean(sample(mtcars$mpg, replace = TRUE))))
})
results
Exercise 19: Systematic sample
Difficulty: Advanced.
Show solution
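set.seed(1)
k <- floor(nrow(mtcars) / 5)  # sampling interval: every 6th row
start <- sample(k, 1)  # random start within the first interval
mtcars[seq(start, by = k, length.out = 5), ]  # exactly 5 rows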
Exercise 20: Reservoir sampling concept
Difficulty: Advanced.
Show solution
set.seed(1)
# Simple equivalent when all items are available at once: a fixed-size random sample
reservoir <- sample(1:100, 5)
reservoir
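When the stream cannot be held in memory or its length is unknown in advance, the classic approach is Algorithm R: keep the first k items, then let item i replace a random slot with probability k / i. The sketch below simulates a stream with an ordinary vector; `reservoir_sample` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Algorithm R: uniform random sample of size k from a stream, one pass
reservoir_sample <- function(stream, k) {
  stopifnot(length(stream) >= k)
  reservoir <- stream[seq_len(k)]            # fill with the first k items
  for (i in seq_len(length(stream) - k) + k) {
    j <- sample.int(i, 1)                    # uniform draw from 1..i
    if (j <= k) reservoir[j] <- stream[i]    # keep item i with probability k / i
  }
  reservoir
}

set.seed(1)
res <- reservoir_sample(1:100, 5)
res
```

Only the k-item reservoir and the running index are stored, so memory use does not grow with the length of the stream.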
What to do next
- Cross-Validation-Exercises (coming): cross-validation builds on the sampling techniques practiced here.
- Hypothesis-Testing-Exercises (shipped): permutation and bootstrap tests.