Survey Analysis in R Exercises: 15 Practice Problems

Fifteen practice problems on survey data analysis in R, covering design objects, weights, strata, replicate weights, and regression. Each exercise comes with a solution.

Run this once before any exercise:
library(survey)
library(dplyr)

  

Exercise 1: Build svydesign

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
des

  

Exercise 2: Weighted mean

Difficulty: Beginner.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svymean(~api00, des)

  

Exercise 3: Weighted total

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svytotal(~enroll, des)

  

Exercise 4: Stratum-specific means

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svyby(~api00, ~stype, des, svymean)

  

Exercise 5: Weighted proportion

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svymean(~factor(awards), des)

  

Exercise 6: Confidence intervals

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
confint(svymean(~api00, des))
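To see what confint() is doing here, the interval can be rebuilt by hand from the estimate and its standard error. This is a minimal sketch that assumes the default settings (95% level, normal quantiles); the variable names m, est, se, and manual_ci are illustrative only.

```r
library(survey)

data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)

# Weighted mean and its design-based standard error
m   <- svymean(~api00, des)
est <- as.numeric(coef(m))
se  <- as.numeric(SE(m))

# Normal-approximation 95% interval: estimate +/- z * SE
manual_ci <- est + c(-1, 1) * qnorm(0.975) * se

manual_ci
confint(m)
```

Both lines should print the same bounds. Passing a finite df to confint() would switch to t quantiles and give a slightly wider interval.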

  

Exercise 7: Weighted regression

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svyglm(api00 ~ meals, des) |> summary()

  

Exercise 8: Weighted chi-square

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svychisq(~awards + stype, des)

  

Exercise 9: Two-stage cluster

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~dnum + snum, weights = ~pw, data = apiclus2)
svymean(~api00, des)

  

Exercise 10: Replicate weights

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
set.seed(42)  # bootstrap replicates are drawn at random; fix the seed for reproducibility
rep_des <- as.svrepdesign(des, type = "bootstrap", replicates = 100)
svymean(~api00, rep_des)

  

Exercise 11: Quantiles

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svyquantile(~api00, des, c(0.25, 0.5, 0.75))

  

Exercise 12: Cross-tab with row %

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
svytable(~stype + awards, des, Ntotal = 100) |> prop.table(1)

  

Exercise 13: Subset of design

Difficulty: Intermediate.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
es <- subset(des, stype == "E")
svymean(~api00, es)
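As a sanity check, the subset estimate can be compared with the stratum-specific mean from svyby(). The sketch below assumes the same apistrat design as the solution; the names by_type, mean_subset, and mean_by are illustrative.

```r
library(survey)

data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)

# Domain estimate via subset() on the design object
es <- subset(des, stype == "E")
mean_subset <- as.numeric(coef(svymean(~api00, es)))

# Same domain, estimated for every school type at once
by_type <- svyby(~api00, ~stype, des, svymean)
mean_by <- by_type$api00[by_type$stype == "E"]

c(subset = mean_subset, svyby = mean_by)
```

The two point estimates should agree. Subsetting the design object, rather than filtering the data frame before calling svydesign(), keeps the design information that the variance calculation relies on.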

  

Exercise 14: Effective sample size

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
# Design effect = var(svymean) / var(srs)
m <- svymean(~api00, des, deff = TRUE)
deff(m)
# Effective n = n / deff
nrow(apistrat) / deff(m)

  

Exercise 15: Post-stratification

Difficulty: Advanced.

Solution:
data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
pop <- data.frame(stype = c("E", "H", "M"), Freq = c(4421, 755, 1018))
post_des <- postStratify(des, ~stype, pop)
svymean(~api00, post_des)
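One way to confirm that post-stratification worked is to check that the adjusted weights reproduce the population counts supplied in pop. The sketch below reuses the solution's objects; totals is an illustrative name.

```r
library(survey)

data(api, package = "survey")
des <- svydesign(id = ~1, weights = ~pw, data = apistrat, strata = ~stype)
pop <- data.frame(stype = c("E", "H", "M"), Freq = c(4421, 755, 1018))
post_des <- postStratify(des, ~stype, pop)

# Estimated number of schools per type under the post-stratified weights
totals <- svytotal(~stype, post_des)
totals

# The weights as a whole should sum to the population size
sum(weights(post_des))
sum(pop$Freq)
```

The estimated totals should match the Freq column of pop, and the two sums should be equal.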

  

What to do next

  • Hypothesis Testing Exercises: broader inference.
  • R for Biostatistics Exercises: applied stats.