R for Marketing Analytics Exercises: 20 Practice Problems

Twenty practice problems for marketing analytics in R: RFM, customer segmentation, churn, attribution, lift, conversion analysis. Hidden solutions.

Run this once before any exercise
library(dplyr)
library(tibble)
library(tidyr)
library(forecast)
library(lubridate)

  

Exercise 1: Compute RFM

Difficulty: Advanced.

Show solution
txns <- tibble(
  user = c("a", "a", "b", "b", "c"),
  date = as.Date(c("2024-01-05", "2024-03-10", "2024-02-20", "2024-04-15", "2024-01-15")),
  amount = c(50, 80, 100, 90, 30)
)
ref <- as.Date("2024-05-01")  # reference date for recency
# R = days since last purchase, F = number of purchases, M = total spend
txns |>
  group_by(user) |>
  summarise(R = as.integer(ref - max(date)), F = n(), M = sum(amount), .groups = "drop")

  

Exercise 2: RFM score (terciles)

Difficulty: Advanced.

Show solution
df <- tibble(
  user = letters[1:6],
  R = c(10, 30, 5, 50, 20, 80),
  F = c(5, 3, 8, 1, 4, 2),
  M = c(200, 150, 300, 100, 180, 90)
)
# Recency is negated so that more recent customers get the higher score
df |>
  mutate(
    R_s = ntile(-R, 3),
    F_s = ntile(F, 3),
    M_s = ntile(M, 3),
    RFM = paste0(R_s, F_s, M_s)
  )

  

Exercise 3: Conversion rate

Difficulty: Beginner.

Show solution
visitors <- 1000
conversions <- 35
conversions / visitors

  

Exercise 4: Average order value

Difficulty: Beginner.

Show solution
orders <- c(50, 80, 30, 120, 65)
mean(orders)

  

Exercise 5: Customer lifetime value (simple)

Difficulty: Intermediate. AOV × orders/year × years.

Show solution
aov <- 80
freq <- 4
lifespan <- 3
aov * freq * lifespan
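The same AOV × orders/year × years product can be applied to a whole table of customers at once. The sketch below is only an illustration: the `customers` table and its values are hypothetical and not part of the exercise.

```r
library(dplyr)
library(tibble)

# Hypothetical customers; the first row reuses the values from the solution above
customers <- tibble(
  customer = c("a", "b"),
  aov      = c(80, 50),   # average order value
  freq     = c(4, 10),    # orders per year
  lifespan = c(3, 2)      # expected years as a customer
)

# Simple CLV: AOV x orders per year x years, computed row by row
customers <- customers |>
  mutate(clv = aov * freq * lifespan)

customers
```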

  

Exercise 6: Churn rate

Difficulty: Beginner.

Show solution
churned <- 50
active_start <- 500
churned / active_start

  

Exercise 7: Cohort retention table

Difficulty: Advanced.

Show solution
events <- tibble(user = c(1, 1, 2, 2, 3, 3), month = c(1, 2, 1, 3, 1, 2))
# Each user's cohort is the first month in which they appear
first <- events |>
  group_by(user) |>
  summarise(cohort = min(month), .groups = "drop")
# Count events per cohort and month, then spread months into columns
events |>
  inner_join(first, by = "user") |>
  count(cohort, month) |>
  pivot_wider(names_from = month, values_from = n, values_fill = 0)
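Retention tables are often read as rates rather than raw counts. A minimal sketch of that variation, assuming the same `events` data as above: each month's distinct active users are divided by the cohort's size in its first month. The `m` column prefix and the `retention` name are choices made for this example.

```r
library(dplyr)
library(tidyr)
library(tibble)

events <- tibble(user = c(1, 1, 2, 2, 3, 3), month = c(1, 2, 1, 3, 1, 2))

# Cohort = first month in which each user appears
first <- events |>
  group_by(user) |>
  summarise(cohort = min(month), .groups = "drop")

retention <- events |>
  inner_join(first, by = "user") |>
  distinct(user, cohort, month) |>                     # one row per user per month
  count(cohort, month, name = "active") |>
  group_by(cohort) |>
  mutate(rate = active / active[month == cohort]) |>   # share of the cohort's first-month users
  ungroup() |>
  select(-active) |>
  pivot_wider(names_from = month, values_from = rate,
              values_fill = 0, names_prefix = "m")

retention
```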

  

Exercise 8: A/B test conversion (prop.test)

Difficulty: Intermediate.

Show solution
prop.test(c(120, 100), c(2000, 2000))
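The printed test output can also be pulled apart programmatically. The sketch below reuses the counts from the solution and reads the standard components of the object returned by `prop.test()`; the `res`, `rates`, and `lift` names are illustrative.

```r
# Same counts as the solution above: 120/2000 vs 100/2000 conversions
res <- prop.test(c(120, 100), c(2000, 2000))

rates <- unname(res$estimate)                 # observed conversion rate of each group
lift  <- (rates[1] - rates[2]) / rates[2]     # relative lift of group 1 over group 2

res$p.value    # two-sided p-value (continuity correction is on by default)
res$conf.int   # 95% confidence interval for the difference in proportions
lift
```

With these counts the p-value comes out above 0.05 and the confidence interval spans zero, so the observed lift is not statistically distinguishable from no difference at that level.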

  

Exercise 9: Lift calculation

Difficulty: Intermediate.

Show solution
ctr_a <- 0.06
ctr_b <- 0.05
(ctr_a - ctr_b) / ctr_b

  

Exercise 10: First-touch attribution

Difficulty: Advanced.

Show solution
touches <- tibble(
  user = c(1, 1, 1, 2, 2, 3),
  channel = c("ad", "email", "direct", "ad", "direct", "email"),
  rank = c(1, 2, 3, 1, 2, 1)
)
touches |>
  filter(rank == 1) |>
  count(channel, name = "first_touch")

  

Exercise 11: Last-touch attribution

Difficulty: Advanced.

Show solution
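touches <- tibble(
  user = c(1, 1, 1, 2, 2, 3),
  channel = c("ad", "email", "direct", "ad", "direct", "email")
)
# Rows are assumed to be in touch order within each user,
# so the last row per user is the last touch
touches |>
  group_by(user) |>
  slice_tail(n = 1) |>
  ungroup() |>
  count(channel)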

  

Exercise 12: Customer segmentation with k-means

Difficulty: Advanced.

Show solution
set.seed(1)
df <- tibble(
  spend = runif(100, 10, 500),
  freq = sample(1:20, 100, replace = TRUE)
)
km <- kmeans(scale(df), 3)
df$segment <- km$cluster
head(df)
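Cluster labels are easier to interpret once each segment is profiled. A minimal sketch, assuming the same simulated data as above; `nstart = 25` is an addition made here to reduce sensitivity to the random starting centers, and the `profile` name is illustrative.

```r
library(dplyr)
library(tibble)

set.seed(1)
df <- tibble(
  spend = runif(100, 10, 500),
  freq  = sample(1:20, 100, replace = TRUE)
)

# Scale both features so neither dominates the distance; nstart = 25 is an
# addition to the original solution (several random starts, best one kept)
km <- kmeans(scale(df), centers = 3, nstart = 25)

# Size and average behaviour of each segment, on the original (unscaled) units
profile <- df |>
  mutate(segment = km$cluster) |>
  group_by(segment) |>
  summarise(n = n(), mean_spend = mean(spend), mean_freq = mean(freq),
            .groups = "drop")

profile
```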

  

Exercise 13: Daily active users

Difficulty: Beginner.

Show solution
events <- tibble(
  user = c(1, 2, 3, 1, 2),
  date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02", "2024-01-03"))
)
events |>
  group_by(date) |>
  summarise(dau = n_distinct(user))

  

Exercise 14: Weekly active users

Difficulty: Intermediate.

Show solution
events <- tibble(
  user = c(1, 2, 1, 3, 2),
  date = as.Date(c("2024-01-01", "2024-01-02", "2024-01-08", "2024-01-10", "2024-01-15"))
)
# floor_date(date, "week") starts weeks on Sunday unless week_start is set
events |>
  mutate(week = lubridate::floor_date(date, "week")) |>
  group_by(week) |>
  summarise(wau = n_distinct(user))

  

Exercise 15: Funnel conversion

Difficulty: Advanced.

Show solution
funnel <- tibble(
  stage = c("view", "click", "add_to_cart", "purchase"),
  users = c(10000, 3000, 1200, 450)
)
# conv is the step-to-step rate; the first stage has no previous stage, so it is NA
funnel |>
  mutate(conv = users / lag(users))
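Step-to-step conversion is one view of a funnel; conversion from the top of the funnel and per-step drop-off are common companions. A small sketch using the same numbers, where the column names `step_conv`, `overall_conv`, and `drop_off` are choices made for this example:

```r
library(dplyr)
library(tibble)

funnel <- tibble(
  stage = c("view", "click", "add_to_cart", "purchase"),
  users = c(10000, 3000, 1200, 450)
)

funnel_rates <- funnel |>
  mutate(
    step_conv    = users / lag(users),    # share of the previous stage that continues
    overall_conv = users / first(users),  # share of the first stage that reaches this stage
    drop_off     = 1 - step_conv          # share lost between consecutive stages
  )

funnel_rates
```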

  

Exercise 16: Compare two campaigns by t-test

Difficulty: Intermediate.

Show solution
camp_a <- c(120, 110, 130, 105, 95)
camp_b <- c(90, 85, 100, 92, 88)
t.test(camp_a, camp_b)

  

Exercise 17: Forecast monthly sales

Difficulty: Advanced.

Show solution
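# Ten monthly observations; frequency = 12 marks the series as monthly
sales <- ts(c(100, 110, 115, 120, 125, 130, 140, 135, 145, 150), frequency = 12)
# Fit an automatically selected ARIMA model and forecast six months ahead
forecast::auto.arima(sales) |>
  forecast::forecast(h = 6)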

  

Exercise 18: Geo aggregation

Difficulty: Intermediate.

Show solution
sales <- tibble(
  region = c("US", "EU", "ASIA", "US", "EU"),
  revenue = c(100, 80, 60, 110, 90)
)
sales |>
  group_by(region) |>
  summarise(rev = sum(revenue)) |>
  arrange(desc(rev))

  

Exercise 19: Email open rate

Difficulty: Beginner.

Show solution
sent <- 10000
opened <- 2200
opened / sent

  

Exercise 20: ROI per channel

Difficulty: Intermediate.

Show solution
channels <- tibble(
  channel = c("ad", "email", "social"),
  spend = c(5000, 1000, 2000),
  rev = c(15000, 4000, 5000)
)
channels |>
  mutate(roi = (rev - spend) / spend)

  

What to do next

  • A-B-Testing-Exercises (coming), experiment design drills.
  • EDA-Exercises (shipped), pre-modeling exploration.