Practice writing R functions with 10 exercises: simple calculators, input validation, default arguments, returning multiple values, and vectorized functions. Each problem has interactive code and a solution.
These exercises build your function-writing skills progressively. Easy exercises (1-4) cover basic function structure. Medium (5-7) add validation and multiple returns. Hard (8-10) require design thinking.
Easy (1-4): Basic Functions
Exercise 1: Unit Converter
Write a function cm_to_inches(cm) that converts centimeters to inches (1 inch = 2.54 cm). Test it on single values and a vector.
# Exercise 1: cm to inches converter
cm_to_inches <- function(cm) {
  cm / 2.54
}
cat("10 cm =", round(cm_to_inches(10), 2), "inches\n")
cat("100 cm =", round(cm_to_inches(100), 2), "inches\n")
# Works on vectors automatically!
heights_cm <- c(160, 170, 175, 180, 185)
cat("Heights:", round(cm_to_inches(heights_cm), 1), "inches\n")
Key concept: Because R math is vectorized, the function works on both single values and vectors — no extra code needed.
Exercise 2: Greeting with Defaults
Write a function greet(name, greeting = "Hello", punctuation = "!") that returns a formatted greeting string. Test with and without default arguments.
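One possible solution is sketched below. The exercise does not fix the exact output layout, so the "Hello, Ada!" format (greeting, comma, name, punctuation) is an assumption made for illustration.

```r
# Sketch of greet(); the "<greeting>, <name><punctuation>" layout is an assumption
greet <- function(name, greeting = "Hello", punctuation = "!") {
  paste0(greeting, ", ", name, punctuation)
}

greet("Ada")                                 # uses both defaults
greet("Ada", greeting = "Welcome")           # overrides one default by name
greet("Ada", "Goodbye", punctuation = ".")   # overrides both defaults
greet(c("Ada", "Grace"))                     # paste0() is vectorized over name
```

Arguments with defaults can be skipped, matched by position, or matched by name, which is why the second and third calls can mix styles.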
Write safe_divide(a, b) that validates that both inputs are numeric, returns NA (not Inf) when b is 0, and returns a named list with the elements result, valid, and message.
safe_divide <- function(a, b) {
  # Validate types
  if (!is.numeric(a) || !is.numeric(b)) {
    return(list(result = NA, valid = FALSE, message = "Inputs must be numeric"))
  }
  # Check for zero
  if (b == 0) {
    return(list(result = NA, valid = FALSE, message = "Division by zero"))
  }
  list(result = a / b, valid = TRUE, message = "OK")
}
# Test cases
tests <- list(
  list(a = 10, b = 3),
  list(a = 10, b = 0),
  list(a = "ten", b = 5),
  list(a = 100, b = -4)
)
for (t in tests) {
  r <- safe_divide(t$a, t$b)
  cat(sprintf("safe_divide(%s, %s) -> %s (%s)\n",
              as.character(t$a), as.character(t$b),
              if (is.na(r$result)) "NA" else round(r$result, 2), r$message))
}
Key concept: Return a list for multiple outputs. Use early return() for validation failures — it keeps the main logic clean.
Exercise 6: Flexible Summary Function
Write column_summary(df, func = mean, ...) that applies a function to all numeric columns of a data frame and passes any extra arguments through to that function via ... (dot-dot-dot).
# Exercise 6: Apply a function to all numeric columns
# column_summary(mtcars, mean) -> means of all columns
# column_summary(mtcars, quantile, probs = 0.75) -> 75th percentile of all columns
column_summary <- function(df, func = mean, ...) {
  # Select only numeric columns
  numeric_cols <- sapply(df, is.numeric)
  # drop = FALSE keeps a data frame even when only one column is numeric
  df_numeric <- df[, numeric_cols, drop = FALSE]
  # Apply the function to each column
  result <- sapply(df_numeric, func, ...)
  return(round(result, 2))
}
# Test with different functions
cat("Means:\n")
print(column_summary(mtcars[, 1:5], mean))
cat("\nMedians:\n")
print(column_summary(mtcars[, 1:5], median))
cat("\nStandard deviations:\n")
print(column_summary(mtcars[, 1:5], sd))
cat("\nTrimmed means (10%):\n")
print(column_summary(mtcars[, 1:5], mean, trim = 0.1))
Key concept: ... (dot-dot-dot) passes extra arguments through to the function being called. This makes your function flexible without listing every possible argument.
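To see the forwarding in isolation, here is a minimal sketch with a hypothetical wrapper, scaled_mean(), that is not part of the exercise. It has one argument of its own and hands everything else to mean().

```r
# Hypothetical wrapper: scales x, then forwards any extra arguments to mean()
scaled_mean <- function(x, scale = 1, ...) {
  mean(x * scale, ...)
}

values <- c(1, 2, 3, NA, 100)

scaled_mean(values)                 # NA, because mean() sees the missing value
scaled_mean(values, na.rm = TRUE)   # na.rm travels through ... to mean()
scaled_mean(values, scale = 2, na.rm = TRUE, trim = 0.25)  # several extras at once
```

The wrapper never names na.rm or trim, yet both reach mean(). The trade-off is that a misspelled argument name is also forwarded silently, so it may be ignored rather than reported.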
Exercise 7: Memoized Function
Write a function that computes the nth Fibonacci number using memoization (caching previous results to avoid recomputation).
# Exercise 7: Fibonacci with memoization
# Naive recursion is O(2^n) -- extremely slow for large n
# Memoization makes it O(n)
# Create a memoized Fibonacci using an environment as cache
make_fibonacci <- function() {
  cache <- new.env(parent = emptyenv())
  fib <- function(n) {
    key <- as.character(n)
    if (exists(key, envir = cache)) {
      return(get(key, envir = cache))
    }
    result <- if (n <= 1) n else fib(n - 1) + fib(n - 2)
    assign(key, result, envir = cache)
    return(result)
  }
  return(fib)
}
fibonacci <- make_fibonacci()
# Test -- fast even for large n
cat("fib(10):", fibonacci(10), "\n")
cat("fib(20):", fibonacci(20), "\n")
cat("fib(30):", fibonacci(30), "\n")
# First 15 Fibonacci numbers
fibs <- sapply(1:15, fibonacci)
cat("Sequence:", fibs, "\n")
Key concept: This uses a closure — a function that carries its own environment (cache). The environment persists between calls, storing previously computed values. This is a function factory pattern.
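The effect of the cache can be made visible by counting function calls rather than timing them. The sketch below is illustrative only: naive_fib(), make_counting_fib(), and the counters are hypothetical names added for this comparison, and the memoized version mirrors the solution above with a call counter added.

```r
# Count calls made by naive recursion (the counter lives next to the function)
naive_calls <- 0
naive_fib <- function(n) {
  naive_calls <<- naive_calls + 1
  if (n <= 1) n else naive_fib(n - 1) + naive_fib(n - 2)
}

# Same memoization idea as the solution, plus a call counter in the closure
make_counting_fib <- function() {
  cache <- new.env(parent = emptyenv())
  calls <- 0
  fib <- function(n) {
    calls <<- calls + 1
    key <- as.character(n)
    if (exists(key, envir = cache, inherits = FALSE)) {
      return(get(key, envir = cache))
    }
    result <- if (n <= 1) n else fib(n - 1) + fib(n - 2)
    assign(key, result, envir = cache)
    result
  }
  # Return the function together with an accessor for its private counter
  list(fib = fib, calls = function() calls)
}

counted <- make_counting_fib()
naive_result <- naive_fib(20)
memo_result <- counted$fib(20)

cat("Same answer:", naive_result == memo_result, "\n")
cat("Naive calls:", naive_calls, "\n")
cat("Memoized calls:", counted$calls(), "\n")
```

Both versions return the same value, but the naive version makes thousands of calls for n = 20 while the memoized one makes a few dozen, because each sub-result is computed once and then read back from the cache.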
Hard (8-10): Design Challenges
Exercise 8: Pipeable Data Transformer
Write a function standardize(df, cols) that standardizes (z-score normalizes) specified columns of a data frame. It should work in a dplyr pipe.
# Exercise 8: Standardize columns (z-score)
# standardize(mtcars, c("mpg", "hp")) -> same df with mpg and hp z-scored
Key concept: The function takes a data frame as first argument and returns a modified data frame — making it pipe-compatible. This is the standard pattern for dplyr-friendly functions.
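One way to meet these requirements is sketched below in base R. The input checks and error messages are assumptions rather than part of the exercise, and the native |> pipe requires R 4.1 or later; with dplyr loaded, %>% can be used in the same position.

```r
# Sketch of standardize(); base R only, so it needs no packages
standardize <- function(df, cols) {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  for (col in cols) {
    x <- df[[col]]
    if (!is.numeric(x)) stop(col, " must be numeric")
    # z-score: subtract the mean, divide by the standard deviation
    df[[col]] <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  }
  df
}

# Data frame in, data frame out, so the call slots into a pipe
scaled <- mtcars |> standardize(c("mpg", "hp"))

round(colMeans(scaled[, c("mpg", "hp")]), 10)   # both approximately 0
sapply(scaled[, c("mpg", "hp")], sd)            # both approximately 1
```

Columns not listed in cols are returned unchanged, so the result can be passed straight to the next step of a pipeline.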
Exercise 9: Function Factory
Write a make_power(n) factory that returns a function that raises its input to the nth power. Create square, cube, and fourth-power functions from it.
make_power <- function(n) {
  force(n) # Ensure n is evaluated now, not later
  function(x) x^n
}
# Create specialized functions
square <- make_power(2)
cube <- make_power(3)
fourth <- make_power(4)
sqrt_func <- make_power(0.5)
cat("square(5):", square(5), "\n")
cat("cube(3):", cube(3), "\n")
cat("fourth(2):", fourth(2), "\n")
cat("sqrt(16):", sqrt_func(16), "\n")
# They work on vectors!
x <- 1:5
cat("\nSquares of 1:5:", square(x), "\n")
cat("Cubes of 1:5:", cube(x), "\n")
# Compose them
cat("\nsquare(cube(2)):", square(cube(2)), "\n") # (2^3)^2 = 64
Key concept: make_power(n) returns a function that "remembers" n via closure. force(n) ensures n is captured immediately (avoiding a subtle lazy evaluation bug). This is R's function factory pattern.
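The lazy evaluation bug is easiest to see side by side. In this sketch, make_power_lazy() is a hypothetical variant without force(n), and the variable exponent is introduced only for the demonstration.

```r
# Factory WITHOUT force(): n stays an unevaluated promise until first use
make_power_lazy <- function(n) {
  function(x) x^n
}

# Factory WITH force(), as in the solution above
make_power <- function(n) {
  force(n)
  function(x) x^n
}

exponent <- 2
lazy_square  <- make_power_lazy(exponent)
eager_square <- make_power(exponent)

exponent <- 3      # changed before either function is called

lazy_square(2)     # 8: n is evaluated now, when exponent is already 3
eager_square(2)    # 4: n was fixed at 2 when the factory ran
```

Without force(), the argument is only evaluated the first time the returned function uses it, so it picks up whatever value the variable has at that moment.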
Exercise 10: Complete Data Pipeline Function
Write a function analyze_by_group(df, group_col, value_col) that: groups a data frame, computes statistics per group, identifies the top group, and returns a structured report.
analyze_by_group <- function(df, group_col, value_col) {
  # Validate
  if (!group_col %in% names(df)) stop(paste("Column not found:", group_col))
  if (!value_col %in% names(df)) stop(paste("Column not found:", value_col))
  if (!is.numeric(df[[value_col]])) stop(paste(value_col, "must be numeric"))
  # Group statistics
  groups <- split(df[[value_col]], df[[group_col]])
  stats <- data.frame(
    group = names(groups),
    n = sapply(groups, length),
    mean = round(sapply(groups, mean, na.rm = TRUE), 2),
    median = round(sapply(groups, median, na.rm = TRUE), 2),
    sd = round(sapply(groups, sd, na.rm = TRUE), 2),
    min = sapply(groups, min, na.rm = TRUE),
    max = sapply(groups, max, na.rm = TRUE),
    row.names = NULL
  )
  stats <- stats[order(-stats$mean), ]
  # Report
  cat(sprintf("=== %s by %s ===\n\n", value_col, group_col))
  print(stats)
  best <- stats$group[1]
  worst <- stats$group[nrow(stats)]
  cat(sprintf("\nHighest mean %s: %s group (%.2f)\n",
              value_col, best, stats$mean[1]))
  cat(sprintf("Lowest mean %s: %s group (%.2f)\n",
              value_col, worst, stats$mean[nrow(stats)]))
  # Return invisibly for further use
  invisible(stats)
}
# Test with mtcars
result <- analyze_by_group(mtcars, "cyl", "mpg")
cat("\n--- Another analysis ---\n")
analyze_by_group(iris, "Species", "Petal.Length")
Key concept: The function validates inputs, computes statistics using split() + sapply(), formats output, and returns results invisibly. This validate, compute, report, return sequence is a reusable structure for analysis functions.
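The two building blocks named here, split() + sapply() and invisible(), can be tried on their own. This is a small sketch using the built-in mtcars data; quiet_range() is a hypothetical helper added only to show the invisible return.

```r
# split() turns one column into a list of vectors, one per group
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)
names(mpg_by_cyl)                      # "4" "6" "8"

# sapply() collapses that list into a named vector of group statistics
group_means <- sapply(mpg_by_cyl, mean)
round(group_means, 2)

# invisible() returns a value without auto-printing it at the console
quiet_range <- function(x) {
  cat("n =", length(x), "\n")
  invisible(range(x))
}

quiet_range(mtcars$mpg)                # prints only the cat() line
mpg_range <- quiet_range(mtcars$mpg)   # the value is still there when assigned
mpg_range
```

Returning invisibly suits functions that already print their own report: the console is not cluttered with a second copy of the result, yet callers can still capture it.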