R Functions Exercises: 10 Problems, Write, Debug & Optimize Functions, Solved Step-by-Step
Ten exercises that take you from writing your first function to debugging, benchmarking, and closures. Every problem is runnable here in the page, with an expandable worked solution. Work through them in order; each one builds on the previous.
Functions are R's main unit of reuse. Every tidyverse verb is a function. Every statistical model fit is a function call. The surprising thing is how few programmers use the more advanced features (default arguments, ..., closures, early return()), even after years of R. These exercises fix that.
Section 1, Writing functions
Exercise 1. Your first function
Write a function bmi(weight_kg, height_m) that returns body mass index (weight / height²). Test it with 70 kg and 1.75 m.
Solution
A function is created with function(args) body. The last expression in the body is the return value; no explicit return() is needed.
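A minimal sketch of one possible solution, using the argument names given in the exercise:

```r
# Body mass index: weight in kilograms divided by height in metres, squared.
bmi <- function(weight_kg, height_m) {
  weight_kg / height_m^2  # the last expression is the return value
}

bmi(70, 1.75)  # roughly 22.86
```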
Exercise 2. Default arguments
Extend bmi() so height_m defaults to 1.70 when not supplied. Test with bmi(70) and bmi(70, 1.80).
Solution
Default values can reference other arguments or be computed expressions. They are evaluated lazily, only when the argument is actually used.
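A sketch of one way to add the default. The second function, scale_to(), is a hypothetical helper that is not part of the exercise; it is included only to illustrate a default that refers to another argument.

```r
# height_m is optional and falls back to 1.70 when not supplied.
bmi <- function(weight_kg, height_m = 1.70) {
  weight_kg / height_m^2
}

bmi(70)        # uses the default height of 1.70
bmi(70, 1.80)  # overrides the default

# Hypothetical illustration: the default for `upper` refers to `x`
# and is only evaluated when `upper` is first used in the body.
scale_to <- function(x, upper = max(x)) {
  x / upper
}

scale_to(c(1, 2, 4))
```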
Exercise 3. Multiple return values via a list
Write summary_stats(x) that takes a numeric vector and returns a named list with n, mean, sd, min, and max. Test with c(2, 5, 7, 10, 14).
Solution
R functions return exactly one object. To "return multiple values", return a list (or a named vector, or a data frame).
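One possible solution, sketched with base R functions only. Individual elements of the returned list can be pulled out by name with $.

```r
# Return several summary statistics bundled in one named list.
summary_stats <- function(x) {
  list(
    n    = length(x),
    mean = mean(x),
    sd   = sd(x),
    min  = min(x),
    max  = max(x)
  )
}

stats <- summary_stats(c(2, 5, 7, 10, 14))
stats$n     # 5
stats$mean  # 7.6
```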
Section 2, Arguments and matching
Exercise 4. Partial matching and named args
Write greet(name, greeting = "Hello", punctuation = "!"). Then call it three different ways: positional only, all named, and with partial-name matching (greet("Ada", punc = "?")).
Solution
R allows partial argument matching: punc resolves to punctuation because no other argument starts with those letters. For production code, avoid partial matching; it breaks as soon as you add another argument with the same prefix.
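A sketch of one possible solution. The exercise does not specify the output format, so the "greeting, name + punctuation" layout below is an assumption.

```r
# Assumed output format: "<greeting>, <name><punctuation>"
greet <- function(name, greeting = "Hello", punctuation = "!") {
  paste0(greeting, ", ", name, punctuation)
}

greet("Ada")                                             # positional only
greet(name = "Ada", greeting = "Hi", punctuation = ".")  # all named
greet("Ada", punc = "?")                                 # partial match on punctuation
```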
Exercise 5. Variadic with ...
Write paste_upper(...) that takes any number of character arguments and returns their concatenation in uppercase. Example: paste_upper("hello", "world") returns "HELLOWORLD".
Solution
... collects every argument that does not match a named parameter, whether named or unnamed, and can forward them all to another function. This is how paste(), c(), and many other R functions accept variable-length inputs.
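A minimal sketch of one possible solution. It forwards ... to paste0(), which concatenates without a separator and so matches the expected "HELLOWORLD" output.

```r
# Forward every argument to paste0(), then uppercase the result.
paste_upper <- function(...) {
  toupper(paste0(...))
}

paste_upper("hello", "world")  # "HELLOWORLD"
paste_upper("a", "b", "c")     # "ABC"
```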
Section 3, Environments and closures
Exercise 6. A counter closure
Write make_counter(start = 0) that returns a function. Each time the returned function is called, it increments an internal counter and returns the new value.
Solution
count <<- count + 1 looks for an existing count in the enclosing environments and reassigns it there, which here is the environment created by the make_counter() call. Each call to make_counter() creates a fresh environment, so counters are independent. This is the core of R's closure pattern.
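One way to write the counter, sketched with an internal variable named count:

```r
make_counter <- function(start = 0) {
  count <- start          # lives in the environment of this call
  function() {
    count <<- count + 1   # modify count in the enclosing environment
    count
  }
}

a <- make_counter()
b <- make_counter(10)

a()  # 1
a()  # 2
b()  # 11, independent of a
```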
Exercise 7. Memoization with a closure
Write memoize(f) that takes a function and returns a new function. The returned function caches results per unique argument so that calling it twice with the same input is instant the second time.
Solution
The cache lives in the enclosing environment of the returned function. It persists across calls but is invisible to outside code, exactly what you want for a cache.
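A sketch of one possible memoize(), under two assumptions that the exercise does not state: the wrapped function takes a single argument, and deparse() of that argument is a good enough cache key. The slow_square() function is a hypothetical example that announces each real computation with message().

```r
memoize <- function(f) {
  # The cache is an environment used as a key-value store.
  cache <- new.env(parent = emptyenv())
  function(x) {
    # Assumption: a text representation of x identifies the input.
    key <- paste(deparse(x), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)  # compute once and store
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Hypothetical function that reports every real computation.
slow_square <- function(x) {
  message("computing...")
  x^2
}

fast_square <- memoize(slow_square)
fast_square(4)  # prints "computing..." and returns 16
fast_square(4)  # returns 16 from the cache, no message
```

Using an environment rather than a list means a cached NULL result is stored like any other value.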
Section 4, Debugging and safety
Exercise 8. Input validation with stop()
Write safe_bmi(weight_kg, height_m) that errors with an informative message if either argument is not a single positive number. Test it with safe_bmi(70, 1.75) (valid) and safe_bmi(-10, 1.75) (invalid).
Solution
The pattern is: validate at the top, fail fast with a clear message, then do the work. stopifnot() is a shorter alternative when the default error messages are good enough.
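A sketch of the validate-then-work pattern. The helper is_positive_number() and the exact error messages are assumptions made for illustration.

```r
safe_bmi <- function(weight_kg, height_m) {
  # Hypothetical helper: TRUE only for a single, non-missing, positive number.
  is_positive_number <- function(v) {
    is.numeric(v) && length(v) == 1 && !is.na(v) && v > 0
  }
  if (!is_positive_number(weight_kg)) {
    stop("weight_kg must be a single positive number", call. = FALSE)
  }
  if (!is_positive_number(height_m)) {
    stop("height_m must be a single positive number", call. = FALSE)
  }
  weight_kg / height_m^2
}

safe_bmi(70, 1.75)        # valid input
try(safe_bmi(-10, 1.75))  # try() reports the error without halting the script
```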
Exercise 9. Early return with tryCatch
Write safe_log(x) that returns log(x) when x > 0, NA_real_ when x <= 0 or NA, and NA_real_ if anything else goes wrong, all without letting an error propagate.
Solution
tryCatch() lets you convert errors and warnings into values. Use it sparingly, because it can hide real bugs. Here it is the right choice because the caller explicitly wants a total function.
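One possible solution, sketched on the assumption that x is meant to be a single value. The early return() handles the expected cases, and the handlers turn any remaining error or warning into NA_real_.

```r
safe_log <- function(x) {
  tryCatch(
    {
      # Expected cases: missing or non-positive input.
      if (is.na(x) || x <= 0) return(NA_real_)
      log(x)
    },
    # Anything unexpected (for example, non-numeric input) also becomes NA.
    error   = function(e) NA_real_,
    warning = function(w) NA_real_
  )
}

safe_log(1)      # 0
safe_log(-5)     # NA
safe_log(NA)     # NA
safe_log("abc")  # NA, the error from log() is caught
```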
Section 5, Benchmarking
Exercise 10. Measure two implementations
Write two implementations of "sum of squares from 1 to n": one using a for loop, one using sum((1:n)^2). Benchmark them at n = 100000 with system.time(). Which is faster?
Solution
The vectorised version dispatches to compiled C code in one call. The loop runs the R interpreter once per iteration. Reach for the vectorised version first; use loops only when each step genuinely needs the previous result.
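A sketch of the two implementations and the timing calls. The function names are assumptions, and no timings are shown because they depend on the machine; at this n a single run can be too short for system.time() to resolve, so repeating the call several times may give a clearer picture.

```r
# Loop version: one interpreter step per iteration.
sum_sq_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i^2
  }
  total
}

# Vectorised version: the squaring and the sum each run in compiled code.
sum_sq_vec <- function(n) {
  sum((1:n)^2)
}

n <- 100000
system.time(sum_sq_loop(n))
system.time(sum_sq_vec(n))

# Both implementations should agree on the result.
all.equal(sum_sq_loop(n), sum_sq_vec(n))
```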
Summary
- Functions are created with function(args) body. The last expression is returned.
- Use defaults (x = 1) for optional arguments and ... for variadic forwarding.
- Closures capture the enclosing environment, the basis for counters, memoization, and $ accessors in Shiny.
- Validate inputs with stop() or stopifnot(). Wrap risky code in tryCatch() only when errors are expected.
- Benchmark before optimising. Vectorised code usually beats explicit loops by orders of magnitude.