purrr map() in R: Every Variant Explained With the Mental Model That Makes Them Click

purrr's map family replaces for loops with composable one-liners: map() handles one input, map2() pairs two, pmap() scales to any number, and the _dbl/_chr/_lgl suffixes guarantee the output type you expect.

By Selva Prabhakaran · Published May 11, 2026 · Last updated May 11, 2026

There are roughly 30 functions in the map family, but you don't need to memorise them. Once you see the two-dimensional grid behind the names (how many inputs on one axis, what output type on the other), every variant becomes obvious. This tutorial walks through every useful member of the family with runnable examples, so you leave with a working mental model, not a cheat-sheet.

What does map() actually do, and why replace for loops?

If you've ever written a for loop just to build up a list of results, map() is the cleaner replacement. It takes a vector, applies a function to every element, and collects the answers into a list: one line, no counter variable, no pre-allocation. Here's the side-by-side: compute the mean of every column in mtcars, first the long way, then the map() way.

Column means via loop and map

```r
library(purrr)
library(dplyr)

# The hand-rolled for loop
col_means_loop <- vector("list", length(mtcars))
names(col_means_loop) <- names(mtcars)
for (col in names(mtcars)) {
  col_means_loop[[col]] <- mean(mtcars[[col]])
}
col_means_loop$mpg
#> [1] 20.09062

# The map() version: same result, one line
col_means_list <- map(mtcars, mean)
col_means_list$mpg
#> [1] 20.09062
```

Both expressions produce the same named list of column means, but map() removes every piece of loop ceremony. You don't pre-allocate the container, you don't track an index, and you don't copy the result into a slot. map() returns the assembled list as its value, so you can pipe it straight into the next step.
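Because map() returns its result as a value, it can sit in the middle of a pipeline. The sketch below is one way to illustrate that: it pipes the column means into purrr's keep() to retain only the large ones. The cutoff of 100 is an arbitrary choice for this example.

```r
library(purrr)

# Column means as a named list, then keep only the means above 100.
# The threshold 100 is arbitrary and chosen for illustration.
big_means <- mtcars |>
  map(mean) |>
  keep(\(m) m > 100)

names(big_means)
#> [1] "disp" "hp"
```

No intermediate variable is needed between the two steps, which is the practical payoff of map() returning the assembled list.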

Figure 1: From a hand-rolled for loop to map(), map_dbl(), and walk(). The iteration pattern (apply a function to every element) stays the same; only the return type changes.

The second argument to map() is a function. You can pass a named function like mean, or inline a tiny one with R 4.1's backslash-lambda \(x) ..., or with purrr's formula shorthand ~ .x * 2 (where .x is the current element).

Three lambda styles for map

```r
# Three ways to write "double every value in 1:5"
doubled_named <- map(1:5, function(x) x * 2)
doubled_lambda <- map(1:5, \(x) x * 2)
doubled_formula <- map(1:5, ~ .x * 2)

identical(doubled_named, doubled_lambda)
#> [1] TRUE
identical(doubled_lambda, doubled_formula)
#> [1] TRUE
```

All three return the same 5-element list. Pick whichever reads best: use a named function when the logic has a name worth keeping, the \(x) lambda for anything longer than a single expression, and the ~ .x formula for tiny one-liners.

Tip

Prefer the backslash lambda for anything beyond a one-liner. The \(x) syntax is standard R 4.1+, supports multiple arguments cleanly, and readers unfamiliar with purrr's .x placeholder can still read it. Save the ~ formula shorthand for very short expressions.

Try it: Use map() to square every element of 1:5. The result should be a list of 5 numbers.

Exercise: square each element with map

```r
# Try it: square each element of 1:5 using map()
ex_squares <- map(1:5, function(x) {
  # your code here
})
ex_squares
#> Expected: list(1, 4, 9, 16, 25)
```

Square each element solution

```r
ex_squares <- map(1:5, \(x) x ^ 2)
ex_squares
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] 9
#>
#> [[4]]
#> [1] 16
#>
#> [[5]]
#> [1] 25
```

Explanation: map() applies \(x) x ^ 2 to each element and wraps the five results in a list. If you want a plain numeric vector instead of a list, use map_dbl(), which the next section covers.

How do the map_*() type suffixes guarantee the output you expect?

map() always returns a list. Most of the time you actually want a plain atomic vector: a numeric, a character, or a logical. That's what the type-suffix variants are for. map_dbl() returns a double vector, map_int() an integer, map_chr() a character, map_lgl() a logical. They do the same iteration as map(), but they unwrap the result and check that every piece matches the promised type.

map double returns numeric vector

```r
# map_dbl returns a named numeric vector
col_means <- map_dbl(mtcars, mean)
col_means
#>        mpg        cyl       disp         hp       drat         wt       qsec
#>  20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750
#>         vs         am       gear       carb
#>   0.437500   0.406250   3.687500   2.812500
```

Compare that to the list you got in the last section: same numbers, but now you can drop the result straight into sort(), plot(), or arithmetic without unlisting first. The named-vector format also makes col_means["mpg"] work cleanly.

map character for formatted labels

```r
# map_chr returns a character vector, great for formatted labels
rounded_means <- map_chr(mtcars, \(x) sprintf("%.1f", mean(x)))
rounded_means
#>     mpg     cyl    disp      hp    drat      wt    qsec      vs      am
#>  "20.1"   "6.2" "230.7" "146.7"   "3.6"   "3.2"  "17.8"   "0.4"   "0.4"
#>    gear    carb
#>   "3.7"   "2.8"
```

map_chr() ran sprintf() on each column's mean and collected the 11 strings into a single named character vector, ready for a plot legend, a report heading, or a paste0() concatenation.
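The two remaining suffixes follow the same pattern. As a minimal sketch, the calls below use map_lgl() for a yes/no question per column and map_int() for a count per column; the cutoff of 10 and the variable names are choices made for this example only.

```r
library(purrr)

# map_lgl(): one TRUE/FALSE per column. Is the column mean above 10?
# (The cutoff 10 is arbitrary.)
high_mean <- map_lgl(mtcars, \(col) mean(col) > 10)
high_mean[c("mpg", "cyl", "hp")]

# map_int(): one integer per column. length() returns an integer,
# so the contract is satisfied without any conversion.
n_distinct_vals <- map_int(mtcars, \(col) length(unique(col)))
n_distinct_vals[c("cyl", "am", "gear")]
```

For these columns, mpg and hp have means above 10 while cyl does not, and cyl, am, and gear take 3, 2, and 3 distinct values respectively.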

Figure 2: The purrr map family arranged as a 2D grid by input arity (rows) and output type (columns). Pick a row (how many inputs) and a column (output type) and the function name writes itself.

The suffix isn't cosmetic; it's a promise the function enforces. If your function returns something that isn't the promised type, map_dbl() errors loudly rather than silently returning garbage.

map double fails fast on bad types

```r
# map_dbl fails fast when types don't match
result <- tryCatch(
  map_dbl(list(1, "two", 3), identity),
  error = function(e) conditionMessage(e)
)
result
#> [1] "Can't coerce from a string to a double vector."
```

Instead of quietly coercing "two" into NA or a number, map_dbl() stops with an error that names the type mismatch. That's a feature: an explicit failure beats a silent wrong answer every time.

Warning

The typed map variants are strict by design. If your function returns values of mixed or unpredictable type, use plain map() and convert afterwards. Don't wrap map_dbl() in tryCatch() to paper over type mismatches; fix the upstream function instead.
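One way to read "use plain map() and convert afterwards" is sketched below. The input list and the decision to drop non-numeric entries are assumptions made for illustration; your data may call for a different policy, such as replacing them with NA_real_.

```r
library(purrr)

# Hypothetical input with mixed types, for illustration only.
raw_values <- list(1, "two", 3)

# Step 1: plain map() accepts any return type and gives back a list.
as_is <- map(raw_values, identity)

# Step 2: decide explicitly what to do with the non-numeric entries.
# Here we drop them; another policy may suit your data better.
is_num <- map_lgl(as_is, is.numeric)
clean <- map_dbl(as_is[is_num], identity)
clean
#> [1] 1 3
```

The type decision is now visible in the code instead of being hidden inside an error handler.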

Key Insight

Pick the suffix that matches your known output, not the loosest one that works. Using map() everywhere because "it always works" defeats the point. The suffixes exist so type errors surface at the iteration site, not three functions downstream where they're hard to debug.

Try it: Use map_int() to return the number of characters in each element of c("dog", "horse", "bee"). The answer should be an integer vector of length 3.

Exercise: string lengths with map integer

```r
# Try it: string lengths with map_int
ex_words <- c("dog", "horse", "bee")
ex_lengths <- map_int(ex_words, function(w) {
  # your code here
})
ex_lengths
#> Expected: c(3L, 5L, 3L)
```

String lengths solution

```r
ex_words <- c("dog", "horse", "bee")
ex_lengths <- map_int(ex_words, \(w) nchar(w))
ex_lengths
#> [1] 3 5 3
```

Explanation: nchar() returns an integer for each string, and map_int() collects the three answers into an integer vector. You could also write this as map_int(ex_words, nchar): when the function is a one-argument named function, you can drop the lambda entirely.

When do you need map2() to iterate over two inputs in parallel?

map() works beautifully when you're iterating over one vector. But plenty of problems pair two vectors: sample sizes with seeds, means with standard deviations, column names with column values. map2() is the version that walks two inputs in lockstep, feeding the i-th element of each to your function on every step.

map two simulates paired specs

```r
# Simulate 4 samples, each with its own mean and sd
set.seed(101)
means <- c(0, 5, 10, 20)
sds <- c(1, 2, 1, 5)
sim_samples <- map2(means, sds, \(m, s) rnorm(n = 5, mean = m, sd = s))

sim_samples[[1]]
#> [1] -0.3260365  0.5524619 -0.6749438  0.2143595  0.3107692
sim_samples[[4]]
#> [1] 14.73944 22.18163 18.81113 17.92391 18.38659
```

Each call to rnorm() uses the matching element from both vectors: the first call gets mean = 0, sd = 1, the second gets mean = 5, sd = 2, and so on. The result is a length-4 list where each slot holds a 5-number sample from a different normal distribution.

Inside the lambda, you can name the arguments anything (m and s here) or use purrr's formula shorthand where .x is the first input and .y is the second.
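To make the two naming styles concrete, here is a small sketch that writes the same map2_chr() call both ways. The input vectors and variable names are made up for this example.

```r
library(purrr)

letters_in <- c("a", "b", "c")   # hypothetical inputs
numbers_in <- c(1, 2, 3)

# Formula shorthand: .x is the element from the first input, .y from the second
tags_formula <- map2_chr(letters_in, numbers_in, ~ paste0(.x, .y))

# The same call with an explicit lambda and self-chosen argument names
tags_lambda <- map2_chr(letters_in, numbers_in, \(letter, number) paste0(letter, number))

tags_formula
#> [1] "a1" "b2" "c3"
identical(tags_formula, tags_lambda)
#> [1] TRUE
```

The lambda version is longer, but the argument names document what each input means.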

Elementwise product with map two double

```r
# Elementwise product of two numeric vectors, returned as a plain numeric
elem_prod <- map2_dbl(c(1, 2, 3, 4), c(10, 20, 30, 40), \(a, b) a * b)
elem_prod
#> [1]  10  40  90 160
```

map2_dbl() works exactly like map2() but promises a double vector output: the same contract as map_dbl(), extended to two inputs. Every type suffix from the previous section has a map2_ cousin.

Note

For simple elementwise arithmetic, plain R vectorisation is faster. c(1,2,3,4) * c(10,20,30,40) returns the same answer without purrr. Reach for map2() when the per-element operation is a function call that isn't already vectorised: random draws, model fits, custom transformations.

Try it: Multiply c(2, 4, 6) by c(10, 100, 1000) elementwise and return the result as a double vector.

Exercise: elementwise multiply with map two

```r
# Try it: elementwise multiply with map2_dbl
ex_a <- c(2, 4, 6)
ex_b <- c(10, 100, 1000)
ex_prod <- map2_dbl(ex_a, ex_b, function(x, y) {
  # your code here
})
ex_prod
#> Expected: c(20, 400, 6000)
```

Elementwise multiply solution

```r
ex_a <- c(2, 4, 6)
ex_b <- c(10, 100, 1000)
ex_prod <- map2_dbl(ex_a, ex_b, \(x, y) x * y)
ex_prod
#> [1]   20  400 6000
```

Explanation: map2_dbl() pairs ex_a[i] with ex_b[i] for each i, multiplies them, and collects the three products into a double vector. The defining feature of every map2_* variant is that the function takes two arguments instead of one.

How does pmap() scale iteration to any number of arguments?

There's no map3() or map4(), because pmap() generalises the whole idea. Instead of accepting 2, 3, or 4 separate vectors, pmap() takes one list whose elements are the vectors you want to iterate over in parallel. Three inputs, ten inputs, same syntax.

The cleanest pattern is to name the list elements to match your function's argument names. purrr will wire them up for you automatically.

pmap over three paired vectors

```r
# Three paired vectors → named list → pmap
set.seed(202)
sim_list <- pmap(
  list(n = c(3, 4, 5), mean = c(0, 10, 20), sd = c(1, 2, 3)),
  rnorm
)
sim_list
#> [[1]]
#> [1] -0.4304691  0.2572641  1.0844412
#>
#> [[2]]
#> [1]  8.970506  7.826228  9.977303 10.812235
#>
#> [[3]]
#> [1] 18.20688 24.50030 18.71374 21.45382 19.94457
```

The first rnorm() call got n = 3, mean = 0, sd = 1; the second got n = 4, mean = 10, sd = 2; the third got n = 5, mean = 20, sd = 3. Because the list names match rnorm's argument names, you didn't need a lambda at all; pmap() passed them through directly.

Tip

Name your pmap input list to match the target function's arguments. It eliminates lambdas, makes the code self-documenting, and lets you reorder inputs safely. If your list elements aren't named, pmap falls back to positional matching (first list element → first function argument), which is more fragile.
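The difference between name matching and positional matching is easiest to see with a function whose argument order matters. The sketch below uses a hypothetical helper, power(), which is not part of purrr and exists only for this illustration.

```r
library(purrr)

# Hypothetical helper, not part of purrr: argument order matters here
power <- function(base, exponent) base ^ exponent

# Named list: elements are matched to arguments by name
by_name <- pmap_dbl(list(base = c(2, 3), exponent = c(3, 2)), power)

# Same names in a different order: the result is unchanged
reordered <- pmap_dbl(list(exponent = c(3, 2), base = c(2, 3)), power)

# Unnamed list in that swapped order: matched by position, so the roles flip
by_position <- pmap_dbl(list(c(3, 2), c(2, 3)), power)

by_name
#> [1] 8 9
reordered
#> [1] 8 9
by_position
#> [1] 9 8
```

With names, reordering the list is harmless. Without them, the same reordering silently changes which vector plays which role.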

A really powerful variant: since a tibble is just a named list of equal-length vectors, you can pass a whole tibble of experiment specifications straight to pmap().

pmap on a tibble of specs

```r
library(tibble)

# A tibble where each row is one experiment
spec_tbl <- tibble(
  n = c(10, 20, 50),
  mean = c(0, 5, 10),
  sd = c(1, 2, 3)
)

set.seed(303)
mean_results <- pmap_dbl(spec_tbl, \(n, mean, sd) mean(rnorm(n, mean, sd)))
mean_results
#> [1] 0.2194892 5.3821048 9.9112164
```

Each row of spec_tbl became one call to the lambda; the lambda drew n random values from a normal with the row's mean and sd, then returned their observed mean. pmap_dbl() collected the three observed means into a numeric vector. Because it's the _dbl variant, you get a flat atomic vector instead of a list of doubles.

If you don't want to name arguments, purrr's formula shorthand supports positional placeholders ..1, ..2, ..3 for any number of inputs.

pmap character with positional shorthand

```r
# Positional shorthand: fine for quick work, noisier to read
paste_out <- pmap_chr(
  list(c("A", "B", "C"), c(1, 2, 3), c("x", "y", "z")),
  ~ paste0(..1, ..2, ..3)
)
paste_out
#> [1] "A1x" "B2y" "C3z"
```

..1 is the first list element, ..2 the second, and so on. It works, but the named-argument style from earlier is usually easier to read once you have more than two inputs.

Try it: Use pmap_chr() to build sentences of the form "<name> scored <score> on the <subject> test" from three equal-length vectors.

Exercise: build sentences with pmap character

```r
# Try it: build sentences with pmap_chr
ex_names <- c("Ada", "Babbage", "Curie")
ex_scores <- c(92, 88, 97)
ex_subjects <- c("math", "engineering", "chemistry")
ex_sentences <- pmap_chr(
  list(name = ex_names, score = ex_scores, subject = ex_subjects),
  function(name, score, subject) {
    # your code here
  }
)
ex_sentences
#> Expected: "Ada scored 92 on the math test", etc.
```

Build sentences solution

```r
ex_names <- c("Ada", "Babbage", "Curie")
ex_scores <- c(92, 88, 97)
ex_subjects <- c("math", "engineering", "chemistry")
ex_sentences <- pmap_chr(
  list(name = ex_names, score = ex_scores, subject = ex_subjects),
  \(name, score, subject) paste0(name, " scored ", score, " on the ", subject, " test")
)
ex_sentences
#> [1] "Ada scored 92 on the math test"
#> [2] "Babbage scored 88 on the engineering test"
#> [3] "Curie scored 97 on the chemistry test"
```

Explanation: The named list wires name, score, and subject into the lambda by argument name, and pmap_chr() collects the three formatted strings into a character vector. Using a named list is much clearer than ..1/..2/..3 once you have more than two inputs.

What is imap() for, and why use it instead of manual indices?

Sometimes the function you're applying needs to know where each element came from, its position, its name, or both. You could do that with map2() by passing seq_along(x) as the second input, but imap() does it for you. It's exactly equivalent to map2(.x, names(.x), .f) when the input has names, and map2(.x, seq_along(.x), .f) when it doesn't.

imap over a named list

```r
# Named input → .y is the name
populations <- list(tokyo = 37.4, delhi = 32.9, shanghai = 28.5)
kv_strings <- imap_chr(populations, \(val, key) paste0(key, ": ", val, "M"))
kv_strings
#>             tokyo             delhi          shanghai
#>    "tokyo: 37.4M"    "delhi: 32.9M" "shanghai: 28.5M"
```

The lambda received two arguments: val (the list element) and key (the name). imap_chr() pasted them together and returned a named character vector. You didn't have to extract names(populations) or track an index counter; imap() did it for you.

If the input has no names, imap() uses the integer position instead.

imap on unnamed vector uses index

```r
# Unnamed input → .y is the integer index
idx_strings <- imap_chr(c("red", "green", "blue"), \(val, idx) paste0(idx, "=", val))
idx_strings
#> [1] "1=red"   "2=green" "3=blue"
```

Same pattern, different second argument. imap() silently switches between "use names" and "use indices" depending on whether the input is named, so your code reads the same whether you're looping a named list or a plain vector.
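The equivalence with map2() can be checked directly. The sketch below compares imap_chr() against the corresponding map2_chr() call for a named and an unnamed input; the scores vector and the label helper are invented for this example.

```r
library(purrr)

scores <- c(ada = 92, grace = 88)   # hypothetical named input
label <- \(val, key) paste0(key, "=", val)

# Named input: imap_chr() behaves like map2_chr() over the values and names
identical(imap_chr(scores, label), map2_chr(scores, names(scores), label))
#> [1] TRUE

# Unnamed input: it behaves like map2_chr() over the values and positions
plain <- unname(scores)
identical(imap_chr(plain, label), map2_chr(plain, seq_along(plain), label))
#> [1] TRUE
```

In both cases imap saves you from spelling out the second input yourself.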

Key Insight

imap() is the R equivalent of Python's enumerate(). Any time you find yourself writing for (i in seq_along(x)) to get both the element and its position, reach for imap instead. It's one function call, it respects names when they exist, and it plugs straight into a tidyverse pipeline.

Try it: Given a named numeric vector, build "city (value)" labels with imap_chr().

Exercise: label named vector with imap

```r
# Try it: label a named vector with imap_chr
ex_cities <- c(paris = 11, london = 9, berlin = 4)
ex_labels <- imap_chr(ex_cities, function(val, key) {
  # your code here
})
ex_labels
#> Expected: "paris (11)" "london (9)" "berlin (4)"
```

Label named vector solution

```r
ex_cities <- c(paris = 11, london = 9, berlin = 4)
ex_labels <- imap_chr(ex_cities, \(val, key) paste0(key, " (", val, ")"))
ex_labels
#>        paris       london       berlin
#> "paris (11)" "london (9)" "berlin (4)"
```

Explanation: imap_chr() hands the lambda both the value and its name on each iteration, and paste0() builds the label. This is the cleanest way to build "label: value" strings from a named vector.

When should you use walk() instead of map()?

Sometimes you iterate purely for a side effect (printing, saving a plot, writing a file, logging a message) and you don't care about the return value. Using map() for that works, but it allocates a list of NULLs you'll throw away, and it prints that list if you run it at the console. walk() is the "for its side effects" variant: it calls the function on every element, ignores the return values, and returns the input invisibly so pipelines keep flowing.

walk for per cylinder summary

```r
# Print a per-cylinder summary of mpg (side effect only)
mtcars_by_cyl <- split(mtcars, mtcars$cyl)
walk(mtcars_by_cyl, \(df) {
  # sep = "" keeps cat() from inserting a space before the comma
  cat("cyl = ", unique(df$cyl), ", mean mpg = ", round(mean(df$mpg), 1), "\n", sep = "")
})
#> cyl = 4, mean mpg = 26.7
#> cyl = 6, mean mpg = 19.7
#> cyl = 8, mean mpg = 15.1
```

Three lines of output, no list of NULLs cluttering your console. walk() evaluated the lambda for its printing effect, discarded the return values, and invisibly returned mtcars_by_cyl, so you could even pipe the result into another step if you wanted to.
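The "returns its input invisibly" behaviour can be confirmed by capturing the return value. This is a minimal sketch; the message text and variable names are arbitrary.

```r
library(purrr)

# walk() runs the function for its side effect and hands back its input
returned <- walk(1:3, \(i) cat("processing item", i, "\n"))
#> processing item 1
#> processing item 2
#> processing item 3

identical(returned, 1:3)
#> [1] TRUE

# Because the input comes back unchanged, walk() can sit mid-pipeline
tens <- 1:3 |>
  walk(\(i) invisible(NULL)) |>
  map_dbl(\(i) i * 10)
tens
#> [1] 10 20 30
```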

Like map2() and pmap(), walk() has walk2() and pwalk() siblings for two or n inputs.

walk two over filenames and frames

```r
# Simulate "write report": in WebR we print instead of writing files
report_names <- c("report_4cyl.txt", "report_6cyl.txt", "report_8cyl.txt")
walk2(report_names, mtcars_by_cyl, \(fname, df) {
  cat("--- Would write:", fname, "---\n")
  cat("Rows:", nrow(df), "| Mean mpg:", round(mean(df$mpg), 1), "\n\n")
})
#> --- Would write: report_4cyl.txt ---
#> Rows: 11 | Mean mpg: 26.7
#>
#> --- Would write: report_6cyl.txt ---
#> Rows: 7 | Mean mpg: 19.7
#>
#> --- Would write: report_8cyl.txt ---
#> Rows: 14 | Mean mpg: 15.1
```

In a real R session you'd call write.csv(df, fname) or ggsave(fname, plot) inside the lambda. Here we print what would happen so you can see the pairing: each filename lines up with its matching data frame, exactly as map2() would.
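pwalk() extends the same idea to any number of inputs. The sketch below assumes a made-up list of report specifications with three parallel vectors; the names fname, rows, and digits are illustrative only.

```r
library(purrr)

# Hypothetical report specs, one element per report
specs <- list(
  fname = c("report_4cyl.txt", "report_6cyl.txt"),
  rows = c(11, 7),
  digits = c(1, 2)
)

# pwalk() is to walk() what pmap() is to map(): n inputs, side effects only
pwalk(specs, \(fname, rows, digits) {
  cat("Would write", fname, "| rows:", rows, "| digits:", digits, "\n")
})
#> Would write report_4cyl.txt | rows: 11 | digits: 1
#> Would write report_6cyl.txt | rows: 7 | digits: 2
```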

Note

This page runs R in a browser sandbox with an in-memory virtual filesystem. Functions like write.csv() technically execute, but the files vanish on page reload and there's no "Downloads" folder to find them in. That's why the example above prints instead of writing. In your local RStudio, walk2(data_list, filenames, write.csv) is the real thing; the data frames come first because write.csv() takes the data as its first argument and the file path as its second.
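For a local session, a file-writing version might look like the sketch below. It writes to tempdir() so nothing is left behind, and the file names are invented for this example. An explicit lambda is used so the argument order is visible.

```r
library(purrr)

data_list <- split(mtcars, mtcars$cyl)

# Write into a temporary directory so the sketch leaves no files behind.
# The file names are made up for this example.
paths <- file.path(tempdir(), paste0("report_", names(data_list), "cyl.csv"))

# write.csv() takes the data first and the file second,
# so the data frames go in .x and the paths in .y
walk2(data_list, paths, \(df, path) write.csv(df, path, row.names = FALSE))

file.exists(paths)
#> [1] TRUE TRUE TRUE
```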

Try it: Use walk() to print each greeting in a list on its own line, prefixed with "- ".

Exercise: walk for printing greetings

```r
# Try it: walk() for printing side effects
ex_greetings <- list("Hello, Ada", "Hola, Babbage", "Bonjour, Curie")
walk(ex_greetings, function(g) {
  # your code here
})
#> Expected:
#> - Hello, Ada
#> - Hola, Babbage
#> - Bonjour, Curie
```

walk printing greetings solution

```r
ex_greetings <- list("Hello, Ada", "Hola, Babbage", "Bonjour, Curie")
walk(ex_greetings, \(g) cat("-", g, "\n"))
#> - Hello, Ada
#> - Hola, Babbage
#> - Bonjour, Curie
```

Explanation: cat() prints to the console and returns NULL, exactly the side-effect-only pattern walk() is designed for. Using map() here would work but would also print a useless list of three NULLs below the greetings.

Practice Exercises

These capstones combine multiple variants from the tutorial. Each is solvable with concepts you've already seen. Use distinct variable names (prefixed my_) so exercises don't overwrite tutorial state.

Exercise 1: Summary report from a list of data frames

You have a list of three small data frames. Use map_int() to compute each row count, then imap_chr() to build a one-line summary string per data frame, then walk() to print each summary. The final printed output should have three lines.

Exercise: combine map integer imap walk

```r
# Exercise 1: combine map_int + imap_chr + walk
my_dfs <- list(
  a = data.frame(x = 1:5),
  b = data.frame(x = 1:10, y = 1:10),
  c = data.frame(x = 1:3)
)

# Hint: 1) map_int() for row counts
#       2) imap_chr() for "df <name> has N rows" strings
#       3) walk() to print each
# Write your code below:
```

Row count summaries solution

```r
my_dfs <- list(
  a = data.frame(x = 1:5),
  b = data.frame(x = 1:10, y = 1:10),
  c = data.frame(x = 1:3)
)

my_rowcounts <- map_int(my_dfs, nrow)
my_summaries <- imap_chr(my_rowcounts, \(n, name) paste0("df ", name, " has ", n, " rows"))
walk(my_summaries, \(line) cat(line, "\n"))
#> df a has 5 rows
#> df b has 10 rows
#> df c has 3 rows
```

Explanation: map_int(my_dfs, nrow) returns a named integer vector of row counts. imap_chr() then turns each count into a sentence using the name of the data frame. Finally walk() prints each sentence as a side effect. You could chain these with |> if you prefer a single expression.
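The chained form mentioned above might look like this. It is the same three steps with no intermediate variables, using R's native pipe (R 4.1 or later).

```r
library(purrr)

my_dfs <- list(
  a = data.frame(x = 1:5),
  b = data.frame(x = 1:10, y = 1:10),
  c = data.frame(x = 1:3)
)

# Same three steps as the solution, chained with the native pipe
my_dfs |>
  map_int(nrow) |>
  imap_chr(\(n, name) paste0("df ", name, " has ", n, " rows")) |>
  walk(\(line) cat(line, "\n"))
#> df a has 5 rows
#> df b has 10 rows
#> df c has 3 rows
```

Whether the chained or the stepwise form reads better is a matter of taste; the stepwise form is easier to inspect while debugging.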

Exercise 2: Monte Carlo experiment grid with pmap()

You have a tibble of experiment specifications: six combinations of sample size, distribution mean, and standard deviation. For each row, draw a random sample from a normal distribution, compute its observed mean, and return all six observed means alongside the original specs.

Exercise: pmap for Monte Carlo grid

```r
# Exercise 2: pmap + mutate for a Monte Carlo grid
library(dplyr)

my_spec <- tibble(
  n = c(10, 10, 10, 100, 100, 100),
  mean = c(0, 5, 10, 0, 5, 10),
  sd = c(1, 1, 1, 1, 1, 1)
)

set.seed(2026)

# Hint: inside mutate(), use pmap_dbl() to walk n/mean/sd in parallel
#       and return the observed mean of rnorm(n, mean, sd)
# Write your code below:
```

Monte Carlo grid solution

```r
library(dplyr)

my_spec <- tibble(
  n = c(10, 10, 10, 100, 100, 100),
  mean = c(0, 5, 10, 0, 5, 10),
  sd = c(1, 1, 1, 1, 1, 1)
)

set.seed(2026)
my_results <- my_spec |>
  mutate(observed_mean = pmap_dbl(
    list(n, mean, sd),
    \(n, mean, sd) mean(rnorm(n, mean, sd))
  ))
my_results
#> # A tibble: 6 × 4
#>       n  mean    sd observed_mean
#>   <dbl> <dbl> <dbl>         <dbl>
#> 1    10     0     1        0.172
#> 2    10     5     1        4.85
#> 3    10    10     1       10.1
#> 4   100     0     1       -0.0551
#> 5   100     5     1        5.08
#> 6   100    10     1        9.99
```

Explanation: Inside mutate(), pmap_dbl() walks the three column vectors in parallel. On each iteration the lambda draws n values with rnorm(n, mean, sd) and returns the observed mean. In the rows with n = 100, the observed means are generally closer to the true means than in the rows with n = 10: the classic law-of-large-numbers effect.

Exercise 3: Labelled output with imap() and walk2()

Given a list of model specifications (each a nested list with family and formula), build a labelled printout where each spec appears below a numbered heading like === Model 1: linear ===.

Exercise: imap and walk two printouts

```r
# Exercise 3: imap + walk2 for labelled printout
my_specs <- list(
  linear = list(family = "gaussian", formula = "y ~ x"),
  logistic = list(family = "binomial", formula = "y ~ x1 + x2"),
  poisson = list(family = "poisson", formula = "y ~ x1 + x2 + x3")
)

# Hint: 1) imap_chr() to build the heading lines (use the list name + its position)
#       2) walk2() to print each heading followed by the spec
# Write your code below:
```

Labelled printout solution

```r
my_specs <- list(
  linear = list(family = "gaussian", formula = "y ~ x"),
  logistic = list(family = "binomial", formula = "y ~ x1 + x2"),
  poisson = list(family = "poisson", formula = "y ~ x1 + x2 + x3")
)

my_headings <- imap_chr(my_specs, \(spec, name) {
  paste0("=== Model ", match(name, names(my_specs)), ": ", name, " ===")
})

walk2(my_headings, my_specs, \(heading, spec) {
  cat(heading, "\n")
  cat(" family: ", spec$family, "\n")
  cat(" formula: ", spec$formula, "\n\n")
})
#> === Model 1: linear ===
#>  family:  gaussian
#>  formula:  y ~ x
#>
#> === Model 2: logistic ===
#>  family:  binomial
#>  formula:  y ~ x1 + x2
#>
#> === Model 3: poisson ===
#>  family:  poisson
#>  formula:  y ~ x1 + x2 + x3
```

Explanation: imap_chr() builds one heading string per spec, using the list name as the model label. Then walk2() walks the headings and specs in parallel, printing each heading followed by the model's family and formula. This is the pattern you'd use to build progress logs or formatted reports.

Complete Example: a mini Monte Carlo study

Here's how the pieces fit together in a realistic workflow. The goal is to compare how well the sample mean recovers the true mean at three different sample sizes (n = 10, 50, 200) for two distributions (standard normal and exponential).

We'll use pmap() to iterate the experiment grid, map_dbl() to summarise each draw, and walk() to print a formatted per-distribution report.

End-to-end mini Monte Carlo study

```r
library(dplyr)
library(tibble)

# Build a 6-row experiment grid: 3 sample sizes × 2 distributions
experiments <- tibble(
  dist = rep(c("normal", "exponential"), each = 3),
  n = rep(c(10, 50, 200), times = 2),
  true_mean = rep(c(0, 1), each = 3) # N(0,1) has mean 0; Exp(1) has mean 1
)

set.seed(2026)

# Run every experiment with pmap(): one random sample per row
experiments <- experiments |>
  mutate(
    sample = pmap(list(dist, n), \(dist, n) {
      if (dist == "normal") rnorm(n) else rexp(n, rate = 1)
    }),
    observed_mean = map_dbl(sample, mean),
    observed_sd = map_dbl(sample, sd),
    abs_error = abs(observed_mean - true_mean)
  )

# Show the results (drop the raw sample column for readability)
experiments |> select(-sample)
#> # A tibble: 6 × 6
#>   dist            n true_mean observed_mean observed_sd abs_error
#>   <chr>       <dbl>     <dbl>         <dbl>       <dbl>     <dbl>
#> 1 normal         10         0        0.172        0.825    0.172
#> 2 normal         50         0        0.0311       1.01     0.0311
#> 3 normal        200         0       -0.0282       0.965    0.0282
#> 4 exponential    10         1        1.02         1.13     0.0247
#> 5 exponential    50         1        0.906        0.801    0.0943
#> 6 exponential   200         1        1.06         1.04     0.0605
```

The computed columns come from different tools. pmap() ran the experiments, map_dbl() extracted the observed mean and SD from each sample, and plain vectorised arithmetic handled abs_error. Notice the sample column: it's a list-column, one random sample per row, preserved for inspection. That's the tidyverse's native way of holding "one object per row."
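If list-columns are new to you, a smaller stand-in may help. The sketch below builds a toy tibble (not the experiments data) where map() creates the list-column and map_int() summarises each element back to a scalar; all names and values are illustrative.

```r
library(purrr)
library(dplyr)
library(tibble)

# A small stand-in for the experiments tibble (values are illustrative)
toy <- tibble(n = c(2L, 3L, 5L)) |>
  mutate(
    sample = map(n, seq_len),              # list-column: one vector per row
    sample_size = map_int(sample, length)  # one scalar summary per element
  )

toy$sample[[2]]
#> [1] 1 2 3
toy$sample_size
#> [1] 2 3 5
```

The pattern is the same as in the study: map() or pmap() to create one object per row, then a typed variant to pull a scalar out of each object.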

Finally, print a per-distribution summary using walk() on a split of the tibble.

walk split by distribution results

```r
walk(split(experiments, experiments$dist), \(df) {
  cat("Distribution:", unique(df$dist), "\n")
  cat(" Mean absolute error by n:\n")
  for (i in seq_len(nrow(df))) {
    cat(" n =", df$n[i], ": abs_error =", round(df$abs_error[i], 4), "\n")
  }
  cat("\n")
})
#> Distribution: exponential
#>  Mean absolute error by n:
#>  n = 10 : abs_error = 0.0247
#>  n = 50 : abs_error = 0.0943
#>  n = 200 : abs_error = 0.0605
#>
#> Distribution: normal
#>  Mean absolute error by n:
#>  n = 10 : abs_error = 0.172
#>  n = 50 : abs_error = 0.0311
#>  n = 200 : abs_error = 0.0282
```

This is the kind of workflow purrr is built for. Each map variant has one job (pmap for the experiment grid, map_dbl for the scalar summaries, walk for the side-effect printing), and they all compose into a single readable pipeline.

Summary

Pick the variant by asking three questions: how many inputs?, do I need the index or name?, and do I want a return value or a side effect? The table below is the whole map family in one grid.

Figure 3: A three-question decision flow (how many inputs, need the index or name, return value or side effect) that narrows 30+ map functions down to one.

| Variant | Inputs | Returns | Use when |
|---|---|---|---|
| `map()` | 1 | list | Output type varies, or you want a list |
| `map_dbl()` | 1 | double vector | Every call returns one double |
| `map_int()` | 1 | integer vector | Every call returns one integer |
| `map_chr()` | 1 | character vector | Every call returns one string |
| `map_lgl()` | 1 | logical vector | Every call returns `TRUE`/`FALSE` |
| `map2()` / `map2_*` | 2 | list or typed vector | Iterating two paired vectors |
| `pmap()` / `pmap_*` | n (list) | list or typed vector | 3+ inputs, or a tibble of specs |
| `imap()` / `imap_*` | 1 + index | list or typed vector | You need the name or position alongside the value |
| `walk()` | 1 | input (invisibly) | Side effects only: printing, saving, logging |
| `walk2()` / `pwalk()` | 2 or n | input (invisibly) | Multi-input side effects |

Key takeaways

map() is for one input; map2() pairs two; pmap() scales to any number.
The _dbl/_int/_chr/_lgl suffix turns the list output into a flat atomic vector, and enforces the type.
imap() is the R equivalent of "enumerate"; use it whenever you'd reach for seq_along() or names() inside a manual loop.
walk() is for side effects: no list of NULLs, returns the input invisibly so pipelines keep flowing.
When in doubt, name your pmap() list elements to match the target function's argument names; that often removes the need for a lambda.


Continue Learning

Functional Programming in R: the broader paradigm that makes map feel natural (first-class functions, pure functions, and composition).
Writing R Functions: how to write clean functions for the .f argument you keep passing to every map variant.
dplyr Basics: the natural companion to purrr; mutate() + pmap() is the workflow for per-row computations on tibbles.
