purrr pmap() in R: Iterate Over Many Lists in Parallel
The pmap() function in purrr applies a function to any number of lists, vectors, or data frame columns in parallel, stepping through all of them at once. Type-safe variants (pmap_dbl, pmap_chr, pmap_dfr) return a specific atomic type or a data frame instead of a list.
pmap(list(a, b, c), ~ ..1 + ..2 + ..3)   # list output
pmap_dbl(list(a, b), ~ ..1 * ..2)        # numeric vector
pmap_chr(df, paste)                      # character vector
pmap(df, function(x, y) x + y)           # named args from columns
pmap_dfr(list(a, b), make_row)           # row-bind to data frame
pmap(list(a, b), fn, na.rm = TRUE)       # extra args after .f
pwalk(list(data, paths), write.csv)      # side effects, no return (data first: write.csv(x, file))
Need explanation? Read on for examples and pitfalls.
What pmap() does in one sentence
pmap(.l, .f) calls .f once per position, drawing one element from every list inside .l. Element i of the result is .f applied to the i-th element of each input, so all inputs must have the same length; the one exception is a length-1 input, which is recycled to match the others.
While map() walks one input and map2() walks two, pmap() walks any number of inputs in lockstep. You pass them as a single list, which is why pmap() has no fixed limit on input count. A data frame is itself a list of equal-length columns, so pmap() over a data frame iterates row by row.
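A minimal sketch of the lockstep behaviour, assuming purrr is installed. The three vectors are made up for illustration.

```r
library(purrr)

# Three hypothetical inputs of equal length
ones     <- c(1, 2, 3)
tens     <- c(10, 20, 30)
hundreds <- c(100, 200, 300)

# Call i receives ones[i], tens[i], hundreds[i]
sums <- pmap(list(ones, tens, hundreds), function(x, y, z) x + y + z)

str(sums)  # a list of three numbers: 111, 222, 333
```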
Syntax
pmap(.l, .f, ...). .l is a list of inputs, .f is the function or lambda, and ... holds extra arguments passed to .f.
Inside a purrr formula lambda, refer to inputs positionally as ..1, ..2, ..3, and so on. If .l has named elements, you can instead write a regular function whose argument names match those names. Every element of .l must have the same length, except length-1 elements, which are recycled.
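The three calling styles described above can be sketched as follows. The input list and the `multiplier` argument are hypothetical names chosen for this illustration.

```r
library(purrr)

# A named list: the names x and y matter for the second and third calls
inputs <- list(x = c(1, 2, 3), y = c(4, 5, 6))

# Formula lambda: inputs are referenced by position
by_position <- pmap_dbl(inputs, ~ ..1 * ..2)

# Regular function: argument names match the names in .l
by_name <- pmap_dbl(inputs, function(x, y) x * y)

# An extra argument placed after .f is passed unchanged to every call
scaled <- pmap_dbl(
  inputs,
  function(x, y, multiplier) x * y * multiplier,
  multiplier = 10
)
```

The first two calls produce the same numeric vector; the third multiplies each product by 10.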
Use a typed pmap_*() variant whenever you know the output type. pmap_dbl() returns a numeric vector and errors if any call returns something that cannot be coerced to a single number, which catches bugs early. Use plain pmap() only when results are mixed types or genuinely need to stay a list.
Five common patterns
1. Plain pmap (list output)
pmap() with no suffix always returns a list, one element per position. Use it when each call produces something that does not flatten cleanly, such as a vector or a model object.
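A minimal sketch of this pattern, with made-up input vectors: each call summarises one position as a named vector of minimum, mean, and maximum.

```r
library(purrr)

# Hypothetical measurements from three sources
first  <- c(1, 5, 9)
second <- c(4, 2, 7)
third  <- c(3, 8, 6)

# Each call returns a named numeric vector of length 3
summaries <- pmap(
  list(first, second, third),
  function(x, y, z) {
    values <- c(x, y, z)
    c(min = min(values), mean = mean(values), max = max(values))
  }
)

summaries[[1]]  # summary for position 1 (values 1, 4, 3)
```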
Each call returns a length-3 vector, so a list is the only sensible container.
2. Type-safe numeric output
pmap_dbl() combines several numeric inputs into one clean numeric vector. Declaring the output type up front turns a wrong return value into an immediate error instead of a silent list.
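A minimal sketch with three illustrative vectors named base, bonus, and weight; the values are invented.

```r
library(purrr)

# Hypothetical inputs, one value per person
base   <- c(100, 200, 300)
bonus  <- c(10, 20, 30)
weight <- c(1, 0.5, 2)

# Returns a plain numeric vector rather than a list
scores <- pmap_dbl(list(base, bonus, weight), ~ (..1 + ..2) * ..3)

scores
```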
..1 is each base value, ..2 each bonus, and ..3 each weight, so element i is (base[i] + bonus[i]) * weight[i].
3. Iterate over data frame rows
Pass a data frame as .l and pmap() walks it one row at a time. Each column becomes an argument, matched by name to your function's parameters.
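A minimal sketch, assuming a small made-up data frame with the columns amount and rate:

```r
library(purrr)

# Hypothetical loan data: one row per loan
loans <- data.frame(
  amount = c(1000, 2500, 400),
  rate   = c(0.05, 0.03, 0.10)
)

# One call per row; columns are matched to arguments by name
interest <- pmap_dbl(loans, function(amount, rate) amount * rate)

interest
```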
The argument names amount and rate match the column names, so purrr passes each row's values into the right slots.
pmap() over a data frame is row-wise iteration for free. This is the cleanest tidyverse answer to "apply a function to every row": no apply(), no rowwise(), no manual indexing. Name your function's arguments after the columns and let purrr do the matching.
4. Build strings from several columns
pmap_chr() glues one value from each input into a single string per position. It is the multi-input version of pasting columns together.
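A minimal sketch, using an invented named list of three parallel inputs:

```r
library(purrr)

# Hypothetical parallel inputs describing three people
people <- list(
  name  = c("Ada", "Grace", "Linus"),
  role  = c("analyst", "engineer", "maintainer"),
  years = c(3, 12, 7)
)

# One string per position, built from all three inputs
labels <- pmap_chr(
  people,
  function(name, role, years) paste0(name, " (", role, ", ", years, " yrs)")
)

labels
```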
5. Combine results into a data frame
pmap_dfr() calls a function that returns a data frame, then row-binds every result. This builds one tidy table from several parallel inputs.
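A minimal sketch, assuming both purrr and dplyr are installed (pmap_dfr() relies on dplyr for the row-binding). The helper make_row() and its inputs are hypothetical.

```r
library(purrr)

# Hypothetical helper: builds a one-row data frame from one id and one value
make_row <- function(id, value) {
  data.frame(id = id, doubled = value * 2)
}

# One data frame per position, row-bound into a single table
tbl <- pmap_dfr(
  list(id = c("a", "b", "c"), value = c(1, 2, 3)),
  make_row
)

tbl
```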
pmap() vs map2() vs mapply()
Three families iterate over several inputs in parallel, with different input limits and output guarantees. Within purrr, pmap() is the only one with no cap on input count; base R's mapply() and Map() also accept any number of inputs.
| Function | Inputs | Package | Output |
|---|---|---|---|
| map2() | Exactly 2 | purrr | List |
| pmap() | Any number (a list) | purrr | List |
| pmap_dbl() and friends | Any number | purrr | Type-strict atomic vector |
| mapply() and Map() | Any number | base | mapply(): vector, matrix, or list (auto-simplified); Map(): always a list |
Use map2() when you have exactly two inputs and pmap() once you reach three or more, or whenever the inputs already live together in a list or data frame. The typed pmap_*() family adds the safety net: you declare the output type, and a wrong return value raises an error instead of simplifying unpredictably the way mapply() does.
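The contrast can be sketched on the same made-up inputs. The last call illustrates the shape change that mapply()'s automatic simplification can introduce.

```r
library(purrr)

x <- c(1, 2, 3)
y <- c(4, 5, 6)
z <- c(7, 8, 9)

# Two inputs: map2 with the .x / .y pronouns
two_inputs <- map2_dbl(x, y, ~ .x + .y)

# Three inputs: pmap with numbered pronouns
three_inputs <- pmap_dbl(list(x, y, z), ~ ..1 + ..2 + ..3)

# Base R equivalent: inputs are passed after the function
base_result <- mapply(function(a, b, c) a + b + c, x, y, z)

# mapply() simplifies by default: length-2 results become a 2 x 3 matrix
simplified <- mapply(function(a, b) c(a, b), x, y)
is.matrix(simplified)
```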
For Python users: pmap(list(a, b, c), f) corresponds to [f(x, y, z) for x, y, z in zip(a, b, c)] or list(map(f, a, b, c)). Iterating a data frame with pmap(df, f) is like pandas df.apply(f, axis=1).
Common pitfalls
Pitfall 1: forgetting to wrap inputs in list(). pmap() takes a single argument .l. Writing pmap(a, b, c, fn) is wrong because b is treated as .f, while c and fn fall into .... Always group the inputs: pmap(list(a, b, c), fn).
Pitfall 2: a data frame with extra columns. When you pass a data frame, every column becomes an argument. If the data frame has more columns than your function accepts, the call errors. Select the needed columns first, or add ... to the function signature to absorb the rest.
Pitfall 3: reaching for .x and .y. Those pronouns belong to map() and map2() and cannot reach a third or later input. Inside a pmap() lambda, use the numbered pronouns ..1, ..2, ..3, or write a function whose argument names match the list element names.
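Pitfall 2 and its two fixes can be sketched on an invented data frame with one surplus column, note. The tryCatch() wrapper is only there so the failing call does not stop the script.

```r
library(purrr)

# Hypothetical data frame with a column the function does not use
df <- data.frame(
  amount = c(100, 200),
  rate   = c(0.1, 0.2),
  note   = c("a", "b")
)

strict <- function(amount, rate) amount * rate

# Fails: every column is passed, and strict() has no slot for note
attempt <- tryCatch(
  pmap_dbl(df, strict),
  error = function(e) "error"
)

# Fix 1: keep only the columns the function accepts
fix_select <- pmap_dbl(df[c("amount", "rate")], strict)

# Fix 2: absorb unused columns with ...
fix_dots <- pmap_dbl(df, function(amount, rate, ...) amount * rate)
```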
Try it yourself
Try it: Use pmap_dbl to iterate over a data frame. Build mtcars[1:4, c("hp", "wt")] and compute hp / wt for each row. Save the result to ex_ratio.
Solution
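One possible solution, using the built-in mtcars dataset. The intermediate name cars_subset is an illustrative choice.

```r
library(purrr)

# First four rows, keeping only the two columns the function needs
cars_subset <- mtcars[1:4, c("hp", "wt")]

# Columns hp and wt are matched to the arguments by name
ex_ratio <- pmap_dbl(cars_subset, function(hp, wt) hp / wt)

ex_ratio
```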
Explanation: pmap_dbl() walks the data frame one row at a time. The function arguments hp and wt are matched to the columns by name, and the _dbl suffix returns a numeric vector instead of a list.
Related purrr functions
After pmap, these functions cover the rest of multi-input iteration:
- pmap_dbl(), pmap_chr(), pmap_lgl(), pmap_int(): type-safe variants
- pmap_dfr(), pmap_dfc(): combine results into a data frame by row or column
- map() and map2(): iterate over one or exactly two inputs
- pwalk(): run a multi-input function for side effects, returning the input invisibly
- imap(): iterate over one input plus its index or names
The base R counterparts are Map() and mapply() for projects that avoid the tidyverse. The official argument reference lives in the purrr pmap documentation.
FAQ
What is the difference between map2 and pmap in purrr?
map2() iterates over exactly two inputs, exposed inside a lambda as .x and .y. pmap() iterates over any number of inputs, passed as a single list and referred to as ..1, ..2, ..3, and so on. Use map2() for two inputs and pmap() once you reach three or more, or whenever the inputs already sit together in a list or data frame.
How do I use pmap with a data frame?
Pass the data frame directly as the first argument: pmap(df, fn). A data frame is a list of equal-length columns, so pmap() walks it one row at a time. Give your function argument names that match the column names, and purrr matches each row's values to the right parameters automatically.
What do ..1 and ..2 mean in pmap?
..1, ..2, and ..3 are positional pronouns inside a pmap() formula lambda. ..1 is the current element of the first list in .l, ..2 the second, and so on. They let you write a compact formula such as ~ ..1 + ..2 without naming a full function. For named lists, you can instead write a regular function whose argument names match the list names.
Can pmap return a data frame?
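Yes. Use pmap_dfr() when your function returns a data frame per call; it row-binds every result into one table. Use pmap_dfc() to column-bind instead. Both rely on dplyr's bind_rows() and bind_cols(), so dplyr must be installed, and each per-call result should have a consistent set of columns. In purrr 1.0.0 and later, both variants are superseded in favor of pmap() followed by list_rbind() or list_cbind().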
How do I run pmap for side effects only?
Use pwalk() instead of pmap(). It calls the function once per position for its side effect, such as writing a file or printing, then returns the input list invisibly. pwalk(list(datasets, paths), write.csv) writes each dataset to its matching path without building a result list. The data comes first because write.csv() expects the object to write as its first argument and the file as its second.
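A minimal sketch that writes two built-in datasets to temporary files. The file names are invented, and the list elements are named x and file so that they match write.csv()'s arguments explicitly rather than relying on position.

```r
library(purrr)

# Two small datasets and a matching temporary path for each
datasets <- list(head(mtcars, 3), head(iris, 3))
paths    <- file.path(tempdir(), c("cars.csv", "iris.csv"))

# Names x and file match write.csv(x, file); nothing is collected
pwalk(list(x = datasets, file = paths), write.csv)

file.exists(paths)
```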