purrr pwalk() in R: Side Effects Over Parallel Lists
pwalk() from purrr calls a function across any number of parallel inputs, typically three or more, for its side effects, then returns the input list invisibly. It is the side-effect cousin of pmap().
    pwalk(list(x, y, z), f)              # 3+ parallel inputs
    pwalk(df, f)                         # iterate data frame rows
    pwalk(list(a, b), ~ cat(..1, ..2))   # formula shorthand
    pwalk(list(base = b, exp = e), fun)  # named list, matched by arg name
    pwalk(args, f, sep = ", ")           # extra args after .f
    walk2(x, y, f)                       # exactly two inputs
Need explanation? Read on for examples and pitfalls.
What pwalk() does in one sentence
pwalk() runs a function for its side effects across any number of parallel inputs. You hand it a list of equal-length vectors (or a data frame), and it calls the function once per position, drawing one value from each vector. Side effects mean printing, plotting, writing files, or logging: actions whose value is the action itself, not a returned object. Unlike pmap(), which collects results into a list, pwalk() throws the return values away and hands back the original input invisibly so it slots cleanly into a pipeline.
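The difference in return values can be seen in a minimal sketch. The vectors x, y, and z and the argument names a, b, and label are made up for illustration.

```r
library(purrr)

# Illustrative inputs (not from any real dataset).
x <- c(1, 2, 3)
y <- c(10, 20, 30)
z <- c("a", "b", "c")

# pwalk() calls the function once per position and discards what it returns.
out <- pwalk(list(x, y, z), function(a, b, label) cat(label, ":", a + b, "\n"))

# What comes back is the original input list, not the computed sums.
identical(out, list(x, y, z))

# pmap() with the same inputs collects the results instead.
sums <- pmap(list(x, y, z), function(a, b, label) a + b)
```

Because the result of pwalk() is invisible, it only shows up if you assign it or inspect it explicitly.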
Syntax
The signature, pwalk(.l, .f, ...), has two required arguments plus a passthrough. The p prefix stands for parallel, and walk marks it as a side-effect function.
- .l is a list of vectors, or a data frame. Every element must be the same length. Each element supplies one argument to .f.
- .f is the function to call. It accepts a plain function, an anonymous function, or a purrr formula using ..1, ..2, ..3 for the positional inputs.
- ... holds extra named arguments. They are passed unchanged to every call of .f.
The length of .l decides how many arguments .f receives. A list of three vectors calls a three-argument function. If .l is named, the names are matched to the function's argument names, so order does not matter.
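As a rough illustration of the three forms .f can take, the sketch below runs the same three-input call each way. The names inputs and show_sum are hypothetical.

```r
library(purrr)

# Three parallel inputs, so .f receives three arguments per call.
inputs <- list(1:3, 4:6, 7:9)

# Hypothetical helper that prints the sum of its three arguments.
show_sum <- function(a, b, c) cat(a + b + c, "\n")

pwalk(inputs, show_sum)                                # named function
pwalk(inputs, function(a, b, c) cat(a + b + c, "\n"))  # anonymous function
pwalk(inputs, ~ cat(..1 + ..2 + ..3, "\n"))            # formula shorthand
```

All three calls should print the same three sums; which form to use is mostly a readability choice.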
pwalk(df, f) iterates over rows for free: each column becomes one argument, and each row supplies one set of values.
Common use cases
The four patterns below cover most real uses of pwalk(). Each runs on built-in data so you can execute it as is.
1. Iterate over data frame rows
Pass a data frame straight to pwalk() to walk its rows. Column names line up with the function's argument names.
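A minimal sketch of this pattern, assuming a three-column slice of the built-in mtcars data. The names cars3 and show_car are hypothetical.

```r
library(purrr)

# Keep only the columns the function accepts; pwalk() passes every column.
cars3 <- head(mtcars[, c("mpg", "cyl", "hp")], 3)

# Hypothetical printer whose argument names match the column names.
show_car <- function(mpg, cyl, hp) {
  cat(mpg, "mpg,", cyl, "cyl,", hp, "hp\n")
}

pwalk(cars3, show_car)
```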
The function ran once per row and printed a formatted line. Nothing was returned because printing is a side effect.
2. Three parallel vectors with formula shorthand
Wrap loose vectors in list() when they are not already a data frame. The formula form uses ..1, ..2, ..3 to reach each input.
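One way this could look, using R's built-in month.name and month.abb constants as the loose vectors. The variable names are illustrative.

```r
library(purrr)

# Three separate vectors of equal length.
idx <- 1:3
short <- month.abb[1:3]
full <- month.name[1:3]

# ..1, ..2, ..3 refer to the first, second, and third list elements.
pwalk(
  list(idx, short, full),
  ~ cat(sprintf("%d. %s (%s)\n", ..1, ..3, ..2))
)
```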
3. Pass extra arguments after the function
Anything after .f is forwarded to every call. Here a shared output folder is constant across all three writes.
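A sketch of the idea, assuming a hypothetical helper write_slice() and a temporary directory as the shared folder. The file names and row counts are made up.

```r
library(purrr)

# Hypothetical helper: writes the first n rows of mtcars to folder/name.csv.
write_slice <- function(name, n, folder) {
  path <- file.path(folder, paste0(name, ".csv"))
  write.csv(head(mtcars, n), path)
}

# A throwaway output folder so the example leaves no files behind in your project.
out_dir <- tempfile("pwalk-demo-")
dir.create(out_dir)

# Two parallel inputs are iterated; folder is not in .l, so it stays constant.
pwalk(
  list(c("small", "medium", "large"), c(3, 5, 10)),
  write_slice,
  folder = out_dir
)

list.files(out_dir)
```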
The folder argument was supplied once through ... and reused for every file.
4. Match arguments with a named list
Name the elements of .l and pwalk() matches them to argument names. Order in the list no longer matters.
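For illustration, the sketch below lists the named elements in the opposite order to the function's arguments. The function show_power and its values are hypothetical.

```r
library(purrr)

# Hypothetical function with arguments named base and exp.
show_power <- function(base, exp) {
  cat(base, "^", exp, "=", base^exp, "\n")
}

# exp comes first in the list, but matching is by name, not position.
pwalk(list(exp = c(2, 3, 4), base = c(10, 2, 3)), show_power)
```

If the names were dropped, the first vector would be treated as base and the results would change.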
pwalk() vs pmap() vs walk2()
Pick the function by input count and whether you need a return value. All three iterate in parallel; they differ in arity and output.
| Function | Inputs | Returns | Use when |
|---|---|---|---|
| pwalk() | 3 or more (a list) | .l invisibly | Side effects over many parallel inputs |
| pmap() | 3 or more (a list) | A list of results | You need the computed values back |
| walk2() | Exactly 2 | .x invisibly | Side effects over two inputs only |
Decision rule: if you want the results, use pmap(). If you only want the side effect and you have three or more inputs, use pwalk(). With exactly two inputs, walk2() reads more clearly than wrapping them in a list.
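A quick way to check that the two-input case behaves the same under either function is to capture the printed output of both. The vectors here are illustrative.

```r
library(purrr)

x <- c("a", "b", "c")
y <- 1:3

# Same side effect written with walk2() and with pwalk() on a two-element list.
two_a <- capture.output(walk2(x, y, ~ cat(.x, .y, "\n")))
two_b <- capture.output(pwalk(list(x, y), ~ cat(..1, ..2, "\n")))

identical(two_a, two_b)
```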
walk2(x, y, f) is equivalent to pwalk(list(x, y), f). Reach for pwalk() the moment a third input appears.
Common pitfalls
Most pwalk() errors trace back to mismatched lengths or names. Three mistakes account for nearly all of them.
The first trap is unequal lengths. Every vector in .l must share one length (or be length 1, which recycles). Trim or pad inputs before the call.
The second trap is mismatched names. If you call pwalk(list(a = x, b = y), function(p, q) ...), R cannot match a to p and the call fails. Either drop the names for positional matching or align them with the function's arguments.
The third trap is expecting output. result <- pwalk(...) gives you back .l, not the values your function computed. Switch to pmap() when you need results collected.
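The three pitfalls can be reproduced in a short sketch. The tryCatch() wrappers are only there so the script keeps running, and the variable names are hypothetical.

```r
library(purrr)

# 1. Unequal lengths: three values against two is an error.
len_err <- tryCatch(
  pwalk(list(1:3, 1:2), function(a, b) cat(a, b, "\n")),
  error = function(e) conditionMessage(e)
)

# 2. Names in .l (a, b) that the function (p, q) does not accept.
name_err <- tryCatch(
  pwalk(list(a = 1:2, b = 3:4), function(p, q) cat(p, q, "\n")),
  error = function(e) conditionMessage(e)
)

# 3. Assigning pwalk() captures the input list, not the computed sums.
res <- pwalk(list(1:2, 3:4), function(p, q) p + q)
identical(res, list(1:2, 3:4))
```

Printing len_err and name_err shows the messages purrr raised; their exact wording depends on the installed purrr version.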
Try it yourself
Try it: Use pwalk() to print one line per row of a small data frame holding three fruit names and three prices. Save the data frame to ex_fruit first.
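One possible solution, sketched with made-up fruit names and prices:

```r
library(purrr)

# Hypothetical exercise data: three fruits and three prices.
ex_fruit <- data.frame(
  fruit = c("apple", "banana", "cherry"),
  price = c(0.5, 0.25, 3),
  stringsAsFactors = FALSE
)

# Argument names match the column names, so each row maps to one call.
pwalk(ex_fruit, function(fruit, price) cat(fruit, "costs", price, "\n"))
```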
Explanation: The data frame is a list of two equal-length columns, so pwalk() calls the function once per row with fruit and price matched by name.
Related purrr functions
- pmap() returns a list of results from parallel inputs.
- walk2() runs side effects over exactly two inputs.
- walk() handles the single-input side-effect case.
- iwalk() walks one input with access to its index or names.
- imap() maps over an input and its index together.
FAQ
What is the difference between pwalk() and pmap()?
Both iterate over a list of parallel inputs. pmap() collects what the function returns into a result list, so you use it for transformations. pwalk() discards return values and hands back the input list invisibly, so you use it for side effects like printing or writing files. If you find yourself ignoring the output of pmap(), switch to pwalk() to signal intent.
How does pwalk() handle a data frame?
A data frame is internally a list of equal-length columns. When you pass one to pwalk(), each column becomes one argument and each row supplies one set of values. The column names are matched to your function's argument names, which makes pwalk(df, f) a clean way to iterate over rows without apply() or an explicit loop.
Can pwalk() pass extra arguments to the function?
Yes. Any named argument placed after .f is forwarded through ... to every call. This is useful for values that stay constant across iterations, such as an output directory, a separator, or a connection. The constant argument is not part of .l, so it is not iterated.
Why does pwalk() return nothing visible?
pwalk() returns its input .l invisibly by design. Side-effect functions should not clutter the console with output, and returning .l lets pwalk() sit inside a pipe without breaking the chain. If you genuinely need the original list back, assign the result, or wrap the call in parentheses to print it.