purrr cross() in R: Generate All List Combinations
The purrr cross() function generates every combination of elements from a set of lists (the Cartesian product) and returns it as a list of lists. It is the building block for grid searches, parameter sweeps, and exhaustive test cases.
    cross(list(a = 1:2, b = 3:4))      # all combinations as a list
    cross2(1:3, c("x", "y"))           # combinations of two vectors
    cross3(1:2, 1:2, 1:2)              # combinations of three
    cross_df(list(a = 1:2, b = 3:4))   # combinations as a data frame
    cross2(1:3, 1:3, .filter = `==`)   # drop pairs that match
    expand_grid(a = 1:2, b = 3:4)      # modern, non-deprecated replacement (tidyr)
Need explanation? Read on for examples and pitfalls.
What cross() does in purrr
The cross function builds a Cartesian product. You pass it a list of input vectors or lists, and it returns one element for every possible pairing of values across those inputs. Cross two inputs of length two and three, and you get six combinations back. The result is a flat list, and each element is itself a small list that holds exactly one value drawn from each input.
This pattern appears constantly in real data work. A hyperparameter grid search needs every pairing of learning rate and tree depth. A simulation study needs every combination of sample size and effect size. The cross family turns a handful of short vectors into the full grid, so you loop over it once with no nested loops to maintain.
cross() syntax and arguments
The signature is short, and the .filter argument is the powerful part. Every member of the cross family shares the same basic shape, differing only in how many inputs it accepts.
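As a quick check of that shared shape, the argument names can be read straight from the installed functions. This is a minimal sketch, assuming a purrr version that still ships the deprecated cross family; it only inspects the functions and does not call them.

```r
library(purrr)

# Collect the formal argument names of each member of the cross family.
# Inspecting a function does not run it, so no deprecation warning appears.
family <- list(cross = cross, cross2 = cross2, cross3 = cross3, cross_df = cross_df)
signatures <- lapply(family, function(f) names(formals(f)))

str(signatures)
```

Every member should list .filter last, with .l or the positional .x, .y, .z inputs in front of it.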
The arguments work as follows:
- .l is a list of vectors or lists to cross, ideally named. Names carry through, so each output element is labelled.
- .x, .y, and .z are individual vectors for the fixed-arity helpers cross2 and cross3.
- .filter is a predicate function that takes one argument per input being crossed (two for cross2, three for cross3). Any combination for which it returns TRUE is removed from the result before it is handed back.
cross() examples
Start with a named list so the output stays labelled. Passing a named list means every combination carries the input names, which makes the result easy to read and easy to feed into pmap later.
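A minimal sketch of the named-list form follows. The input names size and color and their values are assumptions chosen for illustration, and the call assumes a purrr version that still includes the deprecated cross function.

```r
library(purrr)

# Hypothetical inputs: two short character vectors, both named.
combos <- cross(list(
  size  = c("small", "large"),
  color = c("red", "blue")
))

length(combos)    # 2 x 2 = 4 combinations
str(combos[[1]])  # a named list holding one size and one color
str(combos[[2]])  # only size has changed from the first element
```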
The first input varies fastest, so element one is the small-and-red pairing, element two is large-and-red, and the blue pairings follow. When you want a rectangular result instead of a nested list, reach for the data-frame variant, which returns a tibble with one column per input.
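The data-frame variant can be sketched on the same illustrative inputs. The names size and color are again assumptions, and cross_df needs the tibble package installed.

```r
library(purrr)

# Same hypothetical inputs as before, returned as a tibble instead of a nested list.
grid <- cross_df(list(
  size  = c("small", "large"),
  color = c("red", "blue")
))

grid  # four rows, one column per input, first input varying fastest
```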
A common workflow pairs the data-frame output with pmap to evaluate something at every grid point. Tibble columns line up with the function arguments, so each row triggers one call.
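The sketch below illustrates that workflow under stated assumptions: the grid columns n and effect and the toy function noncentrality are hypothetical stand-ins for whatever you evaluate at each grid point.

```r
library(purrr)

# Hypothetical grid of sample sizes and effect sizes.
grid <- cross_df(list(n = c(10, 100), effect = c(0.2, 0.5)))

# Toy function; its argument names match the grid column names.
noncentrality <- function(n, effect) effect * sqrt(n)

# pmap_dbl() calls the function once per row and returns a numeric vector.
grid$ncp <- pmap_dbl(grid, noncentrality)
grid
```

Using pmap_dbl rather than pmap is a design choice here: it returns a plain numeric vector that can sit in a column, and it fails loudly if any call returns something that is not a single number.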
The .filter argument trims the grid before it is returned. The predicate receives one value from each input (two values when you use cross2), and any combination it flags as TRUE is dropped. A frequent use is removing pairs where the two values are equal, which leaves only the off-diagonal entries.
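A small sketch of filtering on equality, assuming a purrr version that still includes cross2. The base R `==` operator serves as the two-argument predicate.

```r
library(purrr)

# Full grid versus the grid with matching pairs removed.
all_pairs <- cross2(1:3, 1:3)
off_diag  <- cross2(1:3, 1:3, .filter = `==`)

length(all_pairs)  # 9
length(off_diag)   # 6

# No surviving pair has two equal values.
any(vapply(off_diag, function(p) p[[1]] == p[[2]], logical(1)))  # FALSE
```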
A full cross of the integers one through three with themselves has nine pairs. The equality predicate flags the three pairs with matching values, so six combinations remain.
cross() vs expand_grid(): the modern replacement
The expand_grid function from tidyr does the same job and is not deprecated. It accepts inputs directly as named arguments and returns a tibble, so it drops straight into a tidyverse pipeline without any conversion step.
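For comparison, here is a minimal expand_grid sketch on illustrative inputs. The column names size and color are assumptions, not part of the tidyr API.

```r
library(tidyr)

# Inputs go in directly as named arguments; the result is a tibble.
grid <- expand_grid(
  size  = c("small", "large"),
  color = c("red", "blue")
)

grid  # four rows; color (the last argument) changes on every row
```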
Notice the ordering difference. The expand_grid function varies its last argument fastest, while the cross family varies its first input fastest. The set of combinations is identical in both cases; only the row order changes. The table below summarizes when to pick each option.
| Function | Output | Status | Best for |
|---|---|---|---|
| cross() / cross_df() | list of lists / tibble | Deprecated | Legacy purrr code |
| expand_grid() | tibble | Active | New tidyverse code |
| expand.grid() | data.frame | Active (base R) | Base-only scripts |
To run a function over the modern grid, pair expand_grid with pmap exactly as with the data-frame cross variant.
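A sketch of that pairing follows. The column names learning_rate and depth and the placeholder function toy_score are hypothetical; in practice the function would fit and score a model.

```r
library(tidyr)
library(purrr)

# Hypothetical hyperparameter grid: 2 learning rates x 3 depths = 6 rows.
grid <- expand_grid(learning_rate = c(0.01, 0.1), depth = c(2, 4, 6))

# Placeholder scoring function; argument names match the grid columns.
toy_score <- function(learning_rate, depth) learning_rate * depth

grid$score <- pmap_dbl(grid, toy_score)
grid
```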
Common pitfalls
Three mistakes account for most cross confusion.
- Forgetting the deprecation warning. In a fresh session the cross family prints a lifecycle warning the first time it runs. It is a warning, not an error, but it clutters logs and worries reviewers. Switching to expand_grid removes the noise entirely.
- Expecting cross and expand_grid to agree on row order. They produce the same combinations in a different sequence. Never compare their outputs row by row without sorting both first.
- Passing an unnamed list to cross. The result elements then have no names, so any downstream pmap call that relies on argument names will fail. Always name the inputs in the list you supply.
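The row-order pitfall can be illustrated with a short sketch. The helper sort_rows is hypothetical and written only for this example, and the comparison assumes both grids use the column names a and b.

```r
library(purrr)
library(tidyr)

from_cross  <- as.data.frame(cross_df(list(a = 1:2, b = 3:4)))
from_expand <- as.data.frame(expand_grid(a = 1:2, b = 3:4))

# Hypothetical helper: order rows by every column and reset the row names.
sort_rows <- function(df) {
  out <- df[order(df$a, df$b), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Same combinations, different sequence: a direct comparison does not match.
same_unsorted <- isTRUE(all.equal(from_cross, from_expand))
# After sorting both, the two grids agree.
same_sorted <- isTRUE(all.equal(sort_rows(from_cross), sort_rows(from_expand)))

c(unsorted = same_unsorted, sorted = same_sorted)
```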
Try it yourself
Try it: Use a cross helper to build every combination of three two-element vectors, then confirm the count. Save the result to ex_combos.
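One possible solution is sketched below. The three input vectors are arbitrary choices for illustration, and the call assumes a purrr version that still includes cross3.

```r
library(purrr)

# Three two-element vectors of different types.
ex_combos <- cross3(c("a", "b"), c(1, 2), c(TRUE, FALSE))

length(ex_combos)           # 8
unique(lengths(ex_combos))  # 3 values in every combination
```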
Explanation: The cross3 helper crosses three inputs, so the count is two times two times two, which equals eight. Each element is a list of three values, one drawn from each vector.
Related purrr functions
These functions show up alongside the cross family in iteration code:
- map() applies a function to each element of one list.
- map2() walks two lists in parallel without crossing them, so the inputs must be the same length.
- pmap() applies a function across rows of a data frame or list of lists, the natural partner for cross_df and expand_grid.
- transpose() flips a list of lists, turning per-combination records into per-field columns.
- cross2() and cross_df() are the fixed-arity and data-frame variants of the plain cross function.
For the bigger picture of list iteration in R, see the Functional Programming in R guide. The official function reference lives at purrr.tidyverse.org.
FAQ
Is purrr cross() deprecated?
Yes. The cross, cross2, cross3, and cross_df functions were all deprecated in purrr 1.0.0, released in December 2022. They still work and return correct results, but they emit a lifecycle warning the first time they run in a session. The tidyverse team recommends the expand_grid function from tidyr for all new code.
What is the difference between cross() and expand_grid()?
Both produce the Cartesian product of their inputs. The cross function returns a list of lists and varies its first input fastest. The expand_grid function returns a tibble and varies its last input fastest. The combinations themselves are identical; only the container type and the row order differ. Because expand_grid is the maintained function, prefer it unless you specifically need the nested-list shape.
How do I run a function over every combination?
Build the grid with cross_df or expand_grid, then feed it to pmap. The pmap call evaluates the function once per combination, matching the grid column names to the function argument names. This pattern powers grid search and parameter sweeps.
What does the .filter argument do in cross()?
The .filter argument takes a predicate function with one argument per input being crossed, so two arguments for a pairwise cross. The cross function applies that predicate to each candidate combination and removes the ones where it returns TRUE. For example, crossing the integers one through three with themselves and filtering on equality drops every pair whose two values match, leaving only the off-diagonal combinations.
Can I cross more than three vectors?
Yes. The plain cross function and cross_df accept a list of any length, so you can cross four or more inputs in one call. The numbered helpers stop at cross3. With many inputs the result grows multiplicatively, so a list of five vectors with ten elements each yields one hundred thousand combinations.
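A brief sketch of crossing four inputs and predicting the size of the result before building it. The input names a through d are hypothetical, and the call assumes a purrr version that still includes cross.

```r
library(purrr)

# Four hypothetical inputs of different lengths.
inputs <- list(a = 1:2, b = 1:3, c = 1:2, d = 1:5)

# The grid size is the product of the input lengths.
expected <- prod(lengths(inputs))  # 2 * 3 * 2 * 5 = 60
combos <- cross(inputs)
length(combos) == expected

# The same arithmetic for five inputs of ten elements each, without building the grid.
prod(rep(10, 5))  # 1e+05
```

Computing the product first is a cheap way to decide whether a grid is small enough to materialize at all.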