tidyr crossing() in R: All Combinations (Wrapper Around expand_grid)
The crossing() function in tidyr generates a tibble of all unique combinations of values from named vectors. It is a wrapper around expand_grid() with one extra step: it deduplicates and sorts each input before combining them.
crossing(year = 2020:2024, product = c("X","Y"))
crossing(x = c(1,1,2), y = c(3,3,4)) # dedupes inputs first
expand_grid(...) # similar, no dedup of inputs
expand.grid(...) # base R alternative
Need explanation? Read on for examples and pitfalls.
What crossing() does in one sentence
crossing(...) returns a tibble of all unique combinations of values from named arguments. It deduplicates and sorts each input first, then passes the cleaned inputs to expand_grid().
Syntax
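crossing(...), where ... is one or more named vectors, lists, or data frames. Each name becomes a column in the output.
crossing() and expand_grid() differ in how they treat inputs: crossing() deduplicates and sorts them, expand_grid() uses them as given. expand_grid(x = c(1,1,2)) returns 3 rows; crossing(x = c(1,1,2)) returns 2 rows (deduped). For inputs that are already unique and sorted, the results are identical.
Five common patterns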
1. Standard combinations
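A minimal sketch of the standard case, reusing the year and product values from the cheat sheet at the top of this page. The object name grid is only illustrative.

```r
library(tidyr)

# Every year paired with every product
grid <- crossing(year = 2020:2024, product = c("X", "Y"))

nrow(grid)   # 5 years * 2 products = 10 rows
names(grid)  # "year" "product"
```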
2. Inputs with duplicates
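The sketch below compares the two functions on inputs that contain repeated values. The object names deduped and raw are illustrative.

```r
library(tidyr)

# crossing() keeps only the unique values of each input: x in (1, 2), y in (3, 4)
deduped <- crossing(x = c(1, 1, 2), y = c(3, 3, 4))
nrow(deduped)  # 2 * 2 = 4 rows

# expand_grid() uses the inputs exactly as given
raw <- expand_grid(x = c(1, 1, 2), y = c(3, 3, 4))
nrow(raw)      # 3 * 3 = 9 rows
```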
3. Three-vector grid
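Adding a third argument multiplies the row count again. The size, color, and region values here are made up for illustration.

```r
library(tidyr)

# Three inputs: 3 sizes * 2 colors * 2 regions
grid3 <- crossing(
  size   = c("S", "M", "L"),
  color  = c("red", "blue"),
  region = c("EU", "US")
)

nrow(grid3)  # 12 rows
```

Because crossing() sorts its inputs, the character columns come back in sorted order rather than in the order they were typed.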
4. List inputs
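A list can be passed as an input, in which case the result holds a list-column. This is a small sketch with hypothetical id and payload names.

```r
library(tidyr)

# Each element of the list is treated as one value to combine
with_list <- crossing(id = 1:2, payload = list(1:3, c("a", "b")))

nrow(with_list)             # 2 ids * 2 list elements = 4 rows
is.list(with_list$payload)  # TRUE: payload is a list-column
```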
5. Use as join target
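One way to use the grid as a join target is to left-join observed data onto it, so that missing combinations show up as NA. The sketch assumes dplyr is installed; the sales data is invented for illustration.

```r
library(tidyr)
library(dplyr)

# Observed data: the (2021, "Y") combination is missing
sales <- tibble(
  year    = c(2020L, 2020L, 2021L),
  product = c("X", "Y", "X"),
  units   = c(10, 5, 8)
)

# Full grid of every year/product pair
grid <- crossing(year = 2020:2021, product = c("X", "Y"))

# Joining onto the grid exposes the gap as an NA
full <- left_join(grid, sales, by = c("year", "product"))

nrow(full)              # 4 rows
sum(is.na(full$units))  # 1 missing combination
```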
crossing() was the original tidyr function; expand_grid() was added later under a clearer name. Both are still exported, and crossing() is now a thin wrapper that deduplicates and sorts its inputs before calling expand_grid().
crossing() vs expand_grid() vs expand() vs expand.grid()
| Function | Dedupes inputs | Output | Origin |
|---|---|---|---|
| crossing(...) | Yes | Tibble | tidyr (original) |
| expand_grid(...) | No | Tibble | tidyr (newer) |
| expand(data, ...) | Yes (unique values) | Tibble | tidyr |
| expand.grid(...) | No | Data frame | base R |
When to use which:
- crossing for tidy combinations with input dedup.
- expand_grid for raw cartesian product.
- expand for combinations from existing data.
A practical workflow
Use crossing for "every possible scenario" tables.
For testing combinations, crossing is the cleanest tool.
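As a rough sketch of a scenario table, the block below builds every rate/horizon pair and computes one value per row. The rate and horizon values and the compound-growth formula are assumptions chosen only to illustrate the workflow.

```r
library(tidyr)

# Every combination of growth rate and time horizon
scenarios <- crossing(rate = c(0.01, 0.05), horizon = c(1, 5, 10))

# One computed result per scenario: 100 units compounded at `rate` for `horizon` periods
scenarios$value <- 100 * (1 + scenarios$rate)^scenarios$horizon

nrow(scenarios)  # 2 rates * 3 horizons = 6 scenarios
```

The same shape works for test matrices: each row is one parameter combination to run.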
Common pitfalls
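Pitfall 1: row count growth. crossing(a = 1:10, b = 1:10, c = 1:10) returns 1,000 rows, and three vectors of 100 values each would return 1,000,000. The row count is the product of the number of unique values in each input, so check the expected size before building the grid.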
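A simple way to check the size up front is to multiply the unique lengths before calling crossing(). The list name inputs is illustrative, and the check uses only base R.

```r
library(tidyr)

inputs <- list(a = 1:10, b = 1:10, c = 1:10)

# Expected rows = product of the number of unique values per input
expected_rows <- prod(lengths(lapply(inputs, unique)))
expected_rows  # 1000

# Build the grid only once the size looks reasonable
grid <- do.call(crossing, inputs)
nrow(grid) == expected_rows  # TRUE
```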
Pitfall 2: dedupe surprise. If you DEPEND on duplicates appearing, use expand_grid instead of crossing.
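The sketch below shows what crossing() changes on a single input with a repeated, unsorted value: the duplicate is dropped and the values are sorted, while expand_grid() returns the input untouched.

```r
library(tidyr)

x <- c("b", "a", "b")

# crossing(): duplicate "b" removed, values sorted
crossing(x = x)$x     # "a" "b"

# expand_grid(): input kept exactly as supplied
expand_grid(x = x)$x  # "b" "a" "b"
```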
crossing() quietly dedupes and sorts its inputs, but expand_grid() does not. That is the only behavioral difference. For inputs that are already unique and sorted, the two are equivalent.
Try it yourself
Try it: Generate all combinations of 3 ratings and 2 modes. Save to ex_grid.
Click to reveal solution
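One possible solution, assuming three rating labels and two mode labels. The specific values are placeholders; any 3 unique ratings and 2 unique modes give the same row count.

```r
library(tidyr)

# 3 ratings crossed with 2 modes
ex_grid <- crossing(
  rating = c("low", "medium", "high"),
  mode   = c("online", "offline")
)

nrow(ex_grid)  # 6
```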
Explanation: 3 ratings * 2 modes = 6 combinations.
Related tidyr / base functions
After mastering crossing, look at:
- expand_grid(): same but no dedup
- expand(): from existing data
- complete(): expand + merge with data
- nesting(): preserve column pairs
- base::expand.grid(): base R alternative
FAQ
What does crossing do in tidyr?
crossing(...) returns a tibble of all unique combinations of values from named vector arguments. Inputs are deduplicated and sorted first.
What is the difference between crossing and expand_grid?
crossing deduplicates and sorts its inputs first. expand_grid does not. For inputs that are already unique and sorted, they're identical.
Which is preferred for new code?
Either works. expand_grid is more discoverable (the name describes the action) and leaves inputs untouched; crossing is shorter and is the better fit when you want duplicates removed.
Does crossing return a data frame or tibble?
A tibble. Use as.data.frame() if you need a base data frame.
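A quick check of the return type, with the conversion mentioned above. The object names are illustrative.

```r
library(tidyr)

out <- crossing(x = 1:2, y = c("a", "b"))
inherits(out, "tbl_df")  # TRUE: the result is a tibble

# Convert when a plain base data frame is required
df <- as.data.frame(out)
inherits(df, "tbl_df")   # FALSE
```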
What is the order of rows in crossing?
The LAST argument varies fastest. crossing(a=1:2, b=1:3) returns 6 rows, with b cycling fastest.
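The ordering can be confirmed by inspecting the columns directly. The comparison with base expand.grid() is included because it cycles the first argument fastest instead.

```r
library(tidyr)

ord <- crossing(a = 1:2, b = 1:3)
ord$a  # 1 1 1 2 2 2  (first argument varies slowest)
ord$b  # 1 2 3 1 2 3  (last argument varies fastest)

# Base R uses the opposite convention
base_ord <- expand.grid(a = 1:2, b = 1:3)
base_ord$a  # 1 2 1 2 1 2
```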