tidyr separate_wider_delim() in R: Split Column by Delimiter Into Cols
The separate_wider_delim() function, added in tidyr 1.3.0, splits one column into multiple columns based on a delimiter. It is the modern, stricter replacement for the superseded (not deprecated) separate().
df |> separate_wider_delim(col, delim = "_", names = c("a","b","c"))
df |> separate_wider_delim(col, delim = ",", names_sep = "_")
df |> separate_wider_delim(col, delim = "-", names = c("y","m","d"), too_few = "align_start")
df |> separate_wider_delim(col, delim = "_", names = c("a","b"), too_many = "merge")
df |> tidyr::separate(col, into = c("a","b"), sep = "_") # superseded
Need explanation? Read on for examples and pitfalls.
What separate_wider_delim() does in one sentence
separate_wider_delim(data, cols, delim, names) splits the values of cols by delim and puts the parts into new columns named in names. It replaces the older separate() and makes you choose explicitly what happens to rows with too few or too many parts.
Syntax
separate_wider_delim(data, cols, delim, ..., names = NULL, names_sep = NULL, names_repair = "check_unique", too_few = "error", too_many = "error", cols_remove = TRUE)
separate_wider_delim() errors by default if rows have too few or too many parts after splitting. Pass too_few = "align_start" or too_many = "merge" to control this.
Five common patterns
1. Standard split
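A minimal sketch of the standard case, assuming a hypothetical data frame in which every row has exactly three underscore-separated parts:

```r
library(tidyr)

# Hypothetical example data: every row has exactly three parts
df <- data.frame(id = 1:2, code = c("x_1_a", "y_2_b"))

out <- df |>
  separate_wider_delim(code, delim = "_", names = c("letter", "num", "tag"))

# The new columns are character vectors; convert types yourself if needed
out$num <- as.integer(out$num)
out
```

The new columns are always character, so any numeric conversion is a separate, explicit step.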
2. Handle uneven splits with too_few
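The following sketch shows how too_few behaves on a hypothetical column of partial dates. The data and column names are made up for illustration.

```r
library(tidyr)

# Hypothetical example data: some rows are missing the month or day
df <- data.frame(date = c("2024-01-15", "2024-02", "2023"))

# "align_start" fills the missing trailing pieces with NA
start <- df |>
  separate_wider_delim(date, delim = "-", names = c("y", "m", "d"),
                       too_few = "align_start")

# "align_end" fills the missing leading pieces with NA instead
end <- df |>
  separate_wider_delim(date, delim = "-", names = c("y", "m", "d"),
                       too_few = "align_end")

start
end
```

For dates written year-first, "align_start" is usually the sensible choice because the pieces that exist line up with y and m.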
3. Merge extra parts
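A short sketch of too_many on hypothetical data, contrasting "merge" (keep the extras in the last column) with "drop" (discard them):

```r
library(tidyr)

# Hypothetical example data: rows have two, three, or four parts
df <- data.frame(x = c("a_b", "a_b_c", "a_b_c_d"))

# "merge" keeps everything after the first split in the last column
merged <- df |>
  separate_wider_delim(x, delim = "_", names = c("first", "rest"),
                       too_many = "merge")

# "drop" silently discards the extra pieces
dropped <- df |>
  separate_wider_delim(x, delim = "_", names = c("first", "rest"),
                       too_many = "drop")

merged$rest   # "b" "b_c" "b_c_d"
dropped$rest  # "b" "b" "b"
```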
4. Auto-name with names_sep
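When names is omitted, names_sep generates column names from the source column name plus a numeric suffix. A minimal sketch, assuming a hypothetical tags column in which every row has the same number of parts:

```r
library(tidyr)

# Hypothetical example data: three comma-separated tags per row
df <- data.frame(id = 1:2, tags = c("r,tidy,data", "py,pandas,frames"))

out <- df |>
  separate_wider_delim(tags, delim = ",", names_sep = "_")

names(out)  # "id" "tags_1" "tags_2" "tags_3"
```

If the rows have different numbers of parts, combine names_sep with too_few = "align_start" so shorter rows are padded with NA.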
5. Keep original column
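A sketch of keeping the input column with cols_remove = FALSE, using a hypothetical full_name column:

```r
library(tidyr)

# Hypothetical example data
df <- data.frame(full_name = c("Ada Lovelace", "Alan Turing"))

out <- df |>
  separate_wider_delim(full_name, delim = " ", names = c("first", "last"),
                       cols_remove = FALSE)

# The original full_name column is retained next to first and last
out
```

Keeping the original is handy while you are still checking that the split did what you expected.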
separate_wider_delim() is strict by default: too_few and too_many are explicit choices. This is the main improvement over separate(), which by default only warned and then dropped extra pieces or filled missing ones with NA.
separate_wider_delim() vs separate() vs str_split
| Function | Output | Status |
|---|---|---|
| separate_wider_delim() | Multiple columns | Recommended (1.3+) |
| separate_wider_position() | Multiple columns, split by position | Recommended |
| separate_wider_regex() | Multiple columns, split by regex | Recommended |
| tidyr::separate() | Multiple columns | Superseded |
| stringr::str_split() | List of vectors | Manual workflow |
When to use which:
- separate_wider_delim for delimiter-based wider split.
- separate (old) only in legacy code.
- str_split for vector-level work outside data frames.
A practical workflow
Use separate_wider_delim for parsing structured strings into columns.
Parse log entries into structured columns. too_many = "merge" handles cases where the message itself contains the delimiter.
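The log-parsing workflow could look roughly like this. The log lines and column names below are invented for illustration, and real logs may need a different delimiter or column count.

```r
library(tidyr)

# Hypothetical log lines: date, level, then a free-text message
logs <- data.frame(raw = c(
  "2024-01-15 ERROR Disk full on /dev/sda1",
  "2024-01-15 INFO Service started",
  "2024-01-16 WARN Retry 3 of 5"
))

parsed <- logs |>
  separate_wider_delim(
    raw,
    delim = " ",
    names = c("date", "level", "message"),
    too_many = "merge"  # spaces inside the message stay in `message`
  )

parsed$message[1]  # "Disk full on /dev/sda1"
```

Without too_many = "merge", every one of these rows would raise an error, because each message contains extra spaces.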
Common pitfalls
Pitfall 1: too_few = "error" by default. If any row has fewer parts than expected, separate_wider_delim errors. Switch to "align_start" or "align_end" to tolerate.
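To see this pitfall in action, the sketch below first triggers the default error on hypothetical data and then uses too_few = "debug", which keeps the original column and adds diagnostic columns showing which rows failed.

```r
library(tidyr)

# Hypothetical example data: the second row has only one part
df <- data.frame(x = c("a_b", "c"))

# Default behaviour: the short row raises an error
res <- tryCatch(
  separate_wider_delim(df, x, delim = "_", names = c("left", "right")),
  error = function(e) "error"
)

# Debug mode adds x_ok, x_pieces, and x_remainder (and emits a warning)
dbg <- suppressWarnings(
  separate_wider_delim(df, x, delim = "_", names = c("left", "right"),
                       too_few = "debug")
)

dbg[!dbg$x_ok, ]  # rows that did not split cleanly
```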
Pitfall 2: forgetting names. You must provide column names via names (or use names_sep for auto-naming).
separate_wider_delim() requires tidyr 1.3+ (Jan 2023). Earlier versions only have the superseded separate(). Check version with packageVersion("tidyr").
Try it yourself
Try it: Split a full_name column into first and last. Save to ex_split.
Explanation: Split each full_name on " " into first and last columns.
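One possible solution, sketched with a hypothetical people data frame since the exercise does not supply one:

```r
library(tidyr)

# Hypothetical input data for the exercise
people <- data.frame(full_name = c("Ada Lovelace", "Alan Turing"))

ex_split <- people |>
  separate_wider_delim(full_name, delim = " ", names = c("first", "last"))

ex_split
```

Names with more than two words would raise an error here. Adding too_many = "merge" would keep the extra words in last.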
Related tidyr functions
After mastering separate_wider_delim, look at:
- separate_wider_position(): split by character position
- separate_wider_regex(): split by regex groups
- separate_longer_delim(): split into rows instead of columns
- unite(): opposite (combine columns into one)
- tidyr::separate(): superseded predecessor
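As a quick taste of two of these relatives, here is a sketch on hypothetical data showing separate_longer_delim() producing rows and unite() reversing a wide split:

```r
library(tidyr)

# Hypothetical example data
df <- data.frame(id = 1:2, tags = c("a,b", "c"))

# One output row per tag instead of one column per tag
long <- df |> separate_longer_delim(tags, delim = ",")

# unite() pastes columns back together with a separator
parts <- data.frame(y = "2024", m = "01", d = "15")
joined <- parts |> unite("date", y, m, d, sep = "-")

long
joined$date  # "2024-01-15"
```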
FAQ
What does separate_wider_delim do in tidyr?
It splits one column into multiple new columns based on a delimiter. It replaces the older separate() and requires an explicit choice for rows that split unevenly.
What is the difference between separate_wider_delim and separate?
separate() is superseded. separate_wider_delim() has explicit too_few / too_many handling, so a mismatch raises an error instead of a warning that is easy to miss.
How do I handle rows with different numbers of parts?
Use too_few = "align_start" or "align_end" when rows have fewer parts, and too_many = "merge" to put extras in the last column (or "drop" to discard them). Without these, any mismatch raises an error.
Can I keep the original column?
Yes. Pass cols_remove = FALSE to keep the input column alongside the new ones.
What if my delimiter is a regex special character?
Pass it as a literal string. separate_wider_delim treats delim as a literal substring, not regex.
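A brief sketch of both cases on hypothetical data: a literal "." needs no escaping, and, per the tidyr documentation, wrapping the pattern in stringr::regex() opts in to regex splitting when you actually want it.

```r
library(tidyr)

# Hypothetical example data: "." is a regex metacharacter but is taken literally
files <- data.frame(file = c("report.final.pdf", "data.raw.csv"))

lit <- files |>
  separate_wider_delim(file, delim = ".", names = c("name", "stage", "ext"))

# Opting in to regex: split on any run of digits
codes <- data.frame(x = c("a1b", "c22d"))

rx <- codes |>
  separate_wider_delim(x, delim = stringr::regex("[0-9]+"),
                       names = c("left", "right"))

lit$ext   # "pdf" "csv"
rx$right  # "b" "d"
```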