tidyr separate_longer_position() in R: Split Into Rows by Position
The separate_longer_position() function in tidyr 1.3 splits a string column at FIXED CHARACTER WIDTHS, creating one row per chunk. It is useful for fixed-width strings that should expand into multiple rows.
df |> separate_longer_position(col, width = 1)        # one char per row
df |> separate_longer_position(col, width = 2)        # 2-char chunks per row
df |> separate_longer_delim(col, delim = ",")         # delimiter alternative
df |> separate_wider_position(col, widths = c(...))   # to columns instead
Need explanation? Read on for examples and pitfalls.
What separate_longer_position() does in one sentence
separate_longer_position(data, cols, width) splits each value of cols into chunks of width characters and creates one row per chunk. Other columns' values are duplicated for each new row.
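A minimal sketch of that behaviour, using a made-up two-row tibble (the column names id and code are illustrative, not from any real dataset). It shows both the chunking and the duplication of the other column:

```r
library(tidyr)

# Hypothetical data: an id column plus a fixed-width code column
df <- tibble::tibble(id = c(1, 2), code = c("ABCD", "WXYZ"))

# Split each code into 2-character chunks, one row per chunk
out <- df |> separate_longer_position(code, width = 2)

out$code  # "AB" "CD" "WX" "YZ"
out$id    # 1 1 2 2  (id is repeated for every chunk of its row)
```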
Syntax
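separate_longer_position(data, cols, width, ..., keep_empty = FALSE). cols selects the column(s) to split using tidyselect syntax, and width is a single positive integer giving the number of characters per chunk. By default, rows whose string is empty are dropped; keep_empty = TRUE keeps them and fills the value with NA.
Follow the call with mutate() to assign a chunk ID.
Five common patterns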
1. Single-character split
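As a small illustration (the tibble and its word column are invented for this example), width = 1 puts every character on its own row:

```r
library(tidyr)

# Hypothetical one-row input
letters_df <- tibble::tibble(word = "tidy")

# width = 1: one row per character
chars <- letters_df |> separate_longer_position(word, width = 1)

chars$word  # "t" "i" "d" "y"
```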
2. Two-character chunks
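A sketch with a made-up string of hex pairs (the column name hex is an assumption for illustration), where each pair is treated as one observation:

```r
library(tidyr)

# Hypothetical input: three hex pairs packed into one string
hex_df <- tibble::tibble(hex = "A1B2C3")

# width = 2: one row per 2-character chunk
pairs <- hex_df |> separate_longer_position(hex, width = 2)

pairs$hex  # "A1" "B2" "C3"
```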
3. Combine with row index
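The split itself does not record where each chunk sat in the original string. One way to keep that information, sketched here with dplyr and invented column names (id, seq, pos), is to number the rows within each original row's group:

```r
library(tidyr)
library(dplyr)

# Hypothetical data: two sequences of different lengths
df <- tibble::tibble(id = c("a", "b"), seq = c("XYZ", "PQ"))

indexed <- df |>
  separate_longer_position(seq, width = 1) |>
  group_by(id) |>
  mutate(pos = row_number()) |>  # position of each character within its string
  ungroup()

indexed$pos  # 1 2 3 1 2
```

This relies on id being unique per input row; if it is not, add a row identifier before splitting.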
4. Decode bit-string
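A possible sketch of this pattern, assuming each character of a string such as "101" is a separate on/off flag (the device and bits columns are hypothetical):

```r
library(tidyr)
library(dplyr)

# Hypothetical data: one bit-string of flags per device
flags <- tibble::tibble(device = c("d1", "d2"), bits = c("101", "011"))

decoded <- flags |>
  separate_longer_position(bits, width = 1) |>
  group_by(device) |>
  mutate(
    bit_pos = row_number(),  # which flag this is, counted from the left
    on      = bits == "1"    # TRUE when the flag is set
  ) |>
  ungroup()

decoded$on  # TRUE FALSE TRUE FALSE TRUE TRUE
```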
5. Rare: variable-row decoding
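One reading of this pattern is that input strings of different lengths yield different numbers of rows. The sketch below uses invented data and the keep_empty argument described in the tidyr documentation, which keeps rows whose string is empty instead of dropping them:

```r
library(tidyr)

# Hypothetical data: strings of length 2, 3, and 0
df <- tibble::tibble(id = 1:3, x = c("ab", "def", ""))

# Default: the empty string contributes no rows, so id 3 disappears
default_rows <- df |> separate_longer_position(x, width = 2)

# keep_empty = TRUE: the empty string is kept as a single NA row
kept_rows <- df |> separate_longer_position(x, width = 2, keep_empty = TRUE)

nrow(default_rows)  # 3 rows: "ab", "de", "f"
nrow(kept_rows)     # 4 rows: the extra one is NA for id 3
```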
separate_longer_position() is uncommon: most fixed-width data goes wider (into columns), not longer (into rows). Reach for it when each character (or fixed chunk) is its own observation, not a field of an observation.
separate_longer_position() vs separate_longer_delim() vs strsplit
| Function | Splits by | Output |
|---|---|---|
| separate_longer_position() | Fixed width | New rows |
| separate_longer_delim() | Delimiter | New rows |
| separate_wider_position() | Fixed width | New columns |
| strsplit(x, "") | Per character | List of character vectors |
When to use which:
- separate_longer_position for fixed chunks to rows.
- separate_longer_delim for delimited to rows.
- separate_wider_position for fixed widths to columns.
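To make the comparison concrete, here is a sketch that runs each function on small invented inputs (the column names id, x, first, and second are illustrative):

```r
library(tidyr)

# Hypothetical inputs: one fixed-width string, one delimited string
fixed <- tibble::tibble(id = 1, x = "ABCD")
delim <- tibble::tibble(id = 1, x = "AB,CD")

# Fixed width to rows
rows_pos <- fixed |> separate_longer_position(x, width = 2)

# Delimiter to rows
rows_delim <- delim |> separate_longer_delim(x, delim = ",")

# Fixed width to columns; widths is a named vector of field widths
cols_pos <- fixed |>
  separate_wider_position(x, widths = c(first = 2, second = 2))

# Base R: a list with one character vector per input string
base_list <- strsplit("ABCD", "")

rows_pos$x       # "AB" "CD"
rows_delim$x     # "AB" "CD"
names(cols_pos)  # "id" "first" "second"
base_list[[1]]   # "A" "B" "C" "D"
```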
A practical workflow
Use it for character-level analysis of strings within a tidy framework. For per-letter analysis, add a row index so that each character keeps its position in the original string.
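A sketch of such a workflow, assuming dplyr is available and using two made-up words. The word is copied into a second column (letter, a name chosen for this example) so the original value survives the split:

```r
library(tidyr)
library(dplyr)

# Hypothetical input
words <- tibble::tibble(word = c("banana", "apple"))

chars <- words |>
  mutate(letter = word) |>                        # keep the original word
  separate_longer_position(letter, width = 1) |>  # one row per letter
  group_by(word) |>
  mutate(pos = row_number()) |>                   # position within the word
  ungroup()

# Letter frequencies per word, most frequent first
letter_counts <- chars |> count(word, letter, sort = TRUE)

letter_counts[1, ]  # banana / a / 3
```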
Common pitfalls
Pitfall 1: width may not divide the string length evenly. A 5-character string such as "AABBC" with width = 2 produces "AA", "BB", "C" (the last chunk is shorter), and no error or warning is raised. Check whether that is what you want.
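A quick way to see this and to flag short trailing chunks, sketched on an invented one-row tibble (the column name x is an assumption):

```r
library(tidyr)

# Hypothetical input: 5 characters, not a multiple of 2
df <- tibble::tibble(x = "AABBC")

chunks <- df |> separate_longer_position(x, width = 2)
chunks$x  # "AA" "BB" "C"

# Flag chunks that came out shorter than the requested width
short <- chunks[nchar(chunks$x) < 2, ]

# Or check before splitting: TRUE means the length is not a multiple of 2
uneven <- nchar(df$x) %% 2 != 0
```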
Pitfall 2: confusing wider vs longer. wider creates COLUMNS; longer creates ROWS. Pick by output shape.
separate_longer_position() requires tidyr 1.3+. Earlier versions don't have it. The pre-1.3 alternative was hand-rolled with substring() and bind_rows().
Try it yourself
Try it: Take the single string "ABCDEF" and split it into 2-character chunks (one per row). Save the result to ex_chunks.
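One possible solution, assuming the string is stored in a column named x (the column name is an illustrative choice):

```r
library(tidyr)

# Put the string in a one-row tibble, then split into 2-character rows
ex_chunks <- tibble::tibble(x = "ABCDEF") |>
  separate_longer_position(x, width = 2)

ex_chunks$x  # "AB" "CD" "EF"
```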
Explanation: Each 2-character chunk becomes a separate row.
Related tidyr functions
After mastering separate_longer_position, look at:
- separate_longer_delim(): delimiter-based
- separate_wider_position(): fixed widths to columns
- unnest_longer(): list column to rows
- strsplit(): base R alternative
FAQ
What does separate_longer_position do in tidyr?
Splits a string column into chunks of fixed character width and creates one row per chunk.
What if width does not divide the string length cleanly?
The last chunk is whatever characters remain (shorter than width). No error.
What is the difference between separate_longer_position and separate_longer_delim?
separate_longer_position() splits at fixed character widths; separate_longer_delim() splits at a delimiter. Use the position version when chunks are equal-length and the delim version when the parts vary in length.
Can I use this for character-level analysis?
Yes. Pass width = 1 to get one row per character.
Is separate_longer_position the same as strsplit?
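They are similar, but separate_longer_position() is tidyverse-friendly: it takes a data frame, returns a data frame (a tibble if you pass one), and fits into pipelines alongside dplyr verbs such as group_by(). strsplit() is base R, takes a character vector, and returns a list.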