stringr str_split() in R: Split String by Pattern
The str_split() function in stringr divides each input string into pieces by a pattern, returning a LIST of character vectors. Use str_split_fixed() when you need a fixed-column matrix.
str_split(x, "/")                 # list (one element per string)
str_split(x, "/", n = 2)          # split into at most 2 pieces
str_split_fixed(x, "/", n = 3)    # matrix with 3 columns
str_split_1(x, "/")               # single string -> vector (stringr 1.5+)
str_split_i(x, "/", i = 2)        # extract i-th piece
unlist(str_split(x, "/"))         # flatten to single vector
str_split(x, regex("[,;]"))       # multiple delimiters via regex
Need explanation? Read on for examples and pitfalls.
What str_split() does in one sentence
str_split(string, pattern) returns a LIST where each element is a character vector of pieces from the corresponding input string. This is necessary because different strings may produce different numbers of pieces.
For most uses, you immediately want one of: a vector (single input), a matrix (consistent piece count), or a specific piece (i-th element). stringr provides specialized functions for each case.
Syntax
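str_split(string, pattern, n = Inf, simplify = FALSE) returns a list by default. The pattern is treated as a regular expression unless you wrap it in fixed().
For a character vector of three strings, the result is a list of length 3 (one element per input string). Each element is a character vector of pieces.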
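To make the return shape concrete, here is a minimal sketch. The vector x is made-up illustration data, not taken from the original article.

```r
library(stringr)

# Hypothetical input: three strings with different numbers of pieces
x <- c("a/b/c", "d/e", "f")

pieces <- str_split(x, "/")

length(pieces)    # 3: one list element per input string
lengths(pieces)   # 3 2 1: number of pieces in each element
pieces[[2]]       # "d" "e"
```

Because the piece counts differ (3, 2 and 1), a list is the only shape that holds all three results without padding.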
str_split_fixed(x, "/", n = 3) returns a 3-column matrix. Strings with fewer pieces get empty strings; strings with more pieces have the rest concatenated into the last column. Use this when you need a tabular shape and know the maximum piece count.
Five common patterns
1. Basic split
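A basic split might look like the sketch below. The dates vector is hypothetical sample data; the second call shows how n caps the number of pieces.

```r
library(stringr)

# Hypothetical sample data
dates <- c("2024-01-15", "2023-12-31")

by_part <- str_split(dates, "-")
by_part[[1]]   # "2024" "01" "15"

# n = 2 stops after the first split; the rest stays in the last piece
capped <- str_split(dates, "-", n = 2)
capped[[1]]    # "2024" "01-15"
```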
2. Split into matrix
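As an illustration, the sketch below splits three made-up paths (the paths vector is an assumption, not from the original) into a 3-column matrix.

```r
library(stringr)

# Hypothetical sample data: the second path has only two pieces
paths <- c("usr/local/bin", "home/user", "var/log/nginx")

m <- str_split_fixed(paths, "/", n = 3)
m
#      [,1]   [,2]    [,3]
# [1,] "usr"  "local" "bin"
# [2,] "home" "user"  ""
# [3,] "var"  "log"   "nginx"
```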
A string with only 2 pieces still fills a row of 3 columns: the missing third piece becomes an empty string.
3. Single string -> vector
str_split_1() (stringr 1.5+) is for the common case of splitting ONE string. Returns a flat vector instead of a 1-element list.
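A short sketch of the single-string case, using a made-up header line (csv_line is a hypothetical name):

```r
library(stringr)

# Hypothetical single input string
csv_line <- "id,name,score"

fields <- str_split_1(csv_line, ",")
fields            # "id" "name" "score"
is.list(fields)   # FALSE: a plain character vector
```

str_split_1() expects exactly one string, so for a vector of several strings use str_split() instead.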
4. Extract i-th piece
str_split_i() does split + indexing in one call. Cleaner than sapply(str_split(x, "-"), \(p) p[2]).
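The two approaches can be compared side by side. The codes vector below is hypothetical sample data.

```r
library(stringr)

# Hypothetical sample data
codes <- c("US-CA-94105", "GB-LND-E1", "FR-IDF-75001")

# One call: split on "-" and keep the second piece of each string
second <- str_split_i(codes, "-", i = 2)
second   # "CA" "LND" "IDF"

# The longer equivalent: split into a list, then index each element
manual <- sapply(str_split(codes, "-"), \(p) p[2])
manual   # "CA" "LND" "IDF"
```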
5. Multi-character or regex delimiter
Pass a regex to split on multiple delimiters. The character class [,;] matches either comma or semicolon.
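A minimal sketch of both cases follows. The tags vector and the " -> " delimiter are made-up examples.

```r
library(stringr)

# Hypothetical sample data mixing commas and semicolons
tags <- c("red,green;blue", "one;two", "solo")

# Character class: split on either delimiter
pieces <- str_split(tags, "[,;]")
pieces[[1]]   # "red" "green" "blue"
pieces[[3]]   # "solo" (no delimiter found, string returned whole)

# Multi-character literal delimiter, wrapped in fixed()
steps <- str_split_1("a -> b -> c", fixed(" -> "))
steps         # "a" "b" "c"
```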
Use str_split_fixed() when output shape matters (data frame columns), str_split() for general flexibility, str_split_i() for extracting one piece, and str_split_1() for a single string. Each variant is optimized for a specific case; mixing them up leads to awkward unlist/sapply patterns.
Common pitfalls
Pitfall 1: forgetting that str_split() returns a LIST. Even for a single input, str_split("a/b", "/") returns a list of one element, not a vector. Use str_split_1() for vector output.
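Pitfall 2: regex pattern when literal was intended. str_split(x, ".") splits on EVERY character, because . in a regex matches any character, so you get nothing but empty strings. Use fixed(".") or the escaped regex "\\." to split on an actual dot.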
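Both pitfalls are easy to reproduce. The sketch below uses a made-up file name (file_name is a hypothetical variable).

```r
library(stringr)

# Pitfall 1: a single input still produces a list
res <- str_split("a/b", "/")
is.list(res)   # TRUE
res[[1]]       # "a" "b"

# Pitfall 2: "." as a regex matches every character
file_name <- "report.final.csv"   # hypothetical input
wrong <- str_split_1(file_name, ".")
all(wrong == "")   # TRUE: only empty strings remain

# A literal dot, either via fixed() or an escaped regex
right <- str_split_1(file_name, fixed("."))
right              # "report" "final" "csv"
escaped <- str_split_1(file_name, "\\.")
```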
Pitfall 3: str_split_fixed(x, p, n = 3) SILENTLY stops splitting once it has n pieces. No text is dropped, but if a string has 4 pieces with n = 3, the last 2 are left joined (delimiter included) in the third column. Set n high enough or use plain str_split() if you need every piece.
Try it yourself
Try it: Split each email address in emails to get the domain part (after the @). Save to ex_domains.
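One possible solution is sketched below. The exercise's emails vector is not shown in the text, so the addresses here are hypothetical stand-ins.

```r
library(stringr)

# Hypothetical data: stand-in for the exercise's `emails` vector
emails <- c("ana@example.com", "raj@stats.example.org")

# Split at "@" and keep the second piece of each address
ex_domains <- str_split_i(emails, "@", i = 2)
ex_domains   # "example.com" "stats.example.org"
```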
Explanation: Each email splits into 2 parts at "@". str_split_i(..., i = 2) extracts the second piece from each, giving the domain.
Related stringr functions
After mastering str_split, look at:
tidyr::separate_wider_delim(): split a column in a data frame
str_subset(): filter strings matching a pattern
str_extract(): extract matched substrings
str_count(): count delimiter occurrences (predict pieces)
strsplit(): base R equivalent
For data frame columns, prefer tidyr::separate_wider_delim() over manually splitting and reshaping.
FAQ
How do I split a string by delimiter in R?
Use stringr::str_split(x, "delimiter") for general use (returns a list). For a single string, str_split_1(x, "delimiter") returns a vector. For a matrix, use str_split_fixed(x, "delimiter", n).
What is the difference between str_split and strsplit?
Both return a list of character vectors. str_split() is from stringr, more consistent (NA -> NA, and a delimiter at the end of a string yields a trailing empty piece that strsplit() drops), and integrates better with the tidyverse. strsplit() is base R with no package dependency.
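One concrete behavioral difference shows up when the string ends with the delimiter. This is a minimal sketch with a made-up input:

```r
library(stringr)

# Hypothetical input ending in the delimiter
x <- "a/b/"

base_out <- strsplit(x, "/")[[1]]
base_out      # "a" "b": base R drops the trailing empty piece

stringr_out <- str_split(x, "/")[[1]]
stringr_out   # "a" "b" "": stringr keeps it
```

If downstream code counts pieces, this difference matters when switching between the two functions.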
How do I split into a fixed number of pieces?
Use str_split_fixed(x, pattern, n = N) for a matrix with exactly N columns. Or use str_split(x, pattern, n = N) to limit pieces in the list output (splitting stops after N - 1 delimiters, so the Nth piece holds everything that remains).
How do I get just one piece from each split?
Use str_split_i(x, pattern, i = 2) to extract the i-th piece from each split. Cleaner than mapping over the list output.
How do I split a column in a data frame?
Use tidyr::separate_wider_delim(df, cols = col, delim = "/", names = c("a","b","c")). It is the data-frame-specific tool, while str_split is for raw strings.
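A small sketch of that call, assuming tidyr 1.3.0 or later is installed. The data frame df and its path column are hypothetical.

```r
library(tidyr)

# Hypothetical data frame: every path has exactly three pieces
df <- data.frame(path = c("usr/local/bin", "var/log/nginx"))

out <- separate_wider_delim(
  df,
  cols  = path,
  delim = "/",
  names = c("a", "b", "c")
)

out$a   # "usr" "var"
out$c   # "bin" "nginx"
```

With the default settings, rows whose piece count does not match names raise an error; the too_few and too_many arguments control that behavior.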