strsplit() in R: Split Strings by a Delimiter

The strsplit() function in base R splits each element of a character vector at a delimiter (regex or fixed) and returns a LIST of character vectors, one per input element.

⚡ Quick Answer
strsplit("a,b,c", ",")[[1]]                # c("a","b","c")
strsplit(c("a,b","c,d"), ",")              # list of vectors
strsplit("a.b.c", ".", fixed = TRUE)[[1]]  # literal period
strsplit("a.b.c", "\\.")                   # regex period
unlist(strsplit(x, ","))                   # flatten to vector
do.call(rbind, strsplit(x, ","))           # to matrix (if equal lengths)
stringr::str_split(x, ",")                 # tidyverse equivalent

Need explanation? Read on for examples and pitfalls.

📊 Is strsplit() the right tool?
  • Split each string by a delimiter: strsplit()
  • Split into a matrix (equal lengths): do.call(rbind, strsplit(...))
  • Split a single string: strsplit(x, sep)[[1]]
  • Tidyverse pipeline: stringr::str_split()
  • Split at fixed positions: substring() / substr()
  • Split a data frame column into multiple columns: tidyr::separate()
  • Maximum N pieces: stringr::str_split_fixed() or strsplit + head

What strsplit() does in one sentence

strsplit(x, split) splits each string in x at every occurrence of split (regex by default) and returns a LIST where each element is a character vector of the parts. The list has length equal to the input.

The list output is intentional: different inputs can produce different numbers of parts. To collapse to a vector, unlist(). To force into a matrix, use do.call(rbind, ...) only when all inputs split into equal counts.
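As a small illustration of the list shape (the input vector x is made up for this example), the sketch below shows that the result has one element per input string, that lengths() reports how many pieces each string produced, and that unlist() flattens everything at the cost of losing the grouping.

```r
x <- c("a,b,c", "x,y")   # illustrative input
parts <- strsplit(x, ",", fixed = TRUE)

length(parts)    # one list element per input string
#> [1] 2
lengths(parts)   # number of pieces in each element
#> [1] 3 2
unlist(parts)    # one flat vector; the grouping by input is lost
#> [1] "a" "b" "c" "x" "y"
```

Checking lengths() before stacking with rbind is a cheap way to confirm that every input split into the same number of pieces.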

Syntax

strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE). Returns a LIST.

R: Split a single string
strsplit("apple,banana,cherry", ",")
#> [[1]]
#> [1] "apple" "banana" "cherry"
strsplit("apple,banana,cherry", ",")[[1]]
#> [1] "apple" "banana" "cherry"

  
Tip
strsplit always returns a LIST, even for length-1 input. Use [[1]] to extract the vector for single inputs. For vectorized work, leave it as a list.

Five common patterns

1. Split CSV-like single string

R: Comma-separated to vector
strsplit("a,b,c", ",")[[1]]
#> [1] "a" "b" "c"

  

2. Split a vector of strings

R: Each input becomes a list element
x <- c("a,b,c", "x,y", "p,q,r,s")
strsplit(x, ",")
#> [[1]]
#> [1] "a" "b" "c"
#> [[2]]
#> [1] "x" "y"
#> [[3]]
#> [1] "p" "q" "r" "s"

  

Variable lengths are why the result is a list, not a matrix.

3. Split by regex (e.g., any whitespace)

R: Split on multiple whitespace
strsplit("hello world foo", "\\s+")
#> [[1]]
#> [1] "hello" "world" "foo"

  

\\s+ matches one-or-more whitespace, so multi-space gaps collapse.
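To see why the + quantifier matters, the sketch below compares a single-space delimiter with \\s+ on an illustrative string that contains a double space. The string s is made up for this example.

```r
s <- "hello  world foo"   # two spaces between the first two words

strsplit(s, " ")[[1]]      # each single space is its own delimiter
#> [1] "hello" ""      "world" "foo"
strsplit(s, "\\s+")[[1]]   # a run of whitespace counts as one delimiter
#> [1] "hello" "world" "foo"
```

With the single-space delimiter, the gap between the two adjacent spaces shows up as an empty string in the result.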

4. Split a column into a matrix (equal-length only)

R: Convert to matrix when each row splits the same
x <- c("2024-01-15", "2025-03-20", "2026-07-04")
parts <- strsplit(x, "-")
do.call(rbind, parts)
#>      [,1]   [,2] [,3]
#> [1,] "2024" "01" "15"
#> [2,] "2025" "03" "20"
#> [3,] "2026" "07" "04"

  

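This gives a correct matrix only when all inputs split into the same number of pieces. With unequal counts, rbind recycles the shorter vectors to fill the row instead of failing.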
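When the counts might differ, one option is to check lengths() first and pad short rows with NA before building the matrix. The sketch below is one possible base R approach, not the only one. It relies on the fact that indexing a vector past its end yields NA, and the input x is invented for illustration.

```r
x <- c("a,b,c", "x,y", "p,q,r")   # the second string has only two pieces
parts <- strsplit(x, ",", fixed = TRUE)

# Guard: only stack with rbind when every element has the same length
all_equal <- length(unique(lengths(parts))) == 1
all_equal
#> [1] FALSE

# Pad to the longest length; indexing past the end of a vector yields NA
n <- max(lengths(parts))
padded <- t(sapply(parts, `[`, seq_len(n)))
dim(padded)
#> [1] 3 3
padded[2, ]
#> [1] "x" "y" NA
```

This sketch assumes at least one input splits into two or more pieces. If every input yields a single piece, sapply returns a plain vector and the t() call would give one row instead of one row per input.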

5. Split with fixed = TRUE (literal delimiter)

R: Period as literal, not regex
strsplit("file.name.txt", ".", fixed = TRUE)[[1]]
#> [1] "file" "name" "txt"
strsplit("file.name.txt", ".")[[1]]  # WRONG: regex
#> [1] "" "" "" "" "" "" "" "" "" "" "" "" ""

  

Without fixed = TRUE, . is a regex that matches any character, so every character is treated as a delimiter and only empty strings remain.

Key Insight
Use fixed = TRUE for ANY delimiter that contains a regex metacharacter. The characters . * + ? | ( ) [ { ^ $ and \ (written "\\" in an R string) all have special meaning in a pattern. For these, fixed = TRUE is faster and avoids escaping bugs. For literal commas, slashes, and dashes the result is the same either way, because they are not special as a standalone pattern.
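The pipe character is a common case where this matters. The sketch below uses an illustrative pipe-delimited string and shows the two safe spellings next to the unescaped one.

```r
s <- "a|b|c"

strsplit(s, "|", fixed = TRUE)[[1]]   # literal pipe
#> [1] "a" "b" "c"
strsplit(s, "\\|")[[1]]               # escaped pipe, regex mode
#> [1] "a" "b" "c"

# Unescaped, "|" is the regex alternation operator, so the result
# is not the three fields:
identical(strsplit(s, "|")[[1]], c("a", "b", "c"))
#> [1] FALSE
```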

strsplit() vs str_split() vs separate() vs substr()

Five ways to split strings in R, each suited to a different shape of input or output.

Function                   | Package | Output                  | Best for
strsplit()                 | base    | List of vectors         | Standard delimiter split
stringr::str_split()       | stringr | List of vectors         | Tidyverse pipelines
stringr::str_split_fixed() | stringr | Matrix                  | Fixed number of pieces
tidyr::separate()          | tidyr   | Multi-column data frame | Data frame column splits
substr() / substring()     | base    | String                  | Fixed-position split

When to use which:

  • strsplit for base R delimiter splits.
  • str_split_fixed when you know the max number of pieces.
  • separate to split a data frame column into named columns.
  • substr for known-position splits (e.g., always at index 5).

A practical strsplit workflow

Most strsplit workflows go: split, validate, transform. Common patterns:

  1. Split-and-flatten: unlist(strsplit(x, sep)) to get one big vector.
  2. Split-and-stack: do.call(rbind, strsplit(x, sep)) to get a matrix.
  3. Split-and-extract-Nth: sapply(strsplit(x, sep), `[`, 2) to grab the 2nd part of each.
  4. Split-and-re-aggregate: split by some boundary, then process group-wise.
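
The four patterns above can be sketched on one small input. The "key=value" strings are invented for illustration, and vapply() is used in place of sapply() only because it checks the return type. Treat this as one possible arrangement rather than a fixed recipe.

```r
x <- c("k1=v1", "k2=v2", "k3=v3")   # illustrative "key=value" strings
parts <- strsplit(x, "=", fixed = TRUE)

# Validate before transforming: every string should give exactly two pieces
stopifnot(all(lengths(parts) == 2))

flat <- unlist(parts)                          # 1. split-and-flatten
flat
#> [1] "k1" "v1" "k2" "v2" "k3" "v3"
m <- do.call(rbind, parts)                     # 2. split-and-stack
dim(m)
#> [1] 3 2
vals <- vapply(parts, `[`, character(1), 2)    # 3. split-and-extract-Nth
vals
#> [1] "v1" "v2" "v3"
# 4. process each group of pieces, then rejoin
swapped <- vapply(parts, function(p) paste(rev(p), collapse = "="), character(1))
swapped
#> [1] "v1=k1" "v2=k2" "v3=k3"
```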

For data-frame work, tidyr::separate() is usually cleaner because it produces named columns directly.

Common pitfalls

Pitfall 1: result is always a list. strsplit("a,b", ",") returns a list, not a vector. [[1]] extracts the inner vector for single-input cases.

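Pitfall 2: do.call(rbind, ...) misbehaves on unequal lengths. If inputs split into different counts, rbind does not stop with an error. It recycles the shorter vectors to fill each row, and it warns only when the longest length is not a multiple of the shorter one, so the matrix can silently contain repeated values. Check lengths() first, or use stringr::str_split_fixed() or pad results manually.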

Warning
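Empty strings can produce surprises. strsplit("a,,b", ",")[[1]] returns c("a", "", "b"), with an empty string for the missing middle field. strsplit("", ",")[[1]] returns an empty character vector, character(0). A trailing delimiter does not produce a trailing empty string: strsplit("a,b,", ",")[[1]] returns c("a", "b"). Handle these edge cases explicitly.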
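
The edge cases are easiest to remember side by side. The sketch below repeats the two cases from the warning and adds leading delimiters, trailing delimiters, and missing values, all of which follow base R's documented strsplit() behaviour.

```r
strsplit("a,,b", ",")[[1]]          # empty field in the middle is kept
#> [1] "a" ""  "b"
strsplit(",a,b", ",")[[1]]          # leading delimiter gives a leading ""
#> [1] ""  "a" "b"
strsplit("a,b,", ",")[[1]]          # trailing delimiter: no trailing ""
#> [1] "a" "b"
strsplit("", ",")[[1]]              # empty input
#> character(0)
strsplit(NA_character_, ",")[[1]]   # missing input stays NA
#> [1] NA
```

The asymmetry between leading and trailing delimiters matters when the number of fields is meaningful, for example when a row ends with an empty last column.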

Try it yourself

Try it: Split each email at "@" and extract just the domain. Save to ex_domains.

R: Your turn: extract domains
emails <- c("alice@example.com", "bob@gmail.com", "carol@yahoo.com")
ex_domains <- # your code here
ex_domains
#> Expected: c("example.com", "gmail.com", "yahoo.com")

  
R: Solution
ex_domains <- sapply(strsplit(emails, "@"), `[`, 2)
ex_domains
#> [1] "example.com" "gmail.com" "yahoo.com"

  

Explanation: strsplit(emails, "@") returns a list of 2-element vectors. sapply(..., `[`, 2) extracts the 2nd element from each, which is the domain part.

After mastering strsplit, look at:

  • stringr::str_split(): tidyverse equivalent
  • stringr::str_split_fixed(): returns a matrix directly
  • tidyr::separate(): split a data frame column into multiple columns
  • tidyr::separate_rows(): split into multiple ROWS (long format)
  • substr() / substring(): split at fixed positions
  • regmatches(): extract regex matches or, with invert = TRUE, the pieces between them

For data frame column splits, tidyr::separate() saves a step over strsplit + bind.
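
For comparison, the "strsplit + bind" route in base R might look like the sketch below. The data frame df and the column names year, month, and day are invented for illustration, and the example assumes R 4.0 or later, where strings are not converted to factors by default.

```r
df <- data.frame(id = 1:3,
                 date = c("2024-01-15", "2025-03-20", "2026-07-04"))

# Split, stack into a matrix, then name the columns
parts <- do.call(rbind, strsplit(df$date, "-", fixed = TRUE))
colnames(parts) <- c("year", "month", "day")   # names chosen for illustration

df <- cbind(df, parts)
names(df)
#> [1] "id"    "date"  "year"  "month" "day"
df$month
#> [1] "01" "03" "07"
```

The new columns are character vectors, so a conversion such as as.integer() is still needed before doing arithmetic on them.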

FAQ

How do I split a string in R?

strsplit(x, ",")[[1]] for a single string. For a vector, leave off [[1]] and you get a list.

Why does strsplit return a list?

Different inputs can split into different numbers of parts. A list naturally holds variable-length results. For single-input cases, [[1]] extracts the vector.

What is the difference between strsplit and stringr str_split?

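Both take the string first and the pattern second, and both treat the pattern as a regex by default: strsplit(x, ",") and str_split(x, ","). The differences lie elsewhere. str_split uses the ICU regex engine (via stringi), accepts n to cap the number of pieces and simplify = TRUE to return a matrix, and keeps a trailing empty string: str_split("a,b,", ",") gives "a" "b" "", while strsplit drops the final "".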

How do I split into a fixed number of pieces?

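stringr::str_split_fixed(x, ",", n = 3) returns a 3-column character matrix, padding short rows with "" and leaving any unsplit remainder in the last column. In base R, lapply(strsplit(x, ","), head, 3) keeps at most three pieces per element but discards the rest.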
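
In base R, there are two different ways to read "a fixed number of pieces": keep the first N pieces and drop the rest, or split only at the first delimiter and keep the remainder intact. The sketch below shows both on an invented input. The second uses regmatches() with invert = TRUE, which is one possible idiom among several.

```r
x <- c("a,b,c,d", "x,y")   # illustrative input

# At most 2 pieces per element, discarding anything after the second
capped <- lapply(strsplit(x, ",", fixed = TRUE), head, 2)
capped[[1]]
#> [1] "a" "b"

# Split at the FIRST comma only, keeping the remainder intact
first_split <- regmatches(x, regexpr(",", x, fixed = TRUE), invert = TRUE)
first_split[[1]]
#> [1] "a"     "b,c,d"
first_split[[2]]
#> [1] "x" "y"
```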

How do I split a string into individual characters?

strsplit(x, "")[[1]]. An empty delimiter splits at every position. For a single string, unlist(strsplit("hello", "")) gives the same character vector.
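
Once a string is a vector of single characters, ordinary vector functions apply. As a short sketch with an illustrative word:

```r
chars <- strsplit("hello", "")[[1]]
chars
#> [1] "h" "e" "l" "l" "o"

reversed <- paste(rev(chars), collapse = "")   # reverse the string
reversed
#> [1] "olleh"
sum(chars == "l")                              # count one character
#> [1] 2
```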