stringr str_which() in R: Find Indexes of Matches

The str_which() function in stringr returns the integer positions of the strings in a character vector that match a regex. It is shorthand for which(str_detect(x, pattern)) and the stringr counterpart of base R grep().

By Selva Prabhakaran · Published May 15, 2026 · Last updated May 15, 2026

⚡ Quick Answer

str_which(x, "pattern")                          # indexes of regex matches
str_which(x, fixed("text"))                      # literal (skip regex)
str_which(x, regex("a", ignore_case = TRUE))     # case insensitive
str_which(x, "pattern", negate = TRUE)           # indexes of non-matches
x[str_which(x, "pattern")]                       # subset by indexes
length(str_which(x, "pattern"))                  # count matching strings
which(str_detect(x, "pattern"))                  # equivalent long form

Need explanation? Read on for examples and pitfalls.

What str_which() does in one sentence

str_which(string, pattern) returns an integer vector of indexes pointing to every element of string that matches pattern. Non-matching strings, including NAs, are simply omitted from the result.

The output length is the number of matches, not the input length. If nothing matches you get integer(0). That makes str_which() the natural input to indexing (x[idx]), assignment (x[idx] <- "fixed"), or any function expecting integer row positions, such as dplyr::slice().
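The no-match case is worth seeing once. The sketch below uses a made-up vector x (not from the examples on this page) to show that integer(0) behaves safely for both reading and assignment:

```r
library(stringr)

# Hypothetical sample data, chosen only for illustration
x <- c("apple", "banana", "cherry")

idx_none <- str_which(x, "zzz")   # no element contains "zzz"
idx_none                          # integer(0)
length(idx_none)                  # 0

# Indexing with integer(0) selects nothing and assigns nothing
x[idx_none]                       # character(0)
y <- x
y[idx_none] <- "fixed"
identical(y, x)                   # TRUE: the vector is unchanged
```

One caveat: negative indexing does not behave the same way. x[-idx_none] returns character(0) rather than the full vector, so check length(idx) before dropping elements by index.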

Syntax

str_which(string, pattern, negate = FALSE). The pattern is a regex; wrap with fixed() for a literal match.

Load stringr and find matching indexes

library(stringr)
x <- c("apple", "banana", "cherry", "Date", NA)
str_which(x, "an")
#> [1] 2

Only one string matches: "banana" at position 2 ("apple" does not contain "an"). The NA is silently dropped. Indexes are 1-based and refer to positions in the input vector.

Tip

Use str_which() whenever you would otherwise write which(grepl(...)). It reads in one step, stays inside the stringr namespace, and treats NAs predictably (dropped, not returned as integer-NA).

Five common patterns

1. Basic regex match

Indexes of strings containing 'an'

fruits <- c("apple", "banana", "cherry", "grape", "mango")
str_which(fruits, "an")
#> [1] 2 5

Default pattern is a regular expression. "banana" and "mango" both contain the literal substring "an", so positions 2 and 5 are returned.

2. Literal match with fixed()

Match a literal dot

versions <- c("1.0", "1x0", "2.5", "no version")
str_which(versions, fixed("."))
#> [1] 1 3

Without fixed(), the . in regex means "any character" and would match every non-empty string. fixed("text") skips regex parsing entirely, which makes it the simplest way to match a literal substring.

3. Case insensitive match

Find 'apple' regardless of case

items <- c("apple", "Apple", "APPLE", "banana")
str_which(items, regex("apple", ignore_case = TRUE))
#> [1] 1 2 3

regex(pattern, ignore_case = TRUE) is the canonical case insensitive modifier. Three of the four strings match, irrespective of capitalization.

4. Negate (indexes of non-matches)

Strings that DO NOT contain 'an'

str_which(fruits, "an", negate = TRUE)
#> [1] 1 3 4

negate = TRUE flips the meaning: the call is equivalent to which(!str_detect(fruits, "an")). Use it when "find rows missing pattern X" reads cleaner than the double negative.
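One consequence of that equivalence is that an NA is neither a match nor a non-match. A small sketch, using a hypothetical vector with an NA added, shows the NA position missing from both results:

```r
library(stringr)

# Hypothetical input with a missing value at position 3
x <- c("apple", "banana", NA, "cherry")

hits   <- str_which(x, "an")                  # 2
misses <- str_which(x, "an", negate = TRUE)   # 1 4

# Position 3 (the NA) appears in neither vector
setdiff(seq_along(x), c(hits, misses))        # 3

# Same result as the long form
identical(misses, which(!str_detect(x, "an")))  # TRUE
```

So hits and misses together only cover the whole input when it contains no NAs.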

5. Subset and assign by index

Replace matching elements in place

emails <- c("alice@x.com", "bob_AT_y.org", "carol@z.net", "no contact")
bad <- str_which(emails, "AT")
emails[bad] <- str_replace(emails[bad], "_AT_", "@")
emails
#> [1] "alice@x.com" "bob@y.org" "carol@z.net" "no contact"

This is the canonical reason to want indexes over values: you can both read and write back through the same positions. str_subset() would only let you read the matching strings.

Key Insight

Indexes are composable in a way that substrings are not. Once you have an integer vector of positions, you can pass it to [ or slice(), assign through it with [<-, or store it as a row pointer. Reach for str_which() whenever downstream code needs to refer back to the original input, not just the matched text.
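As a sketch of the "row pointer" idea, the example below applies the same positions to a data frame using base R indexing only. The data frame df and its column names are hypothetical, invented for this illustration:

```r
library(stringr)

# Hypothetical data frame for illustration
df <- data.frame(
  id    = 1:4,
  email = c("alice@x.com", "bob_AT_y.org", "carol@z.net", "no contact"),
  stringsAsFactors = FALSE
)

bad_rows <- str_which(df$email, fixed("_AT_"))  # 2

df[bad_rows, ]                  # inspect the offending rows
df$flag <- FALSE
df$flag[bad_rows] <- TRUE       # write back through the same positions
```

The same bad_rows vector serves for reading rows and for flagging them, which is something the matched strings alone could not do.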

str_which() vs str_subset() vs str_detect()

Pick the matcher whose return shape matches what you want to do next. The three functions share the same pattern engine and only differ in what they hand back.

Function	Returns	Length	Reach for it when you need
`str_which(x, p)`	integer indexes of matches	number of matches	row pointers, indexing, assignment
`str_subset(x, p)`	matching strings themselves	number of matches	the values, not their positions
`str_detect(x, p)`	logical vector	same as input	mask for `filter()`, `ifelse()`, or counts
`grep(p, x)`	integer indexes (base R)	number of matches	zero-dependency equivalent of str_which
`grep(p, x, value = TRUE)`	matching strings (base R)	number of matches	zero-dependency equivalent of str_subset

The stringr trio (str_which, str_subset, str_detect) is preferred inside tidyverse pipelines because every function in the family obeys the same NA, vectorization, and pattern modifier rules.
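To make the return shapes concrete, here is the stringr trio run on the fruits vector from the earlier examples, with two identities that tie them together:

```r
library(stringr)

fruits <- c("apple", "banana", "cherry", "grape", "mango")

str_which(fruits, "an")    # 2 5 (positions)
str_subset(fruits, "an")   # "banana" "mango" (values)
str_detect(fruits, "an")   # FALSE TRUE FALSE FALSE TRUE (mask)

# The three results are different views of the same matches
identical(fruits[str_which(fruits, "an")], str_subset(fruits, "an"))  # TRUE
identical(which(str_detect(fruits, "an")), str_which(fruits, "an"))   # TRUE
```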

Common pitfalls

Pitfall 1: forgetting that the output length differs from the input. str_which(x, p) returns one element per match, not one per input string. Do not assume length(str_which(x, p)) == length(x); use str_detect() if you need a same-length vector.

Pitfall 2: special regex characters treated as patterns. str_which(x, "1.5") matches "1a5", "1-5", and any "1X5" sequence, not just literal "1.5". Use fixed("1.5") or escape the dot: "1\\.5".
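A quick check of Pitfall 2, using a small hypothetical vector, compares the unescaped pattern with the two fixes:

```r
library(stringr)

# Hypothetical inputs: only the first is the literal "1.5"
x <- c("1.5", "1a5", "1-5", "15")

str_which(x, "1.5")          # 1 2 3: "." matches any single character
str_which(x, fixed("1.5"))   # 1: literal match, no regex parsing
str_which(x, "1\\.5")        # 1: escaped dot inside a regex
```

"15" never matches because the regex "1.5" still requires one character between the 1 and the 5.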

Warning

str_which() silently drops NAs; it does not return integer-NA at their positions. str_which(c("a", NA, "ab"), "a") returns c(1, 3). If you need a same-length output (one entry per input), use str_detect() and apply which() yourself only after deciding how to handle NAs.
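One way to make the NA decision explicit, sketched with the same three-element vector as in the warning, is to start from str_detect() and handle the missing values before calling which():

```r
library(stringr)

x <- c("a", NA, "ab")

hit <- str_detect(x, "a")       # TRUE NA TRUE (same length as x)

which(hit)                      # 1 3: which() drops the NA
which(is.na(hit))               # 2: positions whose status is unknown

# Treat NA as a non-match while keeping a same-length mask
hit_no_na <- !is.na(hit) & hit  # TRUE FALSE TRUE
```

Whether NA should count as a non-match, a match, or an error depends on the data; the point is only that str_detect() keeps the information that str_which() discards.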

Try it yourself

Try it: Find the indexes of iris$Species (as a character vector) where the species name contains "color". Save the integer vector to ex_idx.

Your turn: find indexes of 'color' species

species <- as.character(iris$Species)
ex_idx <- # your code here
length(ex_idx)
#> Expected: 50

Solution

species <- as.character(iris$Species)
ex_idx <- str_which(species, "color")
length(ex_idx)
#> [1] 50
head(ex_idx)
#> [1] 51 52 53 54 55 56

Explanation: str_which(species, "color") returns the positions where "color" appears as a substring. Only the 50 "versicolor" rows match; they sit at indexes 51 through 100 in the species vector.

Related stringr functions

After mastering str_which(), look at:

str_subset(): returns the matching strings (values, not indexes)
str_detect(): returns a logical vector for masking and filtering
str_locate(): returns positions WITHIN each string, not across the vector
str_extract(): pulls out the matched substring
grep(): base R equivalent of str_which() with no dependency

For complete regex grammar (anchors, classes, quantifiers, lookarounds), the official stringr regular expressions vignette is the authoritative reference.

FAQ

How do I find the index of a string matching a pattern in R?

Use stringr::str_which(x, "pattern"). It returns an integer vector of positions where the pattern matches in the character vector x. The output length equals the number of matches, not the input length. For a same-length logical vector, use str_detect() instead, then apply which() if you still want indexes.

What is the difference between str_which and str_subset in R?

Both filter a character vector by pattern, but they return different things. str_which(x, p) returns integer indexes (positions in x); str_subset(x, p) returns the matching strings themselves. Use str_which() when you need to refer back to the original positions for assignment or row indexing; use str_subset() when you only care about the values.

Is str_which the same as grep in R?

Yes, for typical use. str_which(x, p) and grep(p, x) both return integer indexes of matches; note the reversed argument order. The differences are mostly about ecosystem: str_which() lives in stringr, uses the ICU regex engine (via stringi), and takes the same pattern modifiers (fixed(), regex(), coll()) as the rest of the package, while grep() is base R, uses TRE or PCRE, and has its own fixed = TRUE and ignore.case = TRUE arguments.
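The correspondence can be checked side by side. The sketch below uses a hypothetical vector and simple patterns; results can differ for patterns that rely on engine-specific regex features:

```r
library(stringr)

# Hypothetical input, including an NA
x <- c("apple", "Apple", "1.5", NA)

# Plain regex: both return 1 2
identical(str_which(x, "pp"), grep("pp", x))                        # TRUE

# Case insensitive: regex(ignore_case = TRUE) vs ignore.case = TRUE
identical(str_which(x, regex("apple", ignore_case = TRUE)),
          grep("apple", x, ignore.case = TRUE))                     # TRUE

# Literal match: fixed() vs fixed = TRUE
identical(str_which(x, fixed("1.5")), grep("1.5", x, fixed = TRUE)) # TRUE
```

Both functions skip the NA rather than returning an NA index.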

How do I do a case insensitive str_which match?

Wrap the pattern in regex(pattern, ignore_case = TRUE): str_which(x, regex("apple", ignore_case = TRUE)). The plain ignore.case argument from base R does not exist on stringr functions; modifier functions like regex(), fixed(), and coll() are the unified way to control match behavior.

Why does str_which return fewer values than my input has?

Because the result reports indexes of matches, not a verdict per input. Strings that do not match (including NAs) are omitted. If x has 10 elements and 3 match, str_which() returns a length-3 integer vector. Use str_detect() for a length-10 logical vector that aligns row-for-row with the input.
