dplyr contains() in R: Select Columns by Substring

The contains() helper in dplyr selects columns whose names CONTAIN a given substring (anywhere in the name). It is the substring-match tidyselect helper, complementing starts_with and ends_with.

By Selva Prabhakaran · Published May 11, 2026 · Last updated May 11, 2026

⚡ Quick Answer

df |> select(contains("score"))             # any column with "score" in name
df |> select(contains("Length"))            # case-insensitive default
df |> select(contains("X", ignore.case = FALSE))
df |> mutate(across(contains("amt"), ~ .x * 1.1))
df |> select(-contains("temp"))             # drop substring-matched

Need explanation? Read on for examples and pitfalls.


What contains() does in one sentence

contains(match) selects columns whose names contain the literal substring match anywhere. It is used inside dplyr verbs that support tidyselect, such as select(), across(), and rename_with().

Syntax

contains(match, ignore.case = TRUE, vars = NULL). Substring match, not regex.
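For intuition, the default behaviour is close to a fixed (non-regex), case-folded search over the column names. The base-R sketch below is only an approximation for illustration, not tidyselect's actual implementation; the variable names cols and hits are made up for this example.

```r
# Rough base-R analogue of contains("length") with the default ignore.case = TRUE.
# Illustration only: this is not how tidyselect is implemented internally.
cols <- names(iris)

# fixed = TRUE treats the pattern as a literal string, not a regex;
# tolower() on the names mimics case-insensitive matching.
hits <- grepl("length", tolower(cols), fixed = TRUE)

cols[hits]
```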


All columns containing 'Length'

library(dplyr)
iris |> select(contains("Length")) |> head(3)
#>   Sepal.Length Petal.Length
#> 1          5.1          1.4
#> 2          4.9          1.4
#> 3          4.7          1.3

Tip

Use contains for words that may appear ANYWHERE in column names. When unsure if a token is a prefix or suffix, contains catches both.

Five common patterns

1. Substring match

Any 'score' column

df <- tibble(score_a = 1, x_score = 2, total = 3)
df |> select(contains("score"))
#> score_a x_score

Both "score_a" and "x_score" match.

2. Apply across by substring

Round all amount-related columns

# Use a lambda: passing extra arguments through across(...) is deprecated in dplyr >= 1.1.0
df |> mutate(across(contains("amt"), ~ round(.x, 2)))
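A self-contained version of this pattern may make the effect easier to see. The sales table and its column names are hypothetical, invented for this sketch.

```r
library(dplyr)

# Hypothetical data: two columns whose names contain "amt", one that does not.
sales <- tibble(
  net_amt = c(10.4567, 20.1111),
  amt_tax = c(1.2345, 2.9876),
  units   = c(3L, 5L)
)

# Round only the columns matched by contains("amt"); units is left untouched.
rounded <- sales |>
  mutate(across(contains("amt"), ~ round(.x, 2)))

rounded
```

Both net_amt and amt_tax are rounded because the substring can sit at either end of the name.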

3. Drop by substring

Remove all temp-related columns

df |> select(-contains("temp"))

4. Case-sensitive

ignore.case = FALSE

df <- tibble(SCORE = 1, score = 2)
df |> select(contains("score", ignore.case = FALSE))
#> score

5. Multiple substrings

contains accepts a vector

df |> select(contains(c("score", "rating")))
#> Names containing either "score" or "rating"

Key Insight

contains is the most permissive of the literal name-based selectors. It catches matches anywhere; starts_with and ends_with are stricter, and matches is the one to reach for when you need a regex. Use contains when you don't know exactly where the token sits in the name.

contains() vs starts_with() vs ends_with() vs matches()

Helper	Matches
`starts_with("x")`	Prefix
`ends_with("y")`	Suffix
`contains("ab")`	Anywhere
`matches("regex")`	Regex

Use contains when the substring's position varies.
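To make the comparison concrete, the sketch below runs all four helpers against the same table. The demo tibble and its column names are assumptions chosen so that each helper picks a different set.

```r
library(dplyr)

# Hypothetical column names chosen so each helper selects a different set.
demo <- tibble(score_a = 1, x_score = 2, a_score_b = 3, total = 4)

prefix_cols   <- names(select(demo, starts_with("score")))    # score_a
suffix_cols   <- names(select(demo, ends_with("score")))      # x_score
anywhere_cols <- names(select(demo, contains("score")))       # score_a, x_score, a_score_b
regex_cols    <- names(select(demo, matches("^[a-z]_score"))) # x_score, a_score_b

list(
  starts_with = prefix_cols,
  ends_with   = suffix_cols,
  contains    = anywhere_cols,
  matches     = regex_cols
)
```

Only contains picks up a_score_b without a regex, because the token sits in the middle of the name.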

A practical workflow

The "audit" pattern uses contains to pick up every column that shares a token, wherever the token sits in the name.


df |> summarise(across(contains("amount"), ~ sum(is.na(.x))))

NA counts for any column with "amount" in the name. Robust to naming inconsistencies.
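A small worked version of the audit, using a hypothetical orders table whose amount columns are named inconsistently:

```r
library(dplyr)

# Hypothetical table: the "amount" token appears as a prefix, as a suffix,
# and with different capitalisation.
orders <- tibble(
  amount_usd   = c(10, NA, 30),
  Total_Amount = c(NA, NA, 5),
  qty          = c(1L, 2L, NA)
)

# Count missing values in every column whose name contains "amount"
# (case-insensitive by default, so Total_Amount is included).
na_audit <- orders |>
  summarise(across(contains("amount"), ~ sum(is.na(.x))))

na_audit
```

The qty column is skipped even though it has a missing value, because its name does not contain the token.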

For renaming groups of columns:


df |> rename_with(toupper, contains("score"))

Uppercase any column with "score" in its name.
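The same call on a concrete table, as a sketch; the grades tibble and its column names are hypothetical.

```r
library(dplyr)

# Hypothetical data with "score" as a prefix in one name and a suffix in another.
grades <- tibble(score_math = 90, final_score = 85, student = "a")

# rename_with() applies the function only to the columns selected by contains().
renamed <- grades |>
  rename_with(toupper, contains("score"))

names(renamed)
```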

Common pitfalls

Pitfall 1: contains is literal, not regex. contains("a.b") matches the literal "a.b" (dot included). For regex, use matches.
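The difference shows up as soon as a name contains a regex metacharacter. In this sketch the dots tibble is a made-up example:

```r
library(dplyr)

# Hypothetical names: only the first one contains a literal "a.b".
dots <- tibble(a.b = 1, axb = 2, ab = 3)

# contains() treats "." as a plain character.
literal_hits <- names(select(dots, contains("a.b")))

# matches() treats "." as "any single character", so "axb" also qualifies.
regex_hits <- names(select(dots, matches("a.b")))

list(contains = literal_hits, matches = regex_hits)
```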

Pitfall 2: case-insensitive default surprises. contains("ID") matches "user_id" and "ID_2" because ignore.case = TRUE by default; it would also match unrelated names such as "width". Pass ignore.case = FALSE for strict matching.
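A quick way to see the effect is to run both settings on the same names. The ids tibble below is a hypothetical example that includes one column unrelated to identifiers.

```r
library(dplyr)

# Hypothetical names; "width" happens to contain the letters "id".
ids <- tibble(user_id = 1, ID_2 = 2, width = 3)

# Default: case is ignored, so every name containing "id" in any case matches.
loose <- names(select(ids, contains("ID")))

# Strict: only names containing upper-case "ID" match.
strict <- names(select(ids, contains("ID", ignore.case = FALSE)))

list(default = loose, strict = strict)
```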

Warning

contains() matches MULTIPLE substrings if you pass a vector. contains(c("a","b")) selects names containing "a" OR "b"; a name does not need to contain both. For "AND" logic, use & between two contains calls.
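The two behaviours side by side, as a minimal sketch on a hypothetical metrics table:

```r
library(dplyr)

# Hypothetical column names combining two tokens in different ways.
metrics <- tibble(score_raw = 1, score_adj = 2, rating_adj = 3, notes = 4)

# Vector argument: a name qualifies if it contains ANY of the substrings.
either <- names(select(metrics, contains(c("score", "rating"))))

# Two calls joined with &: a name qualifies only if it contains BOTH substrings.
both <- names(select(metrics, contains("score") & contains("adj")))

list(or = either, and = both)
```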

Try it yourself

Try it: Select all iris columns containing "Petal". Save to ex_petal.

Your turn: petal columns

ex_petal <- iris |>
  # your code here
names(ex_petal)
#> Expected: c("Petal.Length", "Petal.Width")


Solution

ex_petal <- iris |> select(contains("Petal"))
names(ex_petal)
#> [1] "Petal.Length" "Petal.Width"

Explanation: Two iris columns contain "Petal". Sepal.* columns are excluded.

Related tidyselect helpers

After mastering contains, look at:

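starts_with() / ends_with(): stricter, position-based matching (prefix or suffix)
matches(): regular-expression matching
everything(): all columns
where(): columns for which a predicate function returns TRUE
all_of() / any_of(): an explicit vector of column names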

For complex patterns, combine helpers with &, |, !.

FAQ

What does contains do in dplyr?

contains(match) selects columns whose names contain the substring match anywhere.

Is contains case-sensitive?

No, not by default: matching ignores case unless you pass ignore.case = FALSE.

Can contains accept multiple substrings?

Yes. contains(c("a","b")) matches names containing "a" OR "b"; a name does not need to contain both.

What is the difference between contains and matches?

contains is literal substring; matches uses regex. contains(".") matches a literal period; matches(".") is "any character".

How do I require a column to contain BOTH "a" AND "b"?

Combine: contains("a") & contains("b"). Both conditions must match.
