dplyr matches() in R: Select Columns by Regex

The matches() helper in dplyr selects columns whose names match a regular expression. It is the regex-based tidyselect helper, and it is more flexible than starts_with(), ends_with(), or contains().

By Selva Prabhakaran · Published May 12, 2026 · Last updated May 12, 2026

⚡ Quick Answer

df |> select(matches("^score"))             # regex prefix
df |> select(matches("\\d+$"))              # ends with digits
df |> select(matches("^[A-Z]_\\w+"))        # complex pattern
df |> select(matches("score|rating"))       # alternation
df |> mutate(across(matches("^q\\d+$"), as.factor))

Need explanation? Read on for examples and pitfalls.

What matches() does in one sentence

matches() selects the columns whose names match a regular expression, which makes it the most flexible name-based tidyselect helper.

Syntax

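matches(match, ignore.case = TRUE, perl = FALSE, vars = NULL). match is the regular expression, given as a character vector; with more than one pattern, the union of the matches is selected. ignore.case = TRUE makes matching case-insensitive by default. perl = TRUE switches to Perl-compatible regular expressions. vars is a character vector of variable names; when it is NULL, the names come from the current selection context.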
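As a rough illustration of the arguments, the sketch below uses a made-up tibble (the column names are assumptions chosen for the example). It passes a character vector as match, where the union of the matches is selected, and then sets perl = TRUE so that a Perl-style negative lookahead can be used.

```r
library(dplyr)

# Hypothetical data: column names chosen only for illustration
df <- tibble(id = 1, score_math = 2, score_art = 3, rating = 4)

# A character vector of patterns selects the union of the matches
both <- df |> select(matches(c("^score", "^rating")))
names(both)

# perl = TRUE enables Perl-compatible syntax such as negative lookahead:
# keep every column whose name does not start with "score"
not_score <- df |> select(matches("^(?!score)", perl = TRUE))
names(not_score)
```

For a plain "everything except a prefix" selection, !starts_with("score") is usually easier to read; the lookahead is shown only to demonstrate what perl = TRUE unlocks.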

R: Columns whose names start with 'q' followed by digits

library(dplyr)
df <- tibble(q1 = 1, q2 = 2, q12 = 3, qa = 4, score = 5)
df |> select(matches("^q\\d+$"))
#> selects q1, q2, q12 (qa dropped: no digit; score dropped: no q)

Tip

Reach for matches when literal helpers (starts_with, ends_with, contains) can't express the pattern. For simple cases, the literal helpers are clearer.

Five common patterns

1. Regex prefix

R: Same as starts_with, but with regex

df |> select(matches("^score"))

^ anchors to start.
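A minimal sketch of the equivalence, assuming a small made-up tibble: an anchored literal prefix in matches() picks the same columns as starts_with(), and a name that merely contains the word is left out.

```r
library(dplyr)

# Hypothetical columns for illustration
df <- tibble(score_a = 1, score_b = 2, raw_score = 3, id = 4)

by_regex  <- df |> select(matches("^score"))
by_prefix <- df |> select(starts_with("score"))

names(by_regex)                                # raw_score is not selected
identical(names(by_regex), names(by_prefix))   # same columns either way
```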

2. Regex suffix

R: Names ending in digits

df |> select(matches("\\d+$"))

$ anchors to end.
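To see the end anchor at work, here is a small sketch on an invented tibble. Only names whose last characters are digits are kept; a digit in the middle of a name is not enough.

```r
library(dplyr)

# Hypothetical columns: two end in digits, one has a digit mid-name
df <- tibble(x1 = 1, x22 = 2, v2_raw = 3, total = 4)

ends_digit <- df |> select(matches("\\d+$"))
names(ends_digit)
```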

3. Alternation

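R: Multiple patterns at once

df |> select(matches("score|rating"))

| is OR in regex. Because the pattern is unanchored, "score" already matches names such as score_pct, so those need no separate alternative.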
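The sketch below, on a made-up tibble, contrasts an unanchored alternation with an anchored one. The anchored form with ^( )$ is one way to require whole-name matches; the column names are assumptions for the example.

```r
library(dplyr)

# Hypothetical columns for illustration
df <- tibble(score = 1, score_pct = 2, rating = 3, id = 4)

# Unanchored: any name containing "score" or "rating"
picked <- df |> select(matches("score|rating"))
names(picked)

# Anchored: only names that are exactly "score" or "rating"
exact <- df |> select(matches("^(score|rating)$"))
names(exact)
```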

4. Character class

R: Names with format X_word

df |> select(matches("^[A-Z]_\\w+"))

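[A-Z] is an uppercase letter; \\w+ is one or more word characters. Because ignore.case = TRUE is the default, [A-Z] also matches lowercase letters unless you pass ignore.case = FALSE.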
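One detail worth checking with character classes is the case-insensitive default. The sketch below uses an invented tibble to compare the default behavior with ignore.case = FALSE.

```r
library(dplyr)

# Hypothetical columns: one uppercase prefix, one lowercase prefix
df <- tibble(A_total = 1, b_total = 2, total = 3)

# Default ignore.case = TRUE: [A-Z] also accepts a lowercase letter
default_ci <- df |> select(matches("^[A-Z]_\\w+"))
names(default_ci)

# Strict matching keeps only the uppercase-prefixed name
strict <- df |> select(matches("^[A-Z]_\\w+", ignore.case = FALSE))
names(strict)
```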

5. Multi-step transform

R: Convert all q1, q2, ... to factor

df |> mutate(across(matches("^q\\d+$"), as.factor))
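A runnable version of the same idea, assuming a tiny made-up survey tibble: the q columns become factors while the other column keeps its type.

```r
library(dplyr)

# Hypothetical survey data for illustration
df <- tibble(q1 = c(1, 2), q2 = c(2, 2), age = c(30, 41))

out <- df |> mutate(across(matches("^q\\d+$"), as.factor))

# Inspect the resulting column classes
sapply(out, class)
```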

Key Insight

matches() is the only tidyselect helper that supports regex. Everything else (starts_with, ends_with, contains) uses literal strings. Use matches when the pattern is too complex for the literals.

matches() vs starts_with / ends_with / contains

Helper	Matches	Best for
`starts_with("x")`	Literal prefix	Simple prefixes
`ends_with("y")`	Literal suffix	Simple suffixes
`contains("ab")`	Literal substring	Substring anywhere
`matches("regex")`	Regex	Complex patterns

When to use which:

Use literal helpers when possible: they are clearer and need no regex escaping.
Reach for matches only when a regex is needed.
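The comparison can be sketched on one invented tibble. The column names are assumptions picked so that each helper selects something different, including a name with a literal period.

```r
library(dplyr)

# Hypothetical columns for illustration
df <- tibble(x_id = 1, id_x = 2, a.b = 3, axb = 4)

lit_prefix <- names(select(df, starts_with("x")))  # literal prefix
lit_suffix <- names(select(df, ends_with("x")))    # literal suffix
lit_sub    <- names(select(df, contains("a.b")))   # literal substring
rx         <- names(select(df, matches("a.b")))    # regex: "." is any character

lit_sub
rx
```

contains("a.b") treats the period literally, while matches("a.b") also picks up axb.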

A practical workflow

Use matches for column names with structured patterns.

# Q1, Q2, ..., Q20 columns -> all factors
df |> mutate(across(matches("^Q\\d+$"), as.factor))

# Year-stamped columns 2020-2024
df |> select(matches("_(202[0-4])$"))

For survey data with structured names, matches is essential.
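To make the workflow concrete, here is a sketch on a made-up survey tibble (all names and values are invented for illustration). It applies both patterns from above: question columns become factors, and only the 2020-2024 year-stamped columns are selected.

```r
library(dplyr)

# Hypothetical survey data with question and year-stamped columns
survey <- tibble(
  id = 1:2,
  Q1 = c(1, 2),
  Q2 = c(3, 3),
  sales_2019 = c(10, 20),
  sales_2020 = c(11, 21),
  sales_2024 = c(12, 22),
  sales_2025 = c(13, 23)
)

# Question columns -> factors
factored <- survey |> mutate(across(matches("^Q\\d+$"), as.factor))

# Year-stamped columns for 2020-2024 only
recent <- survey |> select(matches("_(202[0-4])$"))
names(recent)
```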

Common pitfalls

Pitfall 1: regex special characters. matches(".") matches every column (any character). Use matches("\\.") for literal period.
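The built-in iris data makes this pitfall easy to see, since four of its five column names contain a period. A short sketch:

```r
library(dplyr)

# "." is a regex wildcard, so every column name matches
any_char <- iris |> select(matches("."))
ncol(any_char)

# "\\." is a literal period, so Species is left out
literal_dot <- iris |> select(matches("\\."))
names(literal_dot)
```

contains(".") is another way to get the literal-period behavior without escaping.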

Pitfall 2: case-insensitive default. matches("score") matches "SCORE" and "ScOrE". Pass ignore.case = FALSE for strict.
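A quick sketch of the difference, assuming an invented tibble with mixed-case names:

```r
library(dplyr)

# Hypothetical columns in three different cases
df <- tibble(SCORE = 1, Score_pct = 2, score_raw = 3)

# Default: case-insensitive, all three names match
loose <- df |> select(matches("score"))
names(loose)

# Strict: only the lowercase name matches
strict <- df |> select(matches("score", ignore.case = FALSE))
names(strict)
```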

Warning

Backslashes must be doubled inside R strings: the regex \d is written as "\\d". A common bug is writing matches("\d+"), which fails to parse because "\d" is not a recognized escape sequence.
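If the doubled backslashes get hard to read, raw strings are one alternative. They are a base R feature (R 4.0.0 and later), not something specific to dplyr. The sketch below, on a made-up tibble, shows that both spellings give the same pattern.

```r
library(dplyr)

# Hypothetical columns for illustration
df <- tibble(q1 = 1, q2 = 2, note = 3)

# Escaped string and raw string spell the same regex
escaped <- df |> select(matches("\\d+$"))
raw     <- df |> select(matches(r"(\d+$)"))

identical(r"(\d+$)", "\\d+$")
identical(names(escaped), names(raw))
```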

Try it yourself

Try it: Select all iris columns ending in either "Length" or "Width". Save to ex_dims.

R: Your turn: dimension columns

ex_dims <- iris |>
  # your code here
names(ex_dims)
#> Expected: 4 columns (Sepal/Petal Length/Width)

R: Solution

library(dplyr)
ex_dims <- iris |> select(matches("(Length|Width)$"))
names(ex_dims)
#> [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"

Explanation: (Length|Width)$ matches either word at the end of the name.

Related tidyselect helpers

After mastering matches, look at:

starts_with() / ends_with() / contains(): literal helpers
everything(): all remaining
where(): predicate
all_of() / any_of(): explicit list
num_range(): numeric-suffixed names

For most simple name-based selection, the literal helpers are simpler to read than matches.
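As one example of a simpler alternative, num_range() can replace a regex when names are a prefix plus a number. The sketch below uses an invented tibble and compares it with an equivalent matches() call.

```r
library(dplyr)

# Hypothetical numbered columns for illustration
df <- tibble(q1 = 1, q2 = 2, q3 = 3, q10 = 4, qa = 5)

# Prefix plus a numeric range, no regex needed
by_range <- df |> select(num_range("q", 1:3))

# The regex equivalent for the same three columns
by_regex <- df |> select(matches("^q[1-3]$"))

names(by_range)
identical(names(by_range), names(by_regex))
```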

FAQ

What does matches do in dplyr?

matches(pattern) selects columns whose names match the regex pattern. Tidyselect helper for regex-based selection.

Is matches case-sensitive?

No. Matching is case-insensitive by default; pass ignore.case = FALSE for strict matching.

What is the difference between matches and contains?

matches uses regex; contains is literal substring. matches("a.b") is "a, any char, b"; contains("a.b") is the literal "a.b".

How do I anchor matches to start or end?

Use regex anchors: ^ for start (matches("^score")); $ for end (matches("score$")).

Why does my pattern with backslashes error?

Backslashes must be doubled inside ordinary R strings, so the regex \d is written as the string "\\d". Use matches("\\d+"), not matches("\d+").
