dplyr num_range() in R: Select Numeric-Suffixed Columns
The num_range() helper in dplyr selects columns with a prefix followed by a numeric range, like q1, q2, q3 or year_2020, year_2021. It is the explicit numeric-suffix tidyselect helper.
df |> select(num_range("q", 1:5)) # q1, q2, q3, q4, q5
df |> select(num_range("year_", 2020:2024)) # year_2020 to year_2024
df |> select(num_range("Q", 1:10, width = 2)) # Q01, Q02, ..., Q10 (padded)
df |> select(matches("^q\\d+$")) # regex alternative
df |> select(starts_with("q")) # less precise alternative
Need explanation? Read on for examples and pitfalls.
What num_range() does in one sentence
num_range(prefix, range, width = NULL) selects columns named prefix followed by an integer in range, optionally zero-padded to width digits. For example, it matches q1, q2, q3 without width, or Q01, Q02 with width = 2.
Syntax
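num_range(prefix, range, width = NULL). prefix is the literal text before the number, range is an integer vector of the numbers to match, and width, if set, zero-pads each number to that many digits. Recent tidyselect versions also accept a suffix argument for literal text that follows the number.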
Five common patterns
1. Sequential quarters
2. Year-range
3. Zero-padded
4. Apply across to numeric-suffixed
5. Drop a numeric range
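The five patterns above can be sketched roughly as follows. This is a minimal illustration, not a canonical recipe: the tibble df and all of its column names are hypothetical, and the native pipe and the \(x) lambda assume R 4.1 or later.

```r
library(dplyr)

# Hypothetical data; every column name here is an assumption for illustration.
df <- tibble(
  id = 1:3,
  q1 = c(1, 2, 3), q2 = c(4, 5, 6), q3 = c(7, 8, 9), q4 = c(1, 1, 1),
  year_2020 = c(10, 20, 30), year_2021 = c(11, 21, 31),
  Q01 = c(0, 1, 0), Q02 = c(1, 1, 0)
)

# 1. Sequential quarters: q1, q2, q3, q4
quarters <- df |> select(num_range("q", 1:4))

# 2. Year range: year_2020, year_2021
years <- df |> select(num_range("year_", 2020:2021))

# 3. Zero-padded: Q01, Q02
padded <- df |> select(num_range("Q", 1:2, width = 2))

# 4. Apply a function across the numeric-suffixed columns
scaled <- df |> mutate(across(num_range("q", 1:4), \(x) x * 10))

# 5. Drop a numeric range: everything except q1 and q2
dropped <- df |> select(!num_range("q", 1:2))
```

Because num_range() compares full column names exactly, the lowercase q columns and the uppercase, zero-padded Q columns never collide in this sketch.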
num_range is more EXPLICIT than matches("^q\\d+$"). With num_range you specify exactly which numbers to select; matches accepts any run of digits. For known, fixed ranges, num_range is safer.
num_range() vs matches() vs starts_with()
| Helper | Precision | Best for |
|---|---|---|
| num_range("q", 1:5) | Exact range | Known numeric suffixes |
| matches("^q\\d+$") | Any digit | Variable-length suffixes |
| starts_with("q") | Any "q*" | Imprecise, may include qa/qb |
When to use which:
- num_range for KNOWN ranges with sequential integers.
- matches for unknown numbers of unknown widths.
- starts_with for general prefix matching.
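To make the difference concrete, here is a small sketch that runs all three helpers against the same data. The tibble and its column names (q1, q2, q10, qa, quality) are made up purely to show where the helpers disagree.

```r
library(dplyr)

# Hypothetical columns chosen so that the three helpers return different sets.
df <- tibble(q1 = 1, q2 = 2, q10 = 3, qa = 4, quality = 5)

by_range  <- names(select(df, num_range("q", 1:5)))  # "q1" "q2"
by_regex  <- names(select(df, matches("^q\\d+$")))   # "q1" "q2" "q10"
by_prefix <- names(select(df, starts_with("q")))     # all five columns
```

num_range() keeps only the requested numbers that actually exist, the regex also picks up q10, and starts_with() pulls in the unrelated qa and quality columns.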
A practical workflow
Use num_range for survey questions or time-series with structured names.
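As one possible illustration of such a workflow, the sketch below scores a small survey. The survey tibble, its columns, and the total column are all hypothetical; the point is only that the same num_range() call can drive both the computation and the final column selection.

```r
library(dplyr)

# Hypothetical survey responses; names and values are invented for this sketch.
survey <- tibble(
  respondent = c("a", "b", "c"),
  q1 = c(4, 2, 5), q2 = c(3, 3, 4), q3 = c(5, 1, 4),
  comment = c("ok", "poor", "great")
)

scored <- survey |>
  # Sum the three question columns row by row (12, 6, 13 for this data)
  mutate(total = rowSums(across(num_range("q", 1:3)))) |>
  # Keep the identifier, the questions, and the score; drop free text
  select(respondent, num_range("q", 1:3), total)
```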
Common pitfalls
Pitfall 1: width parameter must match. If columns are q01, q02, ..., use num_range("q", 1:9, width = 2). Without width, "q01" doesn't match "q" + 1.
Pitfall 2: skipped numbers. The range does not have to be contiguous: num_range("q", c(1, 3, 5)) selects q1, q3, and q5. Just pass any integer vector.
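Both pitfalls are easy to reproduce. The sketch below uses a hypothetical tibble with zero-padded names and a gap at q04; it assumes, as the behavior described above implies, that requested numbers with no matching column are simply skipped.

```r
library(dplyr)

# Hypothetical zero-padded columns with q04 missing.
df <- tibble(q01 = 1, q02 = 2, q03 = 3, q05 = 5)

# Without width, num_range looks for q1..q5 and finds nothing.
no_width <- df |> select(num_range("q", 1:5))

# With width = 2 it looks for q01..q05 and keeps the ones that exist.
with_width <- df |> select(num_range("q", 1:5, width = 2))

# A non-contiguous vector works the same way: q01, q03, q05.
odd_only <- df |> select(num_range("q", c(1, 3, 5), width = 2))
```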
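num_range() requires the column name to be EXACTLY the prefix followed by the number, with no other characters. "q1_score" is NOT matched by num_range("q", 1:10). For mixed patterns, use matches; recent tidyselect versions also let you supply the trailing text through the suffix argument, e.g. num_range("q", 1:10, suffix = "_score").
Try it yourself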
Try it: Build a tibble with columns q1 through q5 and select only q2, q3, q4. Save to ex_mid.
Explanation: num_range with range 2:4 picks q2, q3, q4 exactly.
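One possible solution is sketched below. The tibble name survey and its values are arbitrary; only the column names q1 through q5 and the result name ex_mid come from the exercise.

```r
library(dplyr)

# Build a tibble with columns q1 through q5 (values are arbitrary).
survey <- tibble(q1 = 1:3, q2 = 4:6, q3 = 7:9, q4 = 10:12, q5 = 13:15)

# Select only the middle three questions.
ex_mid <- survey |> select(num_range("q", 2:4))

names(ex_mid)  # "q2" "q3" "q4"
```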
Related tidyselect helpers
After mastering num_range, look at:
- starts_with() / ends_with() / contains() / matches(): name-based
- everything(): all remaining
- where(): predicate
- all_of() / any_of(): explicit name vector
For irregular numeric patterns (e.g., q1, q2, q5, q10), num_range with a custom integer vector handles non-contiguous ranges.
FAQ
What does num_range do in dplyr?
num_range(prefix, range, width) selects columns whose names are prefix followed by an integer from range, optionally zero-padded to width digits.
How do I match q01, q02, ... q10 with num_range?
Pass width = 2: num_range("q", 1:10, width = 2). width pads with leading zeros.
What is the difference between num_range and matches?
num_range is for KNOWN numeric ranges. matches uses regex for unknown / variable patterns. num_range is more explicit and safer for fixed ranges.
Can num_range handle non-contiguous integers?
Yes. num_range("q", c(1, 3, 5)) selects q1, q3, q5 only.
What if my prefix has special regex characters?
num_range treats the prefix as LITERAL, so "q." is fine: it matches only names that actually begin with "q." followed by the number. For regex, use matches.
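A short sketch of the difference, using hypothetical column names that contain a dot. In num_range() the dot is just a dot, while in a matches() pattern it stands for any character.

```r
library(dplyr)

# Hypothetical names: two with a literal dot, one without.
df <- tibble(`q.1` = 1, `q.2` = 2, qx1 = 3)

# Literal prefix "q.": only q.1 and q.2 are selected.
literal <- df |> select(num_range("q.", 1:2))

# Regex "^q.\\d$": the unescaped dot also matches the "x" in qx1.
regex <- df |> select(matches("^q.\\d$"))
```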