stringr Exercises in R: 50 Real Practice Problems

Fifty practice problems on stringr in R: detect, extract, replace, split, count, pad, regex. Real scenarios with hidden solutions.

Run this once before any exercise:
library(stringr)
library(dplyr)
library(tibble)
library(stringi)

  

Section 1. Detect and match (8 problems)

Exercise 1.1: Detect substring

Difficulty: Beginner. Filter emails containing "gmail.com".

Show solution
emails <- c("a@gmail.com", "b@yahoo.com", "c@gmail.com")
# fixed() compares literally; as a regex, the "." would match any character
emails[str_detect(emails, fixed("gmail.com"))]
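Why the pattern type matters here: in a regular expression an unescaped "." matches any character, so "gmail.com" used as a regex also matches text such as "gmailxcom". The sketch below compares the two behaviours. The second address is a made-up example, not part of the exercise data.

```r
library(stringr)

# Second address is hypothetical, chosen to expose the loose regex match
emails <- c("a@gmail.com", "b@gmailxcom.net")

as_regex <- str_detect(emails, "gmail.com")         # "." matches any character
as_fixed <- str_detect(emails, fixed("gmail.com"))  # literal comparison

as_regex
as_fixed
```

Escaping the dot ("gmail\\.com") would be another way to get the literal match.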

  

Exercise 1.2: Detect at start

Difficulty: Beginner. Strings starting with "Mr ".

Show solution
x <- c("Mr Smith", "Dr Jones", "Mr Lee", "Mrs Park")
x[str_detect(x, "^Mr ")]

  

Exercise 1.3: Detect at end

Difficulty: Beginner. Filenames ending with ".csv".

Show solution
f <- c("a.csv", "b.txt", "c.csv", "d.tsv")
f[str_detect(f, "\\.csv$")]

  

Exercise 1.4: Count matches

Difficulty: Intermediate. Count vowels per word.

Show solution
str_count(c("apple","banana","sky"), "[aeiou]")

  

Exercise 1.5: Position of match

Difficulty: Intermediate. Locate first digit position in each string.

Show solution
str_locate(c("abc123","x9y","none"), "\\d")
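str_locate() returns an integer matrix with a "start" and an "end" column, and NA where nothing matched. A minimal sketch of pulling out just the start positions; the variable names are illustrative:

```r
library(stringr)

loc <- str_locate(c("abc123", "x9y", "none"), "\\d")

# One row per input string; "none" has no digit, so its row is NA
first_digit <- loc[, "start"]
first_digit
```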

  

Exercise 1.6: Match case-insensitive

Difficulty: Intermediate. Detect "ERROR" anywhere, ignoring case.

Show solution
logs <- c("Error: bad input", "INFO: ok", "error: fatal")
str_detect(logs, regex("error", ignore_case = TRUE))

  

Exercise 1.7: Multi-word match

Difficulty: Intermediate. Filter rows where description contains BOTH "fast" AND "easy".

Show solution
desc <- c("fast and easy", "slow but reliable", "easy and fast", "quick")
desc[str_detect(desc, "fast") & str_detect(desc, "easy")]

  

Exercise 1.8: Filter rows of a tibble

Difficulty: Intermediate. From a tibble, keep rows whose name starts with "A".

Show solution
tibble(name = c("Alice","Bob","Anna","Carol")) |> filter(str_detect(name, "^A"))

  

Section 2. Extract (8 problems)

Exercise 2.1: First match

Difficulty: Intermediate. Extract the first 3-digit number from each string.

Show solution
str_extract(c("abc123","45-678","xy"), "\\d{3}")

  

Exercise 2.2: All matches

Difficulty: Intermediate. Extract ALL numbers per string.

Show solution
str_extract_all(c("a1b2","x10y20","none"), "\\d+")
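str_extract_all() returns a list with one character vector per input, and an empty vector where nothing matched. A short sketch of working with that list; the follow-up steps (counting and summing) are illustrative additions, not part of the exercise:

```r
library(stringr)

nums <- str_extract_all(c("a1b2", "x10y20", "none"), "\\d+")

# How many numbers were found in each string
n_found <- lengths(nums)

# Sum the numbers per string; an empty match list sums to 0
totals <- vapply(nums, function(v) sum(as.numeric(v)), numeric(1))

n_found
totals
```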

  

Exercise 2.3: Capture groups

Difficulty: Advanced. From "user_42@x.com", capture "user" prefix and "42" id.

Show solution
str_match("user_42@x.com", "([a-z]+)_(\\d+)")
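str_match() returns a character matrix: column 1 holds the full match and each later column holds one capture group. A small sketch of picking the groups out by column; the names prefix and id are illustrative:

```r
library(stringr)

m <- str_match("user_42@x.com", "([a-z]+)_(\\d+)")

full   <- m[, 1]              # whole match
prefix <- m[, 2]              # first capture group
id     <- as.integer(m[, 3])  # second capture group, converted to a number

prefix
id
```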

  

Exercise 2.4: Extract email domain

Difficulty: Intermediate. Get the domain part of an email.

Show solution
str_extract(c("a@x.com","b@y.org"), "(?<=@).+")

  

Exercise 2.5: Extract phone area code

Difficulty: Intermediate. From "(415) 555-1234".

Show solution
str_extract("(415) 555-1234", "\\d{3}")

  

Exercise 2.6: Extract dollars

Difficulty: Intermediate. From "Total: $123.45 paid".

Show solution
str_extract("Total: $123.45 paid", "\\$\\d+\\.\\d{2}")

  

Exercise 2.7: Extract URL hostname

Difficulty: Advanced. From "https://r-statistics.co/path".

Show solution
str_extract("https://r-statistics.co/path", "(?<=://)[^/]+")

  

Exercise 2.8: Multiple groups to columns

Difficulty: Advanced. From "John Smith, 30", extract first, last, age.

Show solution
m <- str_match("John Smith, 30", "(\\w+) (\\w+), (\\d+)")
m
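To go from the match matrix to actual columns, one option is to build a tibble from the group columns. This is a sketch under assumptions: the second input string and the column names first, last, and age are made up for illustration.

```r
library(stringr)
library(tibble)

# The second entry is a hypothetical extra row
people <- c("John Smith, 30", "Jane Doe, 27")
m <- str_match(people, "(\\w+) (\\w+), (\\d+)")

# Column 1 is the full match, so the groups start at column 2
out <- tibble(
  first = m[, 2],
  last  = m[, 3],
  age   = as.integer(m[, 4])
)
out
```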

  

Section 3. Replace and modify (8 problems)

Exercise 3.1: Replace first match

Difficulty: Beginner. Replace first digit with "*".

Show solution
str_replace("abc123def", "\\d", "*")

  

Exercise 3.2: Replace all matches

Difficulty: Beginner. Replace all digits with "*".

Show solution
str_replace_all("abc123def", "\\d", "*")

  

Exercise 3.3: Strip non-digits

Difficulty: Intermediate. Normalize phone to digits only.

Show solution
str_replace_all("(415) 555-1234", "\\D", "")

  

Exercise 3.4: Replace with backreference

Difficulty: Advanced. Reformat "John Smith" to "Smith, John".

Show solution
str_replace("John Smith", "(\\w+) (\\w+)", "\\2, \\1")

  

Exercise 3.5: Replace with named groups

Difficulty: Advanced. Same swap with the capture groups named in the pattern; the replacement still refers to them by position.

Show solution
str_replace("John Smith", "(?<first>\\w+) (?<last>\\w+)", "\\2, \\1")

  

Exercise 3.6: Trim whitespace

Difficulty: Beginner. Remove leading/trailing spaces.

Show solution
str_trim(c(" hello "," world ","ok"))

  

Exercise 3.7: Squish whitespace

Difficulty: Intermediate. Collapse internal multi-spaces too.

Show solution
str_squish(" hello world ok ")

  

Exercise 3.8: Remove punctuation

Difficulty: Intermediate. Strip all punctuation.

Show solution
str_replace_all("Hello, world! Yes? No.", "[[:punct:]]", "")

  

Section 4. Split and combine (8 problems)

Exercise 4.1: Split by delimiter

Difficulty: Beginner. Split "a,b,c" on comma.

Show solution
str_split("a,b,c", ",")

  

Exercise 4.2: Split with simplify

Difficulty: Intermediate. Return a matrix.

Show solution
str_split(c("a,b","c,d","e,f"), ",", simplify = TRUE)

  

Exercise 4.3: Split into n parts

Difficulty: Intermediate. Split into max 2 pieces.

Show solution
str_split("a,b,c,d", ",", n = 2)
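With n = 2 the string is cut at the first delimiter only, and everything after it stays together in the last piece. A brief sketch of that behaviour; the names head and rest are illustrative:

```r
library(stringr)

parts <- str_split("a,b,c,d", ",", n = 2)[[1]]

head <- parts[1]  # text before the first comma
rest <- parts[2]  # remainder, later commas left untouched

parts
```

This is handy for "key,value" style lines where the value may itself contain the delimiter.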

  

Exercise 4.4: Concatenate vector

Difficulty: Beginner. Join words with " ".

Show solution
str_c(c("R","is","fun"), collapse = " ")

  

Exercise 4.5: Vectorized concatenate

Difficulty: Intermediate. Join two vectors element-wise.

Show solution
str_c(c("Hello","Hi"), c("Alice","Bob"), sep = " ")

  

Exercise 4.6: Glue-style interpolation

Difficulty: Intermediate. Use stringr's str_glue for interpolation.

Show solution
name <- "Alice"
age <- 30
str_glue("Hello {name}, age {age}")

  

Exercise 4.7: Pad to fixed width

Difficulty: Beginner. Zero-pad ID to 6 chars.

Show solution
str_pad("42", width = 6, side = "left", pad = "0")

  

Exercise 4.8: Truncate with ellipsis

Difficulty: Intermediate. Truncate to 10 chars with "...".

Show solution
str_trunc("This is a long sentence", width = 10)
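Note that the width passed to str_trunc() includes the ellipsis, so a width of 10 keeps 7 characters of text plus "...". A small sketch that checks this and also tries the side argument:

```r
library(stringr)

s <- "This is a long sentence"

right <- str_trunc(s, width = 10)                 # ellipsis at the end (default)
left  <- str_trunc(s, width = 10, side = "left")  # ellipsis at the start

right
str_length(c(right, left))
```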

  

Section 5. Case and length (8 problems)

Exercise 5.1: To lower

Difficulty: Beginner. Lowercase a string.

Show solution
str_to_lower("HELLO WORLD")

  

Exercise 5.2: To upper

Difficulty: Beginner. Uppercase.

Show solution
str_to_upper("hello")

  

Exercise 5.3: To title

Difficulty: Beginner. Title case.

Show solution
str_to_title("hello world")

  

Exercise 5.4: To sentence

Difficulty: Intermediate. Sentence case: capitalize the first letter of each sentence and lowercase the rest.

Show solution
str_to_sentence("hello world. how are you?")

  

Exercise 5.5: String length

Difficulty: Beginner. Length of each.

Show solution
str_length(c("R","stringr","x"))

  

Exercise 5.6: Reverse a string

Difficulty: Intermediate. Reverse character order.

Show solution
stringi::stri_reverse("hello")

  

Exercise 5.7: Substring by position

Difficulty: Intermediate. Get characters 2-4.

Show solution
str_sub("abcdefg", 2, 4)

  

Exercise 5.8: Replace substring by position

Difficulty: Advanced. Replace chars 2-4 with "XX".

Show solution
x <- "abcdefg"
str_sub(x, 2, 4) <- "XX"
x

  

Section 6. Real workflows (10 problems)

Exercise 6.1: Validate email format

Difficulty: Intermediate. Detect strings that look like emails.

Show solution
emails <- c("a@x.com", "not_an_email", "b@y.co.uk", "c@")
str_detect(emails, "^[\\w.]+@[\\w.]+\\.\\w{2,}$")
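The pattern above is deliberately simple: it only allows word characters and dots, so addresses with "+" or "-" are rejected. The sketch below shows one possible loosening. The extra addresses and the extended pattern are assumptions for illustration, and neither pattern is a complete email validator.

```r
library(stringr)

# Hypothetical addresses that the simple pattern rejects
tricky <- c("first.last+tag@x.com", "a-b@my-site.org")

simple   <- "^[\\w.]+@[\\w.]+\\.\\w{2,}$"
extended <- "^[\\w.+-]+@[\\w.-]+\\.\\w{2,}$"  # also allows "+" and "-"

simple_ok   <- str_detect(tricky, simple)
extended_ok <- str_detect(tricky, extended)

simple_ok
extended_ok
```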

  

Exercise 6.2: Extract hashtags

Difficulty: Intermediate. From a tweet, extract all #hashtags.

Show solution
str_extract_all("Loving #rstats and #dataviz today", "#\\w+")

  

Exercise 6.3: Parse a structured log line

Difficulty: Advanced. From "2024-01-15 ERROR [auth] timeout" extract date, level, module, msg.

Show solution
log <- "2024-01-15 ERROR [auth] timeout"
str_match(log, "(\\d{4}-\\d{2}-\\d{2}) (\\w+) \\[(\\w+)\\] (.+)")
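The same pattern works on a whole vector of log lines, and the capture groups can then be turned into typed columns. A sketch assuming a second, made-up log line and illustrative column names:

```r
library(stringr)
library(tibble)

# The second line is hypothetical
logs <- c(
  "2024-01-15 ERROR [auth] timeout",
  "2024-01-16 INFO [db] connected"
)
pattern <- "(\\d{4}-\\d{2}-\\d{2}) (\\w+) \\[(\\w+)\\] (.+)"
m <- str_match(logs, pattern)

# Column 1 is the full match; groups follow in pattern order
parsed <- tibble(
  date   = as.Date(m[, 2]),
  level  = m[, 3],
  module = m[, 4],
  msg    = m[, 5]
)
parsed
```

Lines that do not fit the pattern come back as NA in every column, which makes them easy to spot afterwards.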

  

Exercise 6.4: Clean phone numbers

Difficulty: Intermediate. Normalize "(415) 555-1234" to "+14155551234".

Show solution
str_c("+1", str_replace_all("(415) 555-1234", "\\D", ""))

  

Exercise 6.5: Detect duplicates by normalized name

Difficulty: Intermediate. After lowercasing+trimming names, find duplicates.

Show solution
names <- c(" Alice ", "BOB", "alice", "carol", "Bob ")
norm <- str_to_lower(str_trim(names))
duplicated(norm)
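duplicated() flags only the second and later occurrences. If the goal is to see every member of a duplicate group, including the first one, a sketch like the following may help; the variable names are illustrative:

```r
library(stringr)

raw_names <- c(" Alice ", "BOB", "alice", "carol", "Bob ")
norm <- str_to_lower(str_trim(raw_names))

later_copies <- duplicated(norm)                  # repeats only
in_dup_group <- norm %in% norm[duplicated(norm)]  # first occurrence included

later_copies
raw_names[in_dup_group]
```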

  

Exercise 6.6: Extract first sentence

Difficulty: Advanced. Pull the first sentence (up to ".", "!", or "?").

Show solution
str_extract("Hello world. How are you? I'm great!", "[^.!?]+[.!?]")

  

Exercise 6.7: Strip HTML tags

Difficulty: Advanced. Remove <...> tags from a string.

Show solution
str_replace_all("<p>Hello <b>world</b></p>", "<[^>]+>", "")

  

Exercise 6.8: Count words

Difficulty: Intermediate. Count words in a sentence.

Show solution
str_count("Hello world from R", "\\b\\w+\\b")
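The regex approach treats an apostrophe as a word break, so a contraction counts as two words. stringr's boundary("word") uses locale-aware word boundaries instead. A brief comparison on a made-up sentence:

```r
library(stringr)

# Hypothetical input containing a contraction
s <- "don't stop"

by_regex    <- str_count(s, "\\b\\w+\\b")      # "don", "t", "stop"
by_boundary <- str_count(s, boundary("word"))  # "don't", "stop"

by_regex
by_boundary
```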

  

Exercise 6.9: Split sentences

Difficulty: Advanced. Split paragraph into sentences.

Show solution
str_split("Hi there. How are you? I am fine.", "(?<=[.!?])\\s+")

  

Exercise 6.10: Mask sensitive info

Difficulty: Advanced. Replace credit card numbers (16 digits) with X.

Show solution
str_replace_all("Card: 4111-1111-1111-1234 expires 01/26", "\\d{4}-\\d{4}-\\d{4}-\\d{4}", "XXXX-XXXX-XXXX-XXXX")
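A common variant keeps the last four digits visible. One way to sketch it is to capture the final group and reuse it with a backreference; the choice to keep the last four digits is an assumption, not part of the exercise:

```r
library(stringr)

txt <- "Card: 4111-1111-1111-1234 expires 01/26"

# Capture only the last block of four digits and put it back with \\1
masked <- str_replace_all(
  txt,
  "\\d{4}-\\d{4}-\\d{4}-(\\d{4})",
  "XXXX-XXXX-XXXX-\\1"
)
masked
```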

  

What to do next

  • Regex-Exercises-in-R (coming), pure regex drilling.
  • tidyverse-Exercises (shipped), string ops in pipelines.
  • Data-Cleaning-Exercises (coming), strings as part of cleanup.