stringr str_to_lower() in R: Lowercase Strings With Locale
stringr str_to_lower() converts every element of a character vector to lowercase. It is vectorised, NA aware, and locale aware via the underlying stringi engine, which makes it more predictable than base R tolower() across platforms and Unicode inputs.
str_to_lower(x)                           # default English locale
str_to_lower(c("Hello", "WORLD"))         # vector input
str_to_lower(x, locale = "tr")            # Turkish dotless i rules
str_to_lower(NA_character_)               # NA, not "na" (NA-safe)
str_to_lower(c("Resume", "RESUME"))       # normalize for matching
df |> mutate(name = str_to_lower(name))   # lowercase a column
str_to_lower("CAFE")                      # diacritics untouched
str_to_lower(c("ABC", "Mixed", ""))       # empty string stays ""
Need explanation? Read on for examples and pitfalls.
What str_to_lower() does in one sentence
str_to_lower(string, locale = "en") returns a copy of the input with every character mapped to its lowercase equivalent. It works element-wise on a character vector, propagates NA inputs as NA outputs, leaves digits and punctuation unchanged, and uses Unicode-aware rules drawn from the stringi package so the result is identical on Windows, macOS, and Linux.
Use str_to_lower() whenever you need a normalized form of text, for example when comparing free-text user input to a lookup table, or when preparing values for case-insensitive joins.
The output keeps the original length, NA stays NA, and the empty string is returned unchanged.
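A minimal sketch of that behaviour, assuming stringr is installed. The input vector `x` is made up for illustration.

```r
library(stringr)

# Hypothetical input: mixed case, digits, punctuation, an NA, and an empty string
x <- c("Hello, World!", "R2-D2", NA, "")

lowered <- str_to_lower(x)
lowered
# "hello, world!" "r2-d2" NA ""

# Same length as the input, and the NA is still NA
length(lowered) == length(x)
is.na(lowered)
```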
Syntax
str_to_lower(string, locale = "en") takes two arguments. The first is the character vector you want to transform; the second is an ISO 639 language code that selects locale-specific rules. The default "en" covers most ASCII and Latin text.
Because str_to_lower() is vectorised, you can apply it to thousands of strings in one call without writing a loop.
Every element is processed independently and the original order is preserved, so str_to_lower() drops cleanly into a mutate() or sapply() pipeline.
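As a quick illustration of the signature, the sketch below calls the function with and without the locale argument and checks that one vectorised call matches an element-by-element sapply(). The vector `words` is a hypothetical example.

```r
library(stringr)

words <- c("ALPHA", "Beta", "gAMMA")   # hypothetical input

# The default locale is "en", so these two calls are equivalent
default_call  <- str_to_lower(words)
explicit_call <- str_to_lower(words, locale = "en")
identical(default_call, explicit_call)

# One vectorised call gives the same result as looping over the elements
looped <- sapply(words, str_to_lower, USE.NAMES = FALSE)
identical(default_call, looped)
```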
Unlike base R tolower(), which follows the system locale, str_to_lower() takes an explicit locale argument. Use str_to_lower() in production code that runs on multiple machines.
Five common str_to_lower() scenarios
Five scenarios cover most everyday uses of str_to_lower(). Each one stands alone.
Normalize free-text input for matching
User input rarely arrives in a consistent case. Lowercasing both sides of a comparison removes trivial mismatches without losing any information.
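A small sketch of this pattern. The lookup vector `known_fruit` and the typed values are hypothetical.

```r
library(stringr)

# Hypothetical lookup table and free-text input
known_fruit <- c("apple", "banana", "cherry")
typed <- c("Apple", "BANANA", "cherry", "Durian")

# Lowercase both sides before comparing
matched <- str_to_lower(typed) %in% str_to_lower(known_fruit)
matched
# TRUE TRUE TRUE FALSE
```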
The four typed values vary in case, but %in% returns TRUE wherever the lowercased input matches an entry in the lookup vector.
Lowercase a column inside a data frame
Most cleanup happens inside a tidyverse pipeline. Combine str_to_lower() with mutate() to rewrite a column in place.
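A minimal sketch, assuming dplyr and stringr are installed. The data frame `df`, its values, and the new column name `species_lc` are illustrative.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame with a factor column
df <- data.frame(
  species = factor(c("Adelie", "GENTOO", "Chinstrap")),
  mass_g  = c(3750, 5000, 3700)
)

# Add a lowercase character copy next to the original factor column
df <- df |>
  mutate(species_lc = str_to_lower(species))

df$species_lc
# "adelie" "gentoo" "chinstrap"
```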
Working on a copy column (species_lc) keeps the original factor untouched, which is helpful when the original case carries a label you want to preserve elsewhere.
Deduplicate a vector ignoring case
"APPLE" and "apple" are usually the same record. Lowercase before deduplicating to collapse spurious duplicates.
unique() on the lowercased vector returns three tags. Apply this pattern before any join or summary that should be case-insensitive.
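A sketch of the deduplication pattern, using a hypothetical `tags` vector:

```r
library(stringr)

tags <- c("APPLE", "apple", "Banana", "banana", "Cherry")   # hypothetical tags

case_sensitive   <- unique(tags)                 # all five values are distinct
case_insensitive <- unique(str_to_lower(tags))   # collapses case-only duplicates

case_insensitive
# "apple" "banana" "cherry"
```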
Locale-aware lowercasing for non-English text
The Turkish locale treats "I" and "i" differently from English. The default "en" rules give you a dotted i; "tr" produces the dotless ı.
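The difference is easy to see on a word containing an uppercase "I". The input `word` below is just an illustrative value.

```r
library(stringr)

word <- "TITLE"

english <- str_to_lower(word)                  # English rules: I -> i
turkish <- str_to_lower(word, locale = "tr")   # Turkish rules: I -> dotless ı

english
# "title"
turkish
# "tıtle"
```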
Pick the locale that matches your data, or stick with "en" when you want stable cross-platform output for ASCII-dominant text.
Combine with whitespace cleanup
Real text usually needs both case normalization and whitespace trimming. Chain str_to_lower() with str_squish() for a one-line cleanup.
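A short sketch of the chain, with a hypothetical `messy` vector that mixes spaces, a tab, and a newline:

```r
library(stringr)

messy <- c("  Hello   WORLD ", "\tTidy\n  DATA  ")   # hypothetical input

cleaned <- str_squish(str_to_lower(messy))
cleaned
# "hello world" "tidy data"

# Swapping the order of the two calls gives the same result
identical(cleaned, str_to_lower(str_squish(messy)))
```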
str_squish() strips leading and trailing whitespace and collapses runs of internal whitespace to single spaces. The order matters only for performance; the result is identical either way.
str_to_lower() vs tolower() vs str_to_upper() vs str_to_title()
Four functions look similar but solve different problems. Picking the wrong one usually shows up as inconsistent output across platforms or as overly aggressive casing.
| Function | Source | Locale aware? | NA safe? | Best for |
|---|---|---|---|---|
| str_to_lower(x) | stringr / stringi | yes (locale = "en" default) | yes (NA in, NA out) | tidyverse code, cross-platform output |
| tolower(x) | base R | system-dependent | yes | base-only scripts, ASCII-only data |
| str_to_upper(x) | stringr / stringi | yes | yes | uppercasing for display or codes |
| str_to_title(x) | stringr / stringi | yes | yes | proper-noun formatting |
Reach for str_to_lower() in pipelines that ship to multiple machines, tolower() in quick base R scripts, str_to_upper() when you need a uniform uppercase form, and str_to_title() when you want a Headline Style result.
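For a side-by-side view, the sketch below runs all four functions on the same plain ASCII string. The input is illustrative; locale differences only show up on non-ASCII or locale-sensitive text.

```r
library(stringr)

x <- "hello WORLD"

lower_stringr <- str_to_lower(x)   # "hello world"
lower_base    <- tolower(x)        # "hello world" (same for plain ASCII input)
upper         <- str_to_upper(x)   # "HELLO WORLD"
title         <- str_to_title(x)   # "Hello World"
```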
Common pitfalls
Three pitfalls cause most str_to_lower() surprises. Each has a one-line fix.
Forgetting that NA stays NA
str_to_lower() does not silently coerce NA into the literal string "na". That is usually what you want, but it can break code that expects every element to be a real string.
If you need a placeholder, replace NA before lowercasing with replace_na(x, "") or coalesce(x, "").
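A minimal sketch of the NA behaviour and the coalesce() fix, assuming dplyr is installed. The vector `x` is hypothetical.

```r
library(dplyr)    # for coalesce()
library(stringr)

x <- c("Alice", NA, "BOB")   # hypothetical input with a missing value

with_na <- str_to_lower(x)
with_na
# "alice" NA "bob"

# Replace NA with an empty string first if downstream code needs real strings
no_na <- str_to_lower(coalesce(x, ""))
no_na
# "alice" "" "bob"
```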
Expecting it to strip accents or punctuation
Lowercasing changes case only. Diacritics, emoji, and punctuation pass through unchanged.
To remove accents at the same time, follow up with stringi::stri_trans_general(x, "Latin-ASCII"). To strip punctuation, follow with str_remove_all(x, "[[:punct:]]").
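The three steps can be sketched as follows. The input string is made up, and the accented letter is written as a Unicode escape so the snippet does not depend on file encoding.

```r
library(stringr)

x <- "CAF\u00c9!"   # "CAFÉ!" written with a Unicode escape

lowered <- str_to_lower(x)   # case changes; accent and "!" are kept
no_accents <- stringi::stri_trans_general(lowered, "Latin-ASCII")   # é -> e
no_punct <- str_remove_all(no_accents, "[[:punct:]]")               # drop "!"

no_punct
# "cafe"
```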
Comparing without lowercasing both sides
Lowercasing only one side of a comparison still produces case mismatches. The result looks subtly wrong rather than throwing.
If you control only one side, prefer str_detect(x, regex("apple", ignore_case = TRUE)) so the comparison is explicit about case insensitivity.
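A sketch of the pitfall and both fixes, with hypothetical `typed` and `target` values:

```r
library(stringr)

typed  <- c("Apple", "APPLE", "apple pie")   # hypothetical input
target <- "Apple"

one_side   <- typed == str_to_lower(target)                 # only one side lowered
both_sides <- str_to_lower(typed) == str_to_lower(target)   # both sides lowered

# Explicit case-insensitive pattern match (matches substrings, not whole strings)
detected <- str_detect(typed, regex("apple", ignore_case = TRUE))

one_side     # FALSE FALSE FALSE
both_sides   # TRUE TRUE FALSE
detected     # TRUE TRUE TRUE
```

Note that str_detect() matches anywhere in the string, so "apple pie" counts as a hit even though it is not equal to the target.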
When given a factor, str_to_lower() returns a character vector. Wrap the result in as.factor() or use forcats::fct_relabel(f, str_to_lower) if you need to keep factor levels.
Try it yourself
Try it: Use the built-in state.name vector to produce lowercase, hyphenated state slugs (e.g. "new-york"). Save the result to ex_slugs.
Solution
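One possible solution, assuming stringr is installed (state.name ships with base R):

```r
library(stringr)

# Lowercase every state name, then replace spaces with hyphens
ex_slugs <- state.name |>
  str_to_lower() |>
  str_replace_all(" ", "-")

ex_slugs[state.name == "New York"]
# "new-york"
```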
Explanation: str_to_lower() converts each name to lowercase, then str_replace_all() swaps any spaces for hyphens so multi-word names like "New York" become URL-safe slugs.
Related stringr functions
When str_to_lower() is not quite what you need, these are the next stops:
- str_to_upper() returns the uppercase form, useful for display codes and uniform headers.
- str_to_title() capitalizes the first letter of each word for headline-style output.
- str_to_sentence() capitalizes only the first letter of the entire string.
- str_detect() checks for pattern presence; wrap the pattern in regex(ignore_case = TRUE) for case-insensitive matching.
- str_squish() trims and collapses whitespace, often paired with str_to_lower() in cleanup chains.
- The full stringr reference documents every case-conversion helper.
FAQ
What is the difference between str_to_lower() and tolower() in R?
Both convert text to lowercase, but str_to_lower() uses the stringi engine and accepts an explicit locale argument, while tolower() depends on the system locale set by the operating system. That makes str_to_lower() preferable when your code runs on multiple machines or when the input contains non-English characters that need locale-specific rules, such as Turkish dotted and dotless i.
How do I lowercase a column in a data frame?
Inside a dplyr pipeline, use mutate() with str_to_lower(): df |> mutate(name = str_to_lower(name)). The function is vectorised, so it processes the entire column in one call without a loop. If the column is a factor, str_to_lower() returns a character vector; wrap with as.factor() or use forcats::fct_relabel(name, str_to_lower) to preserve factor structure.
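The factor case can be sketched like this, assuming the forcats package is installed. The factor `f` is a hypothetical example.

```r
library(stringr)

f <- factor(c("Low", "HIGH", "Medium", "HIGH"))   # hypothetical factor

as_chr     <- str_to_lower(f)                        # plain character vector
refactored <- as.factor(str_to_lower(f))             # factor rebuilt from the lowercased values
relabelled <- forcats::fct_relabel(f, str_to_lower)  # relabels the existing levels

is.character(as_chr)
levels(relabelled)
```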
Does str_to_lower() handle accented and Unicode characters?
Yes. str_to_lower() uses Unicode-aware case mapping from the stringi package, so accented Latin letters, Cyrillic, Greek, and similar scripts are lowercased correctly. The original accents are preserved; only the case changes. To strip diacritics at the same time, follow up with stringi::stri_trans_general(x, "Latin-ASCII").
Why do Turkish characters not lowercase the way I expect?
Turkish has two distinct letter pairs: a dotless I (uppercase) that maps to dotless ı (lowercase), and a dotted İ (uppercase) that maps to dotted i (lowercase). The default "en" locale uses English rules, so "I" becomes "i". Pass locale = "tr" to str_to_lower() to get the Turkish behaviour where "I" becomes "ı". Always set the locale explicitly when you process multilingual text.
Can I use str_to_lower() for case-insensitive comparisons?
Yes, by lowercasing both sides of the comparison: str_to_lower(x) == str_to_lower(y). For pattern matching, an alternative is str_detect(x, regex("pattern", ignore_case = TRUE)), which avoids mutating the data and reads more clearly when only the comparison itself needs to be case-insensitive.