stringr str_glue_data() in R: Interpolate From a Data Frame
stringr str_glue_data() interpolates a string template using a data frame or list as the lookup source: every {expression} inside the template is evaluated against that object's columns or elements. It is str_glue() with the data passed in explicitly, which makes it the natural fit for pipelines.
str_glue_data(df, "{col}") # interpolate a data frame column
str_glue_data(lst, "{name}") # interpolate a named list element
str_glue_data(df, "{a} and {b}") # combine two columns
df |> str_glue_data("{x}") # pipe the data straight in
str_glue_data(df, "{round(price, 1)}") # run code on a column
str_glue_data(df, "{toupper(name)}") # call a function per row
str_glue_data(mtcars, "{mpg} mpg") # works on any data frame
Need explanation? Read on for examples and pitfalls.
What str_glue_data() does in one sentence
str_glue_data() fills a {brace} template by evaluating each expression against a supplied data frame, list, or environment. You hand it the data as the first argument, .x, and every name inside the braces is looked up there before anywhere else.
This is the difference from str_glue(): str_glue() reads variables from wherever the call happens, while str_glue_data() reads them from the object you pass. When that object is a data frame, the function is vectorised over its rows and returns one string per row.
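A minimal sketch of that row-wise behaviour, assuming a small hypothetical data frame called people (the name and its values are illustrative, not from the stringr documentation):

```r
library(stringr)

# Hypothetical two-row data frame used only for illustration
people <- data.frame(name = c("Ada", "Grace"), age = c(36, 45))

# {name} is looked up in `people`, once per row
greeting <- str_glue_data(people, "Hello, {name}!")
greeting
# Hello, Ada!
# Hello, Grace!
```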
The {name} slot resolves to the name column, so a two-row data frame produces two finished strings.
Syntax
str_glue_data(.x, ..., .sep = "", .envir = parent.frame()) builds strings from a data source and templates. The .x argument is the lookup object, ... holds the template strings, .sep joins them when you pass more than one, and .envir is the fallback environment for any name not found in .x.
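The sketch below exercises .sep and .envir under a couple of assumptions: parts and fallback are hypothetical objects made up for this example, and the list stands in for any lookup source.

```r
library(stringr)

# Hypothetical lookup object
parts <- list(first = "Ada", last = "Lovelace")

# Two template strings are joined with .sep before interpolation
full_name <- str_glue_data(parts, "{first}", "{last}", .sep = " ")
full_name
# Ada Lovelace

# A name missing from .x (here `title`) is looked up in .envir instead
fallback <- new.env()
fallback$title <- "Countess"
titled <- str_glue_data(parts, "{title} {first}", .envir = fallback)
titled
# Countess Ada
```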
Anything inside {} is treated as R code, not just a bare column name. Arithmetic, function calls, and comparisons all run before the string is assembled.
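As a rough illustration, the following sketch mixes a function call, arithmetic, and a comparison in one template. The orders data frame and its columns are assumptions invented for the example.

```r
library(stringr)

# Hypothetical order data
orders <- data.frame(
  item = c("pen", "ink"),
  qty  = c(3, 10),
  unit = c(1.5, 4)
)

# Each brace holds ordinary R code evaluated against the columns
summary_lines <- str_glue_data(
  orders,
  "{toupper(item)}: {qty * unit} total, bulk: {qty >= 5}"
)
summary_lines
# PEN: 4.5 total, bulk: FALSE
# INK: 40 total, bulk: TRUE
```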
Anything glue_data() supports, including custom delimiters and the transformer hook, works through str_glue_data() as well.
Five ways to use str_glue_data()
Five patterns cover almost every real use of str_glue_data(). Each block stands alone, so you can paste it straight into an R console.
Interpolate columns from a data frame
The core job is reading several columns into one sentence per row. Each {} slot names the column that belongs there.
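One way this might look, assuming a hypothetical menu data frame with item and price columns:

```r
library(stringr)

# Hypothetical menu data
menu <- data.frame(item = c("Coffee", "Bagel"), price = c(3.5, 2.25))

# One sentence per row; the dollar sign sits outside the braces
menu_lines <- str_glue_data(menu, "{item} costs ${price}")
menu_lines
# Coffee costs $3.5
# Bagel costs $2.25
```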
The $ is a literal character; only the {item} and {price} slots are interpolated, once per row.
Interpolate from a named list
A named list works the same way as a data frame. Each element name becomes available inside the braces.
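A small sketch of the list case, where profile is a hypothetical list with scalar elements:

```r
library(stringr)

# Hypothetical named list with scalar elements
profile <- list(name = "Ada", lang = "R")

bio <- str_glue_data(profile, "{name} writes {lang}.")
bio
# Ada writes R.
```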
Because the list elements are scalars, the result is a single string rather than a vector.
Pipe the data straight in
str_glue_data() takes the data as its first argument, so it slots into a pipe. You never have to repeat the data frame name inside the template.
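A sketch of the piped form, using the built-in mtcars data and the native pipe (which assumes R 4.1 or later). The head() call is only there to keep the output short.

```r
library(stringr)

# The piped data frame becomes .x, so the template uses bare column names
car_lines <- mtcars |>
  head(3) |>
  str_glue_data("{mpg} mpg, {cyl} cylinders")
car_lines
# 21 mpg, 6 cylinders
# 21 mpg, 6 cylinders
# 22.8 mpg, 4 cylinders
```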
The data frame flows in through the pipe, and each column name is read directly from it.
With str_glue() you would write {df$mpg} and repeat df for every column. str_glue_data() takes the data once, so the template stays clean: bare column names, no df$ prefix. That single design choice is the entire reason the function exists.
Run expressions against the data
Braces hold any R expression, so you can transform a column inline. The expression is evaluated with the data as its scope.
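For instance, a sketch along these lines computes summary statistics inline. The stats list and its numbers are made up for illustration.

```r
library(stringr)

# Hypothetical list holding one numeric vector
stats <- list(values = c(4, 8, 15, 16, 23, 42))

# Both expressions are evaluated with `stats` as their scope
report <- str_glue_data(
  stats,
  "Mean: {mean(values)}, range: {max(values) - min(values)}"
)
report
# Mean: 18, range: 38
```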
mean(values) and max(values) - min(values) both run before the template is filled.
Build a label for every row
Combining several columns into a label column is the most common pipeline use. Pass the columns to str_glue_data() and you get a vector ready to assign.
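A possible version of this pattern, assuming a hypothetical staff data frame:

```r
library(stringr)

# Hypothetical staff table
staff <- data.frame(
  first = c("Ada", "Alan"),
  last  = c("Lovelace", "Turing"),
  year  = c(1815, 1912)
)

# The result has one element per row, so it can be assigned as a column
staff$label <- str_glue_data(staff, "{last}, {first} ({year})")
staff$label
# Lovelace, Ada (1815)
# Turing, Alan (1912)
```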
The template fills once per row, producing a label string aligned with the data frame.
str_glue_data() vs str_glue()
Both functions interpolate {brace} templates; they differ in where the names come from. str_glue() reads from the calling environment, while str_glue_data() reads from the object you pass as .x.
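A side-by-side sketch of the two lookups, assuming a hypothetical df with name and score columns:

```r
library(stringr)

# Hypothetical scores table
df <- data.frame(name = c("Ada", "Grace"), score = c(91, 88))

# str_glue(): names come from the calling environment, so df$ is required
from_env <- str_glue("{df$name} scored {df$score}")

# str_glue_data(): names come from df itself
from_data <- str_glue_data(df, "{name} scored {score}")

all(from_env == from_data)
# TRUE
```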
Both return the same text. str_glue_data() keeps the template free of the df$ prefix, which matters most when a template references many columns.
| Function | Name lookup | Best for |
|---|---|---|
| str_glue_data() | from the .x object | data frames and lists, pipelines |
| str_glue() | from the calling environment | loose variables in scope |
| mutate(... str_glue()) | column vectors in scope | building a column in a dplyr chain |
Inside mutate() the columns are already in scope, so mutate(df, label = str_glue("{name}")) works without str_glue_data(). Reach for str_glue_data() when the data is not already in a tidyverse verb, for example at the start of a pipe or in a plain script.
Common pitfalls
Three pitfalls cause most str_glue_data() surprises. Each has a one-line fix.
A name missing from .x raises an error
Every name inside the braces must resolve, or str_glue_data() stops. If the name is not a column of .x and not found in .envir, you get the familiar "object not found" error.
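The sketch below triggers that failure on purpose and captures it with tryCatch(). The sales data frame and the unit_price name are hypothetical, and the exact wording of the error message depends on the installed glue version, so it is not reproduced here.

```r
library(stringr)

# Hypothetical data with no `unit_price` column
sales <- data.frame(item = c("pen", "ink"))

# `unit_price` is neither a column nor a variable in scope, so this errors
result <- tryCatch(
  str_glue_data(sales, "{item}: {unit_price}"),
  error = function(e) conditionMessage(e)
)

# `result` now holds the error message text instead of a glue vector
result
```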
Add the missing column to the data, or correct the name in the template.
str_glue_data() returns a glue object
The result has class glue, not a bare character vector. Most code treats it as a string, but strict type checks can trip on the extra class attribute.
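A quick check of the class, using a throwaway list as the data source:

```r
library(stringr)

out <- str_glue_data(list(name = "Ada"), "Hi {name}")

class(out)
# "glue" "character"

# The extra class makes a strict comparison fail
identical(out, "Hi Ada")
# FALSE

# Dropping the class gives a plain character vector
plain <- as.character(out)
identical(plain, "Hi Ada")
# TRUE
```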
Wrap the result in as.character() whenever a function or test demands a plain character vector.
Unmatched names fall back to the environment
A name not in .x is looked up in .envir instead of failing immediately. This is useful for mixing a constant into the template, but it can also pick up a stale variable silently.
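A sketch of the fallback, assuming a hypothetical items data frame and an outside variable named threshold:

```r
library(stringr)

# `threshold` lives in the calling environment, not in the data
threshold <- 10

# Hypothetical data; `price` is a column
items <- data.frame(item = c("pen", "lamp"), price = c(4, 25))

checked <- str_glue_data(
  items,
  "{item}: {price} (limit {threshold}, over: {price > threshold})"
)
checked
# pen: 4 (limit 10, over: FALSE)
# lamp: 25 (limit 10, over: TRUE)
```

If items later gained its own threshold column, the column would win, because .x is searched first.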
Here price is a column and threshold is an outside variable, both resolved in one template.
Try it yourself
Try it: Use str_glue_data() to turn the cities data frame into the lines "Tokyo has 14 million people." and "Paris has 2 million people." Save the result to ex_lines.
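One possible solution is sketched below. The definition of cities is an assumption reconstructed from the expected output (a city column and a pop column in millions).

```r
library(stringr)

# Assumed definition of the exercise data
cities <- data.frame(city = c("Tokyo", "Paris"), pop = c(14, 2))

ex_lines <- str_glue_data(cities, "{city} has {pop} million people.")
ex_lines
# Tokyo has 14 million people.
# Paris has 2 million people.
```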
Explanation: str_glue_data() evaluates each {} slot against the cities data frame, so {city} reads the city column and {pop} reads the pop column. The template fills once per row, returning one string per city.
Related stringr functions
When str_glue_data() is not quite the fit, these are the next stops:
- str_glue() interpolates from loose variables in scope rather than a data object.
- str_c() joins fixed pieces element-wise when there is no template to fill.
- str_flatten() collapses a vector into a single string with a separator.
- str_pad() grows a string to a fixed width by adding a pad character.
- str_replace() swaps a matched pattern for new text, a different kind of edit.
- The full stringr reference documents str_glue_data() and its arguments.
FAQ
What is the difference between str_glue_data() and str_glue() in R?
They differ in where brace names are resolved. str_glue() looks up each {} name in the calling environment, so you write {df$score} to reach a column. str_glue_data() takes a data frame or list as its first argument, .x, and resolves names against that object, so the template stays clean with bare names like {score}. Use str_glue_data() when the names live in a data object, and str_glue() when they are loose variables.
How do I use str_glue_data() with a data frame?
Pass the data frame as the first argument and write column names inside braces: str_glue_data(df, "{name}: {score}"). The function is vectorised over rows, so it returns one finished string per row. Because the data frame goes in first, str_glue_data() also works at the start of a pipe, such as df |> str_glue_data("{name}"), with no need to repeat the data frame name in the template.
Can str_glue_data() interpolate from a list?
Yes. A named list works exactly like a data frame: each element name becomes available inside the braces. str_glue_data(list(name = "Ada"), "{name}") returns "Ada". If the list elements are scalars the result is a single string, and if they are vectors of equal length the template is recycled to produce one string per element.
Why does str_glue_data() return a glue object instead of a string?
str_glue_data() returns an object of class glue, which also inherits character. Most R code treats it as an ordinary string, so printing, concatenation, and column assignment all behave normally. The extra class only matters for strict type checks: identical() against a plain string returns FALSE. Wrap the result in as.character() when a function or unit test requires a bare character vector.