glue_data() in R: String Templates for Data Frames
glue_data() interpolates a template across every row of a data frame, returning one string per row. You name each column inside {braces}, and glue_data() looks up the column by name and substitutes its values. It is the data-frame counterpart to glue(), built for labels, reports, and bulk message generation.
glue_data(mtcars, "{rownames(mtcars)}: {mpg} mpg") # one string per row
mtcars |> glue_data("{mpg} mpg, {cyl} cyl") # pipe friendly
glue_data(df, "Hi {name}, age {age}") # named columns
glue_data(df, "{name}-{format(date, '%Y')}.csv") # expressions on columns
df |> mutate(label = glue_data(pick(everything()), "{x}")) # inside mutate (dplyr >= 1.1.0, where cur_data() is deprecated)
glue_data(df, "{x}", .na = "missing") # custom NA handling
glue_collapse(glue_data(df, "{x}"), sep = ", ") # one final string
Need explanation? Read on for examples and pitfalls.
What glue_data() does in one sentence
glue_data() evaluates a template inside the scope of a data frame, returning one interpolated string per row. Every {column} slot is looked up in the data frame first, then in the calling environment, and the result is a glue vector with nrow(df) elements.
This is row-wise string interpolation. You write the sentence once and glue_data() loops it across rows, so what you get back lines up with the data frame perfectly. It is the natural fit for labels, file names, and any per-row message.
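A minimal sketch of the idea, assuming a small made-up data frame `df` with `name` and `age` columns. The `greeting` variable is likewise hypothetical; it is not a column, so it shows the fallback to the calling environment.

```r
library(glue)

# Hypothetical data, for illustration only
df <- data.frame(
  name = c("Ada", "Grace", "Linus"),
  age  = c(36, 45, 28)
)

# Not a column of df, so it is resolved in the calling environment
greeting <- "Hi"

sentences <- glue_data(df, "{greeting} {name}, you are {age} years old.")
sentences
# Hi Ada, you are 36 years old.
# Hi Grace, you are 45 years old.
# Hi Linus, you are 28 years old.
```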
Each row of df produces one finished sentence, and the result keeps the row order of the input.
Syntax
glue_data(.x, ..., .sep = "", .envir = parent.frame(), .na = "NA", .null = character()) interpolates a template across rows. The .x argument is the data frame (or list), ... are the template strings, .sep joins multiple templates, .envir is the fallback scope when a {name} is not a column, and .na controls how missing values render.
Anything inside {} is treated as R code evaluated against the data frame, not just a column name. That means you can compute on columns directly in the template.
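As a small illustration, the sketch below uses a hypothetical `items` data frame with `qty` and `price` columns and computes a line total directly inside the braces.

```r
library(glue)

# Hypothetical order lines, for illustration only
items <- data.frame(
  item  = c("pen", "notebook", "stapler"),
  qty   = c(3, 2, 1),
  price = c(1.5, 4, 12)
)

# {qty * price} is ordinary R arithmetic on the two columns
totals <- glue_data(items, "{item}: {qty} x {price} = {qty * price}")
totals
# pen: 3 x 1.5 = 4.5
# notebook: 2 x 4 = 8
# stapler: 1 x 12 = 12
```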
{qty}, {price}, and {qty * price} are each evaluated once against the whole columns. Because R arithmetic is vectorised, the results line up row by row and glue_data() pastes them together element-wise.
If you find yourself writing glue("{df$col}"), switch to glue_data(df, "{col}") for cleaner code.
Five common glue_data() scenarios
Five patterns cover almost every real use of glue_data().
Build a row label or message per record
The core job of glue_data() is producing one string per row of a data frame. It is ideal for labels, log lines, and email-style messages.
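One way this might look, using a hypothetical three-row `orders` data frame (the column names are invented for the example):

```r
library(glue)

# Hypothetical orders, for illustration only
orders <- data.frame(
  id       = c(101, 102, 103),
  customer = c("Ada", "Grace", "Linus"),
  total    = c(25.5, 40, 12.75)
)

messages <- glue_data(orders, "Order {id} for {customer}: total {total}")
messages
# Order 101 for Ada: total 25.5
# Order 102 for Grace: total 40
# Order 103 for Linus: total 12.75
```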
The result is a length-3 glue vector aligned with the rows of orders, ready to print, write to a file, or feed into another function.
Use inside a dplyr mutate to add a label column
glue_data() pairs with mutate() to create derived text columns. Inside a dplyr pipeline you usually use glue() directly because mutate already evaluates expressions in the data scope, but glue_data() is the explicit form when you have a stand-alone data frame.
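The sketch below contrasts the two forms on a hypothetical `scores` data frame. It assumes dplyr is installed and R is at least 4.1 for the native pipe.

```r
library(dplyr)
library(glue)

# Hypothetical scores, for illustration only
scores <- data.frame(
  student = c("Ada", "Grace"),
  score   = c(91, 84)
)

# Inside mutate(), glue() already sees the columns
with_glue <- scores |>
  mutate(label = glue("{student} scored {score}"))

# Explicit form: pass the stand-alone data frame to glue_data()
labels <- glue_data(scores, "{student} scored {score}")
labels
# Ada scored 91
# Grace scored 84
```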
Inside mutate(), columns are already in scope, so glue() does the same job. Reach for glue_data() when the data frame is not the active context, like when you pass it explicitly to a helper.
Generate file names or paths from rows
Row-driven file naming is a classic glue_data() use case. Combine column values into a slug-style name and write each row to its own file.
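A rough sketch, assuming a hypothetical `reports` data frame with a `region` column and a `Date` column. The files are written to a temporary directory so nothing lands in your working directory.

```r
library(glue)

# Hypothetical report metadata, for illustration only
reports <- data.frame(
  region = c("north", "south"),
  date   = as.Date(c("2024-01-15", "2024-02-20"))
)

paths <- glue_data(reports, "{region}-{format(date, '%Y%m%d')}.csv")
paths
# north-20240115.csv
# south-20240220.csv

# Write each row to its own file under a temporary directory
out_dir <- tempdir()
for (i in seq_len(nrow(reports))) {
  write.csv(reports[i, ], file.path(out_dir, as.character(paths[i])),
            row.names = FALSE)
}
```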
The format() call is vectorised over the date column, producing a consistent date stamp for every output path.
Handle missing values explicitly
The .na argument controls how NA values render. Without it, NAs print as the literal string "NA", which can look messy in user-facing output.
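For instance, with a hypothetical `contacts` data frame in which one phone number is missing:

```r
library(glue)

# Hypothetical contacts, for illustration only
contacts <- data.frame(
  name  = c("Ada", "Grace"),
  phone = c("555-0100", NA)
)

default_out <- glue_data(contacts, "{name}: {phone}")
default_out
# Ada: 555-0100
# Grace: NA

custom_out <- glue_data(contacts, "{name}: {phone}", .na = "unknown")
custom_out
# Ada: 555-0100
# Grace: unknown
```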
Pick a placeholder that fits the context: "unknown" for reports, "-" for tables, or "" to drop the slot.
Collapse the per-row vector to one string
Pair glue_data() with glue_collapse() to fold the per-row vector into a single string. Useful for summaries, bullet lists, and prompts.
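A minimal sketch with a hypothetical `tasks` data frame: build one bullet per row, then fold the bullets into a single string.

```r
library(glue)

# Hypothetical task list, for illustration only
tasks <- data.frame(
  task  = c("Write report", "Review PR"),
  owner = c("Ada", "Linus")
)

bullets <- glue_data(tasks, "- {task} ({owner})")    # one element per row
summary_text <- glue_collapse(bullets, sep = "\n")   # a single string

cat(summary_text, "\n")
```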
The intermediate bullets vector is one row per element. glue_collapse() joins them with newlines into a single multi-line string ready for cat().
glue_data() vs glue()
Use glue_data() when the values live in a data frame; use glue() when they live as separate objects. Both functions share the same brace syntax, but their lookup rule is different.
| Question | glue() | glue_data() |
|---|---|---|
| First argument | template string | data frame or list |
| Lookup order | .envir only | .x first, then .envir |
| Result length | length of longest brace value | length of longest brace value (nrow(.x) when a column is referenced) |
| Best for | one-off messages | row labels, reports, file names |
| Inside dplyr mutate | preferred | rarely needed (mutate already scopes the data) |
If a template references columns of a data frame, glue_data() is the explicit, readable choice. If it references plain variables, glue() is shorter.
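To make the comparison concrete, here is a small sketch with a hypothetical data frame `df` (columns `col` and `x`) and a plain variable `who`. Both spellings of the data-frame template produce the same strings.

```r
library(glue)

# Hypothetical data, for illustration only
df <- data.frame(col = c("a", "b"), x = c(1, 2))

via_glue      <- glue("{df$col}-{df$x}")      # values reached through df$
via_glue_data <- glue_data(df, "{col}-{x}")   # columns looked up in df
via_glue_data
# a-1
# b-2

# Plain variables: glue() is the shorter choice
who <- "world"
hello <- glue("Hello {who}")
```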
Avoid df$col inside a glue() template. Replacing glue("{df$col}-{df$x}") with glue_data(df, "{col}-{x}") is shorter, faster, and reads like the template you would write by hand on paper.
Common pitfalls
Three pitfalls account for almost every glue_data() error. Each has a one-line fix.
Passing a non-tabular object. glue_data() expects a data frame, list, or environment. Atomic vectors fail because the function tries to look up names inside them.
Convert to a list (or one-row data frame) and the same template works.
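The conversion could look like the sketch below. `person` is a hypothetical named character vector; only the two working forms are shown.

```r
library(glue)

# Hypothetical named atomic vector, for illustration only
person <- c(name = "Ada", lang = "R")

# Option 1: convert to a list
from_list <- glue_data(as.list(person), "{name} writes {lang}")

# Option 2: build a one-row data frame
from_df <- glue_data(data.frame(name = "Ada", lang = "R"), "{name} writes {lang}")

from_list
# Ada writes R
```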
Forgetting that result length equals row count. The output has length nrow(.x) whenever the template references at least one column, even when it ignores the others. If you wanted one summary string, follow up with glue_collapse().
Column name shadowed by a same-named variable in scope. When .x has a column called name and the calling environment also has a variable name, glue_data() uses the column. That is usually the intent, but it can surprise you if you meant the outer variable. Rename one of them to remove the ambiguity.
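A short sketch of the shadowing rule, using a hypothetical `people` data frame and an outer variable that shares the column's name. Copying the outer value to a differently named variable (`outer_name`, invented here) is one way to remove the ambiguity.

```r
library(glue)

name <- "outer value"                            # variable in the calling environment
people <- data.frame(name = c("Ada", "Grace"))   # column with the same name

shadowed <- glue_data(people, "{name}")
shadowed
# Ada
# Grace

# Rename the outer variable so both can be used in one template
outer_name <- name
both <- glue_data(people, "{outer_name} / {name}")
both
# outer value / Ada
# outer value / Grace
```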
Try it yourself
Try it: Use glue_data() to build a one-line label per row of mtcars containing the row name and miles per gallon, for the first 3 rows. Save the result to ex_labels.
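One possible solution is sketched below. The intermediate name `cars3` is just a convenience for this example.

```r
library(glue)

# Keep the first three rows and copy the row names into a real column
cars3 <- head(mtcars, 3)
cars3$car <- rownames(cars3)

ex_labels <- glue_data(cars3, "{car}: {mpg} mpg")
ex_labels
# Mazda RX4: 21 mpg
# Mazda RX4 Wag: 21 mpg
# Datsun 710: 22.8 mpg
```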
Explanation: Row names are not a column of mtcars, so we copy them into a car column first. glue_data() then looks up car and mpg in the data frame and returns one string per row.
Related glue functions
- glue(): interpolate variables that live as separate objects in scope.
- glue_collapse(): fold a glue vector into a single string with a separator.
- glue_sql(): interpolation that quotes values safely for SQL, given a database connection.
- glue_safe(): like glue(), but only looks up variable names and does not evaluate arbitrary code inside the braces.
- str_glue_data(): the stringr wrapper around glue_data().
For the full reference, see the glue package documentation.
FAQ
What is the difference between glue() and glue_data() in R?
glue() looks up brace names in the calling environment only, while glue_data() looks them up in a data frame (or list) first and falls back to the environment. Use glue_data() when your values live as columns of a data frame; use glue() when they live as separate variables. They share every other argument, including .sep, .na, and the brace delimiters.
How do I use glue_data() inside a dplyr pipeline?
You can pipe a data frame straight into glue_data() with df |> glue_data("{col}"), which returns a character vector. Inside mutate(), columns are already in scope, so a plain glue() call works there too: mutate(label = glue("{col1}-{col2}")). Reach for glue_data() when the pipeline returns a data frame you want to label outside of mutate().
Why does glue_data() return multiple strings instead of one?
glue_data() is vectorised over the columns of the input data frame, so a template that references a column returns one string per row and the result length equals nrow(.x). That is the function's job: produce a row-aligned vector. If you want a single combined string, pass the result to glue_collapse() with a separator like ", " or "\n".
Can I use expressions like sum() or paste0() inside glue_data() braces?
Yes. Anything inside {} is treated as R code evaluated against the data frame, so {toupper(name)}, {round(score, 1)}, and {format(date, '%Y')} all work. Each column is available as a full vector, so vectorised functions return one value per row, while a summary such as sum(score) is computed over the whole column and the single result is recycled into every row's string.
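The sketch below mixes vectorised functions with a column-wide summary on a hypothetical `results` data frame.

```r
library(glue)

# Hypothetical results, for illustration only
results <- data.frame(
  name  = c("ada", "grace"),
  score = c(90.5, 79.5)
)

# toupper() and round() work element-wise; sum() sees the entire column
lines <- glue_data(results, "{toupper(name)}: {round(score, 1)} of {sum(score)} total")
lines
# ADA: 90.5 of 170 total
# GRACE: 79.5 of 170 total
```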
How do I handle NA values when interpolating with glue_data()?
Pass the .na argument to control the placeholder for missing values. The default is the literal "NA", which prints in the template; setting .na = "unknown" or .na = "-" replaces every NA with that text. Set .na = NULL to keep the NA propagation behaviour where any NA in a slot returns NA for the whole string.
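The three behaviours side by side, on a hypothetical one-column data frame `guests`:

```r
library(glue)

# Hypothetical data with one missing name
guests <- data.frame(name = c("Ada", NA))

literal_na  <- glue_data(guests, "Hello {name}")              # default .na = "NA"
placeholder <- glue_data(guests, "Hello {name}", .na = "-")   # custom placeholder
propagated  <- glue_data(guests, "Hello {name}", .na = NULL)  # NA propagates

is.na(propagated)
# FALSE  TRUE
```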