tidyr hoist() in R: Extract Specific List Column Elements
The hoist() function in tidyr extracts SPECIFIC named elements from a list column into new top-level columns. Unlike unnest_wider() which spreads ALL elements, hoist picks only what you ask for.
df |> hoist(json_col, name = "name", age = "age")
df |> hoist(json_col, city = list("address", "city")) # deep path
df |> hoist(json_col, n = "count", .remove = FALSE)
df |> unnest_wider(json_col) # different: ALL elements
Need explanation? Read on for examples and pitfalls.
What hoist() does in one sentence
hoist(.data, .col, ...) extracts SPECIFIC named (or pathed) elements from a list column into new columns. By default the extracted elements are removed from the list column, and the list column itself is dropped only when nothing is left in it. Unlike unnest_wider, you specify exactly which fields to extract.
Syntax
hoist(.data, .col, ..., .remove = TRUE, .simplify = TRUE, .ptype = NULL, .transform = NULL). Each argument in ... is new_col_name = "field_name" or new_col_name = list(path).
Use hoist when you only need a few fields from a deeply nested list column. unnest_wider extracts ALL fields, which is often more than you want for downstream analysis.
Five common patterns
1. Extract two specific fields
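A minimal sketch of this pattern, assuming a small made-up tibble (the users data and its field names are hypothetical, not taken from the original article):

```r
library(tidyr)

# Hypothetical sample data: one named list per row
users <- tibble(
  id = 1:2,
  json_col = list(
    list(name = "Ana", age = 31, city = "Lima"),
    list(name = "Bo", age = 45, city = "Oslo")
  )
)

# Pull out only name and age as top-level columns
users_flat <- users |> hoist(json_col, name = "name", age = "age")
users_flat
```

The city field was not requested, so it stays inside json_col.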
2. Deep path via list
The list path navigates nested structures.
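As an illustration, the sketch below reaches into a nested address element. The people data is hypothetical and only meant to show the shape of the call:

```r
library(tidyr)

# Hypothetical sample data with one level of nesting under "address"
people <- tibble(
  id = 1:2,
  json_col = list(
    list(name = "Ana", address = list(city = "Lima", zip = "15001")),
    list(name = "Bo", address = list(city = "Oslo", zip = "0150"))
  )
)

# list("address", "city") means: first take "address", then "city" inside it
people_city <- people |> hoist(json_col, city = list("address", "city"))
people_city
```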
3. Keep the original list column
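The following sketch contrasts .remove = FALSE with the default, using a hypothetical events tibble. It is meant only to show what each setting leaves behind in the list column:

```r
library(tidyr)

# Hypothetical sample data
events <- tibble(
  id = 1:2,
  json_col = list(
    list(count = 3, source = "web"),
    list(count = 7, source = "app")
  )
)

# .remove = FALSE: json_col is left untouched, count included
kept <- events |> hoist(json_col, n = "count", .remove = FALSE)

# Default (.remove = TRUE): count is taken out of json_col, source remains
stripped <- events |> hoist(json_col, n = "count")

names(kept$json_col[[1]])
names(stripped$json_col[[1]])
```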
4. Specify type
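One way to do this is through .ptype, which takes a named list of prototypes. The sketch below assumes whole-number ages stored as doubles (as often happens after JSON parsing) and asks for an integer column. The data is hypothetical:

```r
library(tidyr)

# Hypothetical sample data: ages are doubles, not integers
users <- tibble(
  id = 1:2,
  json_col = list(
    list(name = "Ana", age = 31),
    list(name = "Bo", age = 45)
  )
)

# Request an integer column for age via a named list of prototypes
typed <- users |>
  hoist(json_col, age = "age", .ptype = list(age = integer()))

class(typed$age)
```

The cast is expected to fail if a value cannot be converted without loss (for example 31.5), which makes .ptype useful as a lightweight type check.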
5. Combine with unnest_wider for a hybrid approach
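The original example for this pattern is not shown, so the sketch below is one possible reading: hoist a scalar field and a nested sub-list, then spread only that sub-list with unnest_wider. The accounts data and its field names are hypothetical:

```r
library(tidyr)

# Hypothetical sample data
accounts <- tibble(
  id = 1:2,
  json_col = list(
    list(name = "Ana", plan = "pro",
         address = list(city = "Lima", zip = "15001")),
    list(name = "Bo", plan = "free",
         address = list(city = "Oslo", zip = "0150"))
  )
)

hybrid <- accounts |>
  hoist(json_col, name = "name", address = "address") |>  # pick two elements
  unnest_wider(address)                                   # spread just the address

names(hybrid)
```

This keeps the selective step and the spread-everything step separate, so unneeded fields such as plan never become columns.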
hoist is more efficient than unnest_wider when you only need a few fields. unnest_wider creates a column for every name in every list cell; hoist creates only the columns you ask for. For wide JSON with 50+ fields where you only need 3, hoist is much cleaner.
hoist() vs unnest_wider() vs purrr::map
| Function | Extracts | Best for |
|---|---|---|
| hoist(col, x = "x") | Specific named fields | A few fields from many |
| unnest_wider(col) | ALL named fields | Most fields needed |
| purrr::map_chr(col, "x") | One field, vector output | Quick scalar extraction |
When to use which:
- hoist for selective extraction with deep paths.
- unnest_wider when you want everything.
- purrr::map_* for one-off vector extraction.
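To make the comparison concrete, here is a small side-by-side sketch on a hypothetical items tibble. All three calls read the same list column but return different shapes:

```r
library(tidyr)
library(purrr)

# Hypothetical sample data
items <- tibble(
  id = 1:2,
  col = list(
    list(x = "a", y = 1, z = TRUE),
    list(x = "b", y = 2, z = FALSE)
  )
)

picked <- items |> hoist(col, x = "x")   # data frame: new column x, rest stays in col
spread <- items |> unnest_wider(col)     # data frame: one column per field (x, y, z)
x_only <- map_chr(items$col, "x")        # plain character vector, no data frame
```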
A practical workflow
Use hoist for selective JSON parsing in API response pipelines.
Extract only the fields you need; ignore the rest of the JSON.
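A sketch of such a pipeline, assuming the response body has already been fetched as a JSON string. The body text and every field name in it are invented for illustration, and jsonlite::fromJSON is used with simplifyVector = FALSE so the result stays a nested list:

```r
library(tidyr)
library(jsonlite)

# Hypothetical API response body (not from a real service)
body <- '[
  {"id": 1, "user": {"login": "ana", "plan": "pro"},  "stats": {"stars": 12}},
  {"id": 2, "user": {"login": "bo",  "plan": "free"}, "stats": {"stars": 3}}
]'

# Keep the nested structure as plain R lists, one element per record
records <- fromJSON(body, simplifyVector = FALSE)

# One row per record, then hoist only the fields needed downstream
api_df <- tibble(record = records) |>
  hoist(
    record,
    id    = "id",
    login = list("user", "login"),
    stars = list("stats", "stars")
  )

api_df
```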
Common pitfalls
Pitfall 1: deep path syntax. Use list("a", "b", "c") for nested paths, not "a.b.c". Strings are field names; lists are path navigators.
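The difference can be seen in a short sketch with a hypothetical three-level list. The dotted string is treated as a single field name, which does not exist here:

```r
library(tidyr)

# Hypothetical sample data: a value nested three levels deep
nested <- tibble(
  col = list(list(a = list(b = list(c = "deep"))))
)

# "a.b.c" is looked up as one field name; no such field exists, so x is NA
wrong <- nested |> hoist(col, x = "a.b.c", .remove = FALSE)

# list("a", "b", "c") walks a, then b, then c
right <- nested |> hoist(col, x = list("a", "b", "c"), .remove = FALSE)
```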
Pitfall 2: missing fields. If a list cell lacks the requested field, the new column has NA for that row. Useful for sparse data.
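For example, in the hypothetical sparse tibble below the second row has no age field, and hoist fills that position with NA instead of raising an error:

```r
library(tidyr)

# Hypothetical sample data: the second cell has no "age"
sparse <- tibble(
  id = 1:3,
  info = list(
    list(name = "Ana", age = 31),
    list(name = "Bo"),
    list(name = "Cy", age = 27)
  )
)

ages <- sparse |> hoist(info, age = "age")
ages$age
```

Because a missing field and a misspelled field name both produce NA, a column that is entirely NA is worth double-checking against the source data.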
With .remove = TRUE (the default), hoist() removes the extracted elements from the source list column and drops the column entirely once nothing is left in it. Pass .remove = FALSE if you want the list column left intact for further extraction.
Try it yourself
Try it: Extract only "age" from a list column. Save to ex_age.
Solution:
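The exercise's setup code is not shown, so the sketch below assumes a hypothetical people tibble with a list column named info. Only the hoist call is the actual answer:

```r
library(tidyr)

# Hypothetical exercise data (stand-in for the missing setup)
people <- tibble(
  id = 1:2,
  info = list(
    list(name = "Ana", age = 31),
    list(name = "Bo", age = 45)
  )
)

# Extract only "age" into its own column
ex_age <- people |> hoist(info, age = "age")
ex_age
```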
Explanation: hoist extracts only the "age" field into its own column. The other fields are not discarded: they stay in info, and only age is removed from it (because .remove = TRUE).
Related tidyr / purrr functions
After mastering hoist, look at:
- unnest_wider(): extract all fields
- unnest_longer(): vectors to rows
- purrr::map_chr() / map_dbl(): scalar extraction
- jsonlite::fromJSON(): JSON to R objects
- pluck(): deep list navigation
FAQ
What does hoist do in tidyr?
hoist(.data, .col, ...) extracts specific named elements from a list column into new columns. Unlike unnest_wider, it picks only what you specify.
What is the difference between hoist and unnest_wider?
hoist extracts SPECIFIC fields (you name them). unnest_wider extracts ALL fields. hoist is selective; unnest_wider is comprehensive.
How do I extract a deeply nested field with hoist?
Use list("path", "to", "field"): hoist(df, col, x = list("a", "b", "c")) navigates a, then b, then c inside each cell of col, like cell$a$b$c.
Does hoist remove the original list column?
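Not entirely. By default (.remove = TRUE), hoist removes the extracted elements from the list column and drops the column only if nothing remains in it. Pass .remove = FALSE to keep the column unchanged.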
What happens if a field is missing in some cells?
That row's new column gets NA. hoist tolerates missing fields gracefully.