tibble() in R: Build Tibbles Column by Column
The tibble() function in the tibble package builds modern data frames column-by-column from named vectors, with stricter rules and cleaner printing than data.frame().
tibble(x = 1:3, y = c("a","b","c")) # by column
tibble(x = 1:3, y = x * 2) # refer to prior col
tibble(x = 1, y = 1:4) # length-1 recycles
tibble(.rows = 5) # empty with N rows
tibble(`bad col` = 1:3) # back-ticked names
tibble(z = list(1:2, 1:3, 1:5)) # list-column
as_tibble(mtcars, rownames = "car") # promote a data.frame

Need explanation? Read on for examples and pitfalls.
What tibble() does in one sentence
tibble() builds a data frame column-by-column. You pass named vectors of the same length (or length 1 for recycling), and the function returns a tibble: a stricter, better-printing variant of data.frame. Unlike base R, it never converts strings to factors, never matches column names by prefix, and never auto-creates row names.
When you reach for tibble(), you usually want one of three things: to build a small data frame inline from raw vectors, to prototype a wider transformation step, or to hand cleaner output to ggplot2 and dplyr. tibble() supports all three directly, which is why the tidyverse uses tibbles as its default frame type.
Syntax
tibble() accepts named expressions and a few control arguments. Each named expression becomes a column. Later columns can refer to earlier ones in the same call.
The full signature is:
tibble(..., .rows = NULL, .name_repair = "check_unique")
- ... is one or more name = vector pairs.
- .rows sets the row count when no vectors are supplied (for empty-shell frames).
- .name_repair controls how duplicate or empty column names are handled: "check_unique" (default, errors on duplicates), "unique" (deduplicates), "minimal" (keeps as-is), or "universal" (also fixes syntax).
The return value has class c("tbl_df", "tbl", "data.frame"), so any function expecting a data frame still works on it.
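The class vector can be checked directly. The snippet below is a minimal sketch; the tibble `tb` and its columns are illustrative and not taken from the text above.

```r
library(tibble)

# Illustrative tibble (names are assumptions for this sketch).
tb <- tibble(x = 1:3, y = c("a", "b", "c"))

# A tibble carries three classes; "data.frame" is the last one.
class(tb)

# Base functions written for data frames accept it unchanged.
is.data.frame(tb)
nrow(tb)
```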
In tibble(), later arguments can use names defined earlier in the same call: tibble(x = 1:3, y = x * 2) is valid. This works because tibble() evaluates its arguments lazily and in order, so each expression can see the columns built before it.

Six common patterns
1. Build from named vectors
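A minimal sketch of this pattern, assuming two equal-length vectors; the names `people`, `name`, and `age` are illustrative.

```r
library(tibble)

# Each named argument becomes one column; both vectors have length 3.
people <- tibble(
  name = c("Ann", "Bo", "Cy"),
  age  = c(31, 27, 45)
)

people
```

Strings stay character vectors; no factor conversion takes place.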
2. Use a prior column inside the same call
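A short sketch of this pattern; the `sales` tibble and its columns are made up for illustration.

```r
library(tibble)

sales <- tibble(
  units   = c(2, 5, 3),
  price   = c(9.5, 4, 12),
  revenue = units * price  # refers to the two columns defined just above
)

sales$revenue
```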
This sequential evaluation is the most-cited reason to prefer tibble() over data.frame().
3. Recycle length-1 values
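A sketch of the recycling rule, assuming a four-row tibble with a constant `group` column; the other column names are illustrative.

```r
library(tibble)

runs <- tibble(
  id    = 1:4,
  value = c(0.2, 0.4, 0.6, 0.8),
  group = "A"  # length 1, so it is repeated for every row
)

runs$group
```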
group = "A" is recycled to length 4 to match the other columns. Any other length mismatch (length 2 paired with length 4, for example) errors.
4. List-columns for nested data
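A minimal sketch of a list-column holding vectors of different lengths; the `nested` tibble and the `id` column are illustrative.

```r
library(tibble)

nested <- tibble(
  id = 1:3,
  z  = list(1:2, 1:3, 1:5)  # one list element per row, lengths may differ
)

# Each row's element keeps its own length.
lengths(nested$z)
```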
List-columns are first-class in tibbles. They cleanly hold ragged data, nested models, or any object you want stored per row.
5. Empty shell with a row count
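A sketch of the empty-shell pattern; the object names and the `id` column added afterwards are illustrative.

```r
library(tibble)

# Five rows, no columns yet.
shell <- tibble(.rows = 5)
dim(shell)

# Attach a column later; its length must match the row count.
filled <- add_column(shell, id = 1:5)
filled
```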
Useful when you want to assemble columns one at a time with add_column() or mutate().
6. Allow non-syntactic column names
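A sketch with two non-syntactic names, wrapped in backticks; both names are illustrative.

```r
library(tibble)

odd <- tibble(
  `bad col` = 1:3,
  `2024`    = c("a", "b", "c")
)

names(odd)

# Backticks are also needed when accessing such a column with $.
odd$`bad col`
```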
tibble() keeps these names exactly. data.frame() would mangle them via make.names() unless you pass check.names = FALSE.
tibble() vs data.frame()
tibble() enforces stricter, more predictable rules. That predictability is the real value. No surprise type conversions, no silent recycling, no prefix matching, no row-name magic. The cost is a small learning curve for anyone arriving from base R workflows.
| Behavior | tibble() | data.frame() |
|---|---|---|
| Strings to factors | Never | Was default before R 4.0 |
| Column names | Preserved | Mangled by make.names() |
| Recycling | Length 1 only | Any divisor of N |
| Refer to prior col | Allowed | Errors |
| Row names | None | Always present |
| Partial matching | No | df$na matches name |
| Printing | First 10 rows, fitted width | Full frame, all columns |
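Two rows of the table, column names and partial matching, can be checked side by side. This is a sketch with illustrative data; `df` and `tb` are not from the original article.

```r
library(tibble)

df <- data.frame(`bad col` = 1:2, name = c("a", "b"))
tb <- tibble(`bad col` = 1:2, name = c("a", "b"))

names(df)  # data.frame() rewrites the non-syntactic name
names(tb)  # tibble() keeps it as written

# Partial matching: "na" resolves to "name" on the data frame only.
df$na
suppressWarnings(tb$na)  # NULL; without suppressWarnings() a warning is raised
```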
When to use which:
- Use tibble() for tidyverse pipelines, interactive analysis, and any code that will be read by someone else later.
- Use data.frame() when you need legacy compatibility (older packages that rely on row names) or zero tidyverse dependencies.
Common pitfalls
Pitfall 1: vector length mismatch. tibble() recycles only length-1 vectors. Anything else errors.
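A sketch of the failure, caught with tryCatch() so the script keeps running; the replacement string is illustrative.

```r
library(tibble)

# Length 2 against length 4 is rejected rather than recycled.
result <- tryCatch(
  tibble(x = 1:4, y = 1:2),
  error = function(e) "length mismatch rejected"
)
result

# data.frame() accepts the same input and repeats y twice.
nrow(data.frame(x = 1:4, y = 1:2))
```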
Pitfall 2: forgetting as_tibble() for existing frames. tibble() builds fresh from vectors. To promote a data.frame to a tibble, use as_tibble().
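A minimal sketch of the conversion, using the built-in iris data frame as an example input.

```r
library(tibble)

# iris is a plain data.frame shipped with R.
is_tibble(iris)

iris_tbl <- as_tibble(iris)
is_tibble(iris_tbl)

# Shape and column names carry over unchanged.
dim(iris_tbl)
```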
tibble(x = 1, x = 2) raises an error because .name_repair = "check_unique" blocks it. Pass .name_repair = "minimal" to keep duplicates intentionally, or "unique" to auto-suffix them.

Pitfall 3: expecting row names. Tibbles do not carry row names. If your workflow depends on them (heatmaps keyed off rownames, for example), use rownames_to_column() first to preserve them as a real column before converting.
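A sketch of preserving row names before converting, using the built-in mtcars data; the column name "car" is an arbitrary choice.

```r
library(tibble)

# Move the row names into a regular column, then convert.
cars_df  <- rownames_to_column(mtcars, var = "car")
cars_tbl <- as_tibble(cars_df)

head(cars_tbl$car, 3)
```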
Try it yourself
Try it: Build a tibble named ex_grades with three students (Aki, Mei, Sol), their scores (88, 75, 92), and a grade column computed as "A" if score is at least 85, else "B".
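One possible solution, sketched under the assumption that the columns are called `student`, `score`, and `grade` (the exercise fixes only the tibble name and the values):

```r
library(tibble)

ex_grades <- tibble(
  student = c("Aki", "Mei", "Sol"),
  score   = c(88, 75, 92),
  grade   = ifelse(score >= 85, "A", "B")  # uses the score column above
)

ex_grades
```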
Explanation: Inside tibble(), the grade column can reference score because columns evaluate in declaration order. ifelse() is vectorized, so it produces one value per row.
Related tibble functions
After mastering tibble(), look at:
- as_tibble(): convert a data.frame, matrix, or list into a tibble.
- tribble(): build a tibble row-by-row, useful for fixed lookup tables.
- enframe(): convert a named vector into a two-column tibble.
- add_row() and add_column(): append rows or columns to an existing tibble.
- glimpse(): print a tibble structure horizontally, one line per column.
For the package as a whole, the official tibble reference lists every constructor, helper, and option.
FAQ
What is the difference between tibble and data.frame in R?
A tibble is a data.frame with stricter rules and nicer printing. It never converts strings to factors, never mangles column names, never auto-creates row names, and only recycles length-1 vectors. Printing shows the first 10 rows and only the columns that fit the console width, with the dimensions at the top and each column's type shown under its name. Every tibble inherits from data.frame, so any function that expects a data.frame still works.
Do I need the tibble package or is it in tidyverse?
Both work. library(tibble) loads only tibble(), as_tibble(), and related helpers. library(tidyverse) loads tibble along with dplyr, ggplot2, tidyr, and the rest in one call. For lightweight scripts or package code, importing only tibble keeps the dependency footprint small.
How do I convert a data.frame to a tibble?
Use as_tibble(): as_tibble(mtcars). Pass rownames = "name_of_col" to keep row names as a real column: as_tibble(mtcars, rownames = "car"). The reverse direction, as.data.frame(tbl), drops the tibble class, but you rarely need it because tibbles already behave as data frames.
Can a tibble have duplicate column names?
By default no. tibble() uses .name_repair = "check_unique", which errors on duplicates. Pass .name_repair = "minimal" to keep duplicates intentionally (useful when you round-trip data from a system that allows them), or "unique" to auto-suffix the dupes.
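The repair modes can be compared in a short sketch. The column values are illustrative, and the "unique" call also prints a message describing the renaming.

```r
library(tibble)

# Default: duplicates are an error, caught here so the script continues.
default_result <- tryCatch(
  tibble(x = 1, x = 2),
  error = function(e) "duplicate names rejected"
)
default_result

# "minimal" keeps both columns named x.
dup_min <- tibble(x = 1, x = 2, .name_repair = "minimal")
names(dup_min)

# "unique" adds suffixes so every name is distinct.
dup_unique <- tibble(x = 1, x = 2, .name_repair = "unique")
names(dup_unique)
```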
Why does tibble print only 10 rows?
To prevent runaway console output. Use print(tbl, n = 100) to show more, or set per-session defaults with options(tibble.print_max = 50, tibble.print_min = 20). The compact default is one of the main reasons interactive users prefer tibbles over base data frames.
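A sketch of overriding the row limit for a single print call; the 100-row tibble `big` is illustrative.

```r
library(tibble)

big <- tibble(i = 1:100, sq = i^2)

# Default printing shows only the first rows of a long tibble.
print(big)

# Ask for more rows in this one call.
print(big, n = 30)
```

Setting the limit per call leaves the session options untouched, which suits scripts shared with other people.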