tibble add_column() in R: Append Columns to a Data Frame
The add_column() function in the tibble package adds one or more columns to an existing tibble using name-value pairs, with optional .before and .after arguments to control insert position.
add_column(df, z = 1:3)                              # append column at end
add_column(df, z = 1:3, w = 0)                       # multiple columns
add_column(df, z = 1:3, .before = "y")               # insert before named col
add_column(df, z = 1:3, .after = 1)                  # insert by index
add_column(df, source = "manual")                    # scalar recycles
df |> add_column(z = 1:3)                            # pipe-friendly
add_column(df, x = 1:3, .name_repair = "unique")     # repair on conflict
Need explanation? Read on for examples and pitfalls.
What add_column() does in one sentence
add_column() returns a new tibble with extra columns appended or inserted. You pass the source tibble plus one name-value pair per column to create. Each value must be a vector of length nrow(.data) or a scalar that recycles to fill the column. The original tibble is not modified; assignment back is required to persist the change.
The function lives in the tibble package and is the column-axis counterpart of add_row(). Where dplyr::mutate() is best for columns derived from existing ones, add_column() is best for columns whose values you supply directly as vectors. It offers position control through .before and .after; mutate() has accepted the same two arguments since dplyr 1.0.0, so position control alone is not a reason to prefer one over the other.
Syntax
add_column() takes the tibble first, then name-value pairs, then optional position arguments. Vectors must match nrow(.data) exactly; scalars recycle to fill every row.
The full signature is:
add_column(.data, ..., .before = NULL, .after = NULL, .name_repair = "check_unique")
Arguments:
- .data: the source tibble or data frame.
- ...: name-value pairs. Names become new column names; values are vectors of length nrow(.data) or scalars.
- .before: 1-based column index or column name. New columns are inserted before this position.
- .after: 1-based column index or column name. New columns are inserted after this position.
- .name_repair: strategy for fixing duplicate or non-syntactic names. Default "check_unique" rejects duplicates outright.
If both .before and .after are NULL (the default), new columns append at the right.
Tip: pipe add_column() for readable column-append chains. Because add_column() returns a new tibble, it composes with the native pipe: df |> add_column(z = 1:3) |> add_column(w = letters[1:3]). The pattern reads top-to-bottom in the order columns arrive, which is easier to scan than nested calls.
Six common patterns
1. Append a column at the end
With no position argument, the new column goes to the right of every existing column. Column order in subsequent reads follows the order of insertion.
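A minimal sketch of this pattern follows. The tibble df and its columns x and y are assumptions made for illustration, since the article's own example data is not shown here.

```r
library(tibble)

# Hypothetical example tibble; the column names x and y are assumptions.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# No .before or .after, so z lands to the right of every existing column.
df_z <- add_column(df, z = c(10, 20, 30))

names(df_z)  # "x" "y" "z"
names(df)    # "x" "y" (the source tibble is untouched)
```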
2. Append multiple columns in one call
Each pair becomes one column. The new columns appear in the order you write them, side by side after the original columns.
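As a rough illustration, assuming the same hypothetical three-row tibble df, two name-value pairs in one call might look like this:

```r
library(tibble)

# Hypothetical example tibble; names and values are assumptions.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# Two name-value pairs create two columns, in the order written.
df_zw <- add_column(df, z = 4:6, w = c(TRUE, FALSE, TRUE))

names(df_zw)  # "x" "y" "z" "w"
```

Swapping the two pairs in the call would swap the order of z and w in the result.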
3. Insert before or after a named column
You can reference the insertion point by name ("y") or by 1-based index (1). Names survive column reordering, which makes the call self-documenting and refactor-safe.
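The sketch below shows both ways of naming the insertion point on an assumed two-column tibble. For this particular df, the two calls happen to produce the same column order.

```r
library(tibble)

# Hypothetical example tibble; the column names x and y are assumptions.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# Reference the insertion point by column name ...
by_name <- add_column(df, z = 4:6, .before = "y")

# ... or by 1-based position.
by_index <- add_column(df, z = 4:6, .after = 1)

names(by_name)   # "x" "z" "y"
names(by_index)  # "x" "z" "y"
```

If df later gains a leading column, the name-based call keeps placing z next to y, while the index-based call would place it after whatever column is then first.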
4. Scalar values recycle across rows
A length-1 value recycles to match nrow(df). Lengths other than 1 or nrow(df) are rejected rather than recycled silently.
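Here is a small sketch of both halves of that rule, again on an assumed three-row tibble. The error is caught with tryCatch() and replaced by a placeholder string, so the exact wording of tibble's message is not reproduced here.

```r
library(tibble)

# Hypothetical example tibble with three rows.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# A length-1 value is repeated for every row.
tagged <- add_column(df, source = "manual")
tagged$source  # "manual" "manual" "manual"

# A length-2 value is neither 1 nor nrow(df), so the call errors.
result <- tryCatch(
  add_column(df, z = 1:2),
  error = function(e) "rejected: wrong length"
)
result  # "rejected: wrong length"
```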
5. Insert at the start with .before = 1
.before = 1 puts the new column at position one. .after = ncol(df) is identical to the default append behavior.
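A short sketch of both calls, using an assumed tibble df and a made-up id column:

```r
library(tibble)

# Hypothetical example tibble; the id values below are made up.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# .before = 1 makes the new column the first one.
with_id <- add_column(df, id = c(101L, 102L, 103L), .before = 1)
names(with_id)  # "id" "x" "y"

# .after = ncol(df) gives the same column order as the default append.
explicit_end <- add_column(df, z = 4:6, .after = ncol(df))
default_end  <- add_column(df, z = 4:6)
names(explicit_end)  # "x" "y" "z"
names(default_end)   # "x" "y" "z"
```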
6. Handle name conflicts with .name_repair
Default "check_unique" rejects duplicates outright. Switch to "unique" (or "universal") when you need the call to succeed by suffixing the duplicate name.
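The following sketch contrasts the default with "unique" repair on an assumed tibble that already has a column x. The error is caught and replaced by a placeholder string rather than quoting tibble's message. Expect the repaired call to print a note about the new names.

```r
library(tibble)

# Hypothetical example tibble that already contains a column named x.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# Default "check_unique": a second column named x is an error.
default_result <- tryCatch(
  add_column(df, x = 4:6),
  error = function(e) "rejected: duplicate name"
)
default_result  # "rejected: duplicate name"

# "unique" repair lets the call succeed by renaming the clashing columns.
repaired <- add_column(df, x = 4:6, .name_repair = "unique")
names(repaired)  # inspect the repaired names
```

Checking names(repaired) right after the call is worthwhile, because later code must refer to the repaired names rather than plain x.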
add_column() vs mutate() vs bind_cols() vs cbind()
Pick add_column() for inline columns supplied as literal vectors. Each of the four options solves a different problem; choosing the wrong one is the most common confusion in this corner of the tidyverse.
| Behavior | add_column() | dplyr::mutate() | dplyr::bind_cols() | cbind() |
|---|---|---|---|---|
| Source | tibble | dplyr | dplyr | base R |
| Best for | Inline literal vectors | Computed from existing cols | Joining two tibbles by column | Legacy data frames |
| Position control | .before, .after | .before, .after | Append only | Append only |
| Length rule | nrow or 1 | nrow or 1 (per group) | nrow must match | Recycles silently |
| Duplicate names | Errors by default | Overwrites | Auto-repairs | Allows duplicates |
| Returns | tibble | tibble | tibble | data.frame |
When to use which:
- Use add_column() to splice in columns you supply as inline vectors, especially when position matters.
- Use mutate() to compute new columns from existing ones; do not use add_column() for derived values.
- Use bind_cols() to join two tibbles side-by-side when both already contain complete columns.
- Use cbind() only in base R workflows where tidyverse is not loaded.
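To make the comparison concrete, here is a side-by-side sketch of the four approaches adding the same column z. It assumes dplyr 1.0.0 or later is installed (for the .after argument of mutate()) and uses a hypothetical tibble df.

```r
library(tibble)
library(dplyr)

# Hypothetical example tibble; names and values are assumptions.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# add_column(): values supplied directly, placed after x.
via_add_column <- add_column(df, z = c(10, 20, 30), .after = "x")

# mutate(): values computed from an existing column.
via_mutate <- mutate(df, z = x * 10, .after = x)

# bind_cols(): glue a second tibble onto the right-hand side.
via_bind_cols <- bind_cols(df, tibble(z = c(10, 20, 30)))

# cbind(): base R, shown here on a plain data frame.
via_cbind <- cbind(as.data.frame(df), z = c(10, 20, 30))

names(via_add_column)  # "x" "z" "y"
names(via_mutate)      # "x" "z" "y"
names(via_bind_cols)   # "x" "y" "z"
class(via_cbind)       # "data.frame"
```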
add_column() returns a new tibble; it does not mutate. Forgetting to assign the result back is the single most common mistake. The original tibble stays unchanged after add_column(df, z = 1:3). To persist the column, write df <- add_column(df, z = 1:3) or pipe into a chain that rebinds at the end. This is the standard tidyverse pattern: pure functions, explicit assignment.
Common pitfalls
Pitfall 1: vector length does not match nrow(.data). Every value must be length nrow(df) or length 1. Any other length errors.
Pitfall 2: trying to add an existing column name. add_column() refuses to overwrite. Use mutate() to replace a column, or pass .name_repair = "unique" to keep both columns under repaired names with positional suffixes.
.before and .after cannot both be set in one call. Passing both arguments triggers an error. Each accepts either an integer index or a column name; pick one slot per call. Indices reference the source tibble at the moment of the call, so chained inserts shift positions as columns are added.
Pitfall 3: forgetting to assign back. add_column() returns a new tibble. The original is untouched.
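The assignment pitfall and the two position rules can be sketched together as follows. The tibble df and the names df_kept and chained are hypothetical, and the error is caught and replaced by a placeholder string instead of quoting tibble's message.

```r
library(tibble)

# Hypothetical example tibble.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# The result is discarded here, so df does not gain a z column.
add_column(df, z = 4:6)
"z" %in% names(df)  # FALSE

# Assigning the result is what persists the new column.
df_kept <- add_column(df, z = 4:6)
"z" %in% names(df_kept)  # TRUE

# Setting both .before and .after in one call is an error.
both <- tryCatch(
  add_column(df, z = 4:6, .before = 1, .after = 2),
  error = function(e) "rejected: both positions set"
)
both  # "rejected: both positions set"

# In a chain, each index refers to the tibble as it is at that step:
# after a is inserted first, .after = 1 means "after a", not "after x".
chained <- df |>
  add_column(a = 7:9, .before = 1) |>
  add_column(b = 10:12, .after = 1)
names(chained)  # "a" "b" "x" "y"
```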
Try it yourself
Try it: Convert iris to a tibble, then add a Species_Code column that holds the integer codes of the Species factor (1, 2, 3). Insert the new column directly after Species. Save the result to ex_iris.
Explanation: as_tibble() converts iris from a base data frame to a tibble. as.integer() on the Species factor returns its underlying numeric codes. .after = "Species" places the new column directly to the right of Species, keeping related fields adjacent.
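One possible solution matching that explanation is sketched below. It is an illustration rather than the article's original answer code, and it relies on the iris dataset that ships with R.

```r
library(tibble)

# Convert iris to a tibble, then add the integer codes of the Species factor
# directly after the Species column.
ex_iris <- as_tibble(iris) |>
  add_column(Species_Code = as.integer(iris$Species), .after = "Species")

names(ex_iris)
# "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species" "Species_Code"

unique(ex_iris$Species_Code)  # 1 2 3
```

Because Species is the last column of iris, .after = "Species" coincides with the default append position here; naming the column keeps the intent explicit if the column order ever changes.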
Related tibble functions
Alongside add_column(), look at:
- add_row(): extend a tibble with new rows instead of new columns.
- tibble() and tribble(): build a tibble from scratch, column-by-column or row-by-row.
- as_tibble(): convert a data frame, list, or matrix into a tibble.
- dplyr::mutate(): create or replace columns computed from existing ones.
- dplyr::bind_cols(): combine two tibbles side-by-side when both contain full columns.
- dplyr::relocate(): move existing columns to new positions without changing their values.
For the full reference, see the official tibble documentation.
FAQ
How do you add a column to a tibble in R?
Call add_column() from the tibble package with the source tibble as the first argument, then one name-value pair per column to create. Example: add_column(df, z = 1:3) appends a column z with values 1, 2, 3. Each value must be length nrow(df) or length 1. The function returns a new tibble, so assign the result back: df <- add_column(df, z = 1:3) to persist the change.
What is the difference between add_column() and mutate()?
add_column() adds a column whose values you supply as a literal vector: add_column(df, z = c(10, 20, 30)). dplyr::mutate() creates or replaces columns whose values are computed from existing ones: mutate(df, z = x + 1). Use add_column() for hand-written or externally sourced vectors and when you need .before/.after position control. Use mutate() for expressions referring to other columns, group-wise computation, and overwriting existing columns.
Can add_column() insert at a specific position?
Yes. Pass .before = N to insert to the left of column N, or .after = N to insert to the right. N can be a 1-based integer index or a column name. Example: add_column(df, z = 1:3, .before = "y") places z directly before column y. Without either argument, new columns append at the right. You cannot pass both .before and .after in the same call, and indices reference the source tibble at the moment of the call.
Why does add_column() error with "column names must not be duplicated"?
add_column() refuses to overwrite an existing column by default. If .data already contains a column named x, passing x = ... raises that error. To replace an existing column, use dplyr::mutate(df, x = ...). To keep both columns under different names, pass .name_repair = "unique", which suffixes the duplicate to x...1, x...3, and so on, preserving values without collision.
Does add_column() modify the original tibble?
No. add_column() returns a new tibble; the original is unchanged. This is the standard tidyverse pattern: functions are pure, and changes only persist when you assign the result back. Write df <- add_column(df, z = 1:3) or chain through the pipe: df <- df |> add_column(z = 1:3) |> add_column(w = 4:6). Treating add_column() like an in-place spreadsheet edit is the most common bug.