tidyr fill() in R: Forward-Fill Missing Values

The fill() function in tidyr fills NA values in a column with the most recent non-NA value above (or below). It is the "last observation carried forward" (LOCF) operation, common in time-series data.

By Selva Prabhakaran · Published May 12, 2026 · Last updated May 12, 2026

⚡ Quick Answer

df |> fill(col)                              # forward-fill (default down)
df |> fill(col, .direction = "up")           # backward-fill
df |> fill(col, .direction = "downup")       # both directions
df |> fill(c(col1, col2))                    # multiple columns
df |> group_by(g) |> fill(col)               # per-group fill

Need explanation? Read on for examples and pitfalls.

What fill() does in one sentence

fill(data, ..., .direction = "down") replaces NAs in the named columns with the nearest non-NA value above them or, with .direction = "up", below them. The default direction is "down" (top to bottom).

The classic use case: time-series with sparse observations where each NA should inherit the previous value.

Syntax

fill(data, ..., .direction = c("down", "up", "downup", "updown"))

data is the data frame, ... selects the columns to fill, and .direction sets the fill direction (default "down").

R: Forward-fill missing values

library(tidyr)
library(dplyr)

df <- tibble(id = 1:6, status = c("a", NA, NA, "b", NA, "c"))

df |> fill(status)
#>   id status
#> 1  1 a
#> 2  2 a       <-- filled from row 1
#> 3  3 a
#> 4  4 b
#> 5  5 b       <-- filled from row 4
#> 6  6 c

Tip

Use fill to carry values forward in time-series data and to repair spreadsheet-style "category headers" that span multiple rows. The latter is common in Excel exports, where the category is printed only once per group.

Five common patterns

1. Forward fill (default)

R: Down, each NA inherits the previous value

df |> fill(status)

2. Backward fill

R: Up, each NA inherits the next value

df |> fill(status, .direction = "up")

3. Both directions

R: Fill from above, then any leading NAs from below

df |> fill(status, .direction = "downup")
#> First fills down; then any remaining NAs are filled up.
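
To make the four directions concrete, here is a small sketch on a made-up column (the gaps tibble is an assumption for illustration, not data from this page) with NAs at the start, in the middle, and at the end:

```r
library(tidyr)
library(dplyr)

# Toy data (made up): NAs at the start, in the middle, and at the end
gaps <- tibble(x = c(NA, "a", NA, "b", NA))

down   <- fill(gaps, x)                         # NA a  a  b  b   leading NA remains
up     <- fill(gaps, x, .direction = "up")      # a  a  b  b  NA  trailing NA remains
downup <- fill(gaps, x, .direction = "downup")  # a  a  a  b  b
updown <- fill(gaps, x, .direction = "updown")  # a  a  b  b  b
```

"downup" and "updown" differ only in which neighbour wins for interior gaps: row 3 takes "a" under "downup" and "b" under "updown".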

4. Multiple columns

R: Fill several at once

df |> fill(col1, col2, col3)
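
Because the columns are chosen with tidyselect, ranges and helpers such as everything() should also work. A brief sketch, assuming a made-up wide tibble with hypothetical column names:

```r
library(tidyr)
library(dplyr)

# Toy data (made up); column names are hypothetical
wide <- tibble(
  id    = 1:3,
  temp  = c(20.5, NA, NA),
  humid = c(40, NA, 45),
  note  = c("ok", NA, NA)
)

by_range <- wide |> fill(temp:humid)    # fills temp and humid, leaves note alone
all_cols <- wide |> fill(everything())  # fills every column, including note
```

Naming the columns explicitly is usually the safer choice, since everything() will also carry forward values in columns where an NA really means "missing".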

5. Per-group

R: Fill within each group

df |> group_by(user) |> fill(status) |> ungroup()

Crucial for time-series with multiple subjects: fill should reset at each user boundary.
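
The difference shows up when a group starts with an NA. A minimal sketch on made-up data (the visits tibble is an assumption for illustration):

```r
library(tidyr)
library(dplyr)

# Toy data (made up): user "b" starts with an unknown status
visits <- tibble(
  user   = c("a", "a", "b", "b"),
  status = c("active", NA, NA, "paused")
)

ungrouped <- visits |> fill(status)
grouped   <- visits |> group_by(user) |> fill(status) |> ungroup()

ungrouped$status  # "active" "active" "active" "paused"  row 3 took user a's value
grouped$status    # "active" "active" NA       "paused"  row 3 stays NA
```

In the grouped version the first row of user "b" stays NA, which is the honest answer: nothing is known about that user yet.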

Key Insight

fill() is the tidyverse name for LOCF (last observation carried forward). Common in time-series, sensor data, and "spreadsheet category headers" that span multiple rows. For per-subject time-series, ALWAYS group_by before fill.

fill() vs replace_na() vs coalesce()

Four NA-handling functions across tidyr and dplyr.

Function	Behavior	Best for
`tidyr::fill()`	Carry forward / backward	Time-series LOCF
`tidyr::replace_na()`	Replace NA with constant	Default value
`dplyr::coalesce()`	First non-NA across columns	Multi-source fallback
`dplyr::na_if()`	Replace specific value with NA	Sentinel cleanup

When to use which:

fill for sequential / time-series fill.
replace_na for "if NA then X" (constant).
coalesce for multi-source fallback.
na_if for sentinel values.
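
The four functions can be combined in one pipeline. The sketch below uses made-up readings and assumes -99 is a sentinel for "not recorded"; the column names are hypothetical:

```r
library(tidyr)
library(dplyr)

# Toy data (made up); -99 is an assumed sentinel for "not recorded"
readings <- tibble(
  primary = c(10, NA, -99, NA),
  backup  = c(NA, 20, 30, NA)
)

result <- readings |>
  mutate(
    primary  = na_if(primary, -99),        # sentinel -> NA
    combined = coalesce(primary, backup),  # first non-NA across the two columns
    constant = replace_na(combined, 0)     # remaining NA -> constant 0
  ) |>
  fill(combined)                           # remaining NA -> previous value

result$combined  # 10 20 30 30
result$constant  # 10 20 30  0
```

The last row shows the semantic difference: replace_na() asserts the value is 0, while fill() asserts it is unchanged from the previous row.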

A practical workflow

The "carry forward state" pattern is fill's main use.

events |>
  arrange(user, timestamp) |>
  group_by(user) |>
  fill(state) |>
  ungroup()

Per user, in chronological order, NA states inherit the previous known state. Without fill, every row between two state changes holds NA instead of the state that was in effect at that time.
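
A runnable version of this pattern, assuming a made-up event log in which state is recorded only when it changes (the events tibble and its values are illustrative):

```r
library(tidyr)
library(dplyr)

# Toy event log (made up): rows arrive unsorted, state is sparse
events <- tibble(
  user      = c("u2", "u1", "u1", "u2", "u1"),
  timestamp = c(1, 1, 2, 2, 3),
  state     = c("trial", "free", NA, NA, "paid")
)

states <- events |>
  arrange(user, timestamp) |>  # sort first: fill depends on row order
  group_by(user) |>
  fill(state) |>
  ungroup()

states$state  # "free" "free" "paid" "trial" "trial"
```

The arrange() step matters as much as the grouping: fill works on row order, so unsorted timestamps would carry forward the wrong state.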

For monthly reporting where the category column is only on the first row of each group:

sales |>
  fill(category) |>
  group_by(category) |>
  summarise(total = sum(amount))
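
For a concrete run, here is the same pipeline on a made-up sales table shaped like a spreadsheet export, where the category appears only on the first row of each block:

```r
library(tidyr)
library(dplyr)

# Toy data (made up): category is printed once per block, as in many Excel reports
sales <- tibble(
  category = c("Fruit", NA, NA, "Dairy", NA),
  item     = c("apple", "pear", "plum", "milk", "cheese"),
  amount   = c(10, 5, 7, 4, 9)
)

totals <- sales |>
  fill(category) |>
  group_by(category) |>
  summarise(total = sum(amount))

totals$category  # "Dairy" "Fruit"
totals$total     # 13 22
```

Skipping the fill step would lump the three unlabeled rows into a single NA category instead of crediting them to Fruit and Dairy.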

Common pitfalls

Pitfall 1: fill across group boundaries. Without group_by, fill carries forward ACROSS groups. Always group_by before fill on grouped data.

Pitfall 2: leading NAs. The default direction "down" cannot fill NAs at the top of a column (or group), because there is no earlier value to carry forward. Use .direction = "downup" to handle leading NAs.

Warning

fill() does NOT verify whether filling makes semantic sense. Carrying forward a stale value may be wrong if the data is "truly missing" (not "same as previous"). Validate with domain knowledge.
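
One way to keep filled values auditable is to record which rows were missing before filling. This is a sketch of a common habit rather than a feature of fill(); the sensor data and the reading_was_na column name are made up:

```r
library(tidyr)
library(dplyr)

# Toy sensor data (made up)
sensor <- tibble(
  minute  = 1:5,
  reading = c(3.2, NA, NA, 4.1, NA)
)

flagged <- sensor |>
  mutate(reading_was_na = is.na(reading)) |>  # remember which rows were missing
  fill(reading)

flagged$reading         # 3.2 3.2 3.2 4.1 4.1
flagged$reading_was_na  # FALSE TRUE TRUE FALSE TRUE
```

With the flag in place, downstream code can exclude or down-weight carried-forward rows when the domain calls for it.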

Try it yourself

Try it: Forward-fill the name column in a sparse dataset, grouped by user. Save to ex_filled.

R: Your turn, per-user forward fill

df <- tibble(
  user = c("a", "a", "a", "b", "b"),
  step = 1:5,
  name = c("Alice", NA, NA, "Bob", NA)
)

ex_filled <- df |>
  # your code here

ex_filled
#> Expected: name filled within each user

Click to reveal solution

R: Solution

ex_filled <- df |>
  group_by(user) |>
  fill(name) |>
  ungroup()

ex_filled
#>   user step name
#> 1 a       1 Alice
#> 2 a       2 Alice
#> 3 a       3 Alice
#> 4 b       4 Bob
#> 5 b       5 Bob

Explanation: group_by ensures fill resets at each user boundary; name is carried forward within each user.

Related tidyr / dplyr functions

After mastering fill, look at:

tidyr::replace_na(): replace NA with constant
dplyr::coalesce(): multi-source NA fill
tidyr::complete(): fill missing row combinations
dplyr::lag() / lead(): row-shift comparisons
dplyr::case_when(): conditional fill logic

For Excel-style "merged cell" data, fill is the standard import-cleanup tool.

FAQ

What does fill do in tidyr?

fill(data, col) replaces NA values in col with the most recent non-NA value above (default direction "down"). Used for last-observation-carried-forward (LOCF) imputation.

How do I fill NAs from below in tidyr?

Pass .direction = "up": fill(df, col, .direction = "up"). Each NA inherits the next non-NA value.

Should I fill before or after group_by?

After. group_by(g) |> fill(col) fills WITHIN each group. Without grouping, fill crosses boundaries.

What is the difference between fill and replace_na?

fill carries forward (or backward) the most recent non-NA value. replace_na uses a CONSTANT value. Different semantics for missing data.

Does fill modify the data in place?

No. It returns a new data frame. Always assign the result.
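
A quick check of this behaviour, using a made-up two-row tibble (the scores name is illustrative):

```r
library(tidyr)
library(dplyr)

# Toy data (made up)
scores <- tibble(x = c(1, NA))

invisible(fill(scores, x))    # result discarded; scores itself is unchanged
unchanged <- anyNA(scores$x)  # TRUE: the NA is still there

scores <- fill(scores, x)     # assign the result to keep it
kept <- anyNA(scores$x)       # FALSE: the NA is now filled
```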
