forcats fct_other() in R: Collapse Levels Into Other
forcats fct_other() in R collapses factor levels into a single "Other" group, with the levels chosen by name through either a keep list or a drop list. It is the manual counterpart to the automatic fct_lump() family.
fct_other(f, keep = c("a", "b")) # keep these, rest -> Other
fct_other(f, drop = c("c", "d")) # drop these to Other, rest kept
fct_other(f, keep = top, other_level = "X") # custom Other label
fct_other(f, keep = levels(f)[1:3]) # keep the first 3 levels
df |> mutate(g = fct_other(g, keep = top)) # inside a dplyr pipeline
fct_count(fct_other(f, keep = top)) # tally with the Other bucket
Need explanation? Read on for examples and pitfalls.
What fct_other() does
forcats fct_other() collapses unwanted factor levels into one bucket. You hand it a factor and tell it which levels matter, and every other level is rewritten to a single catch-all level named "Other". The result stays a factor, so it drops straight into plots, models, and dplyr summaries.
Unlike the fct_lump() family, fct_other() never guesses. You decide which levels survive by name. That makes it the right tool when domain knowledge, not frequency, dictates the grouping.
Syntax
fct_other() takes one factor and exactly one of keep or drop. The full signature is short:
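The documented forcats signature is fct_other(f, keep, drop, other_level = "Other"). As a quick check against whichever forcats version is installed, the sketch below prints the function's argument list; it assumes only that forcats is available.

```r
library(forcats)

# Print the argument list of the installed fct_other()
args(fct_other)

# The argument names, in the order they are declared
arg_names <- names(formals(fct_other))
arg_names
```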
The arguments behave like this:
| Argument | What it does |
|---|---|
| f | The factor (or character vector) to modify. |
| keep | Character vector of level names to keep. Everything else becomes other_level. |
| drop | Character vector of level names to push into other_level. Everything else is kept. |
| other_level | The label for the catch-all level. Defaults to "Other". |
You must supply exactly one of keep or drop; passing both, or neither, is an error. The other_level is always appended last in the level ordering, which keeps "Other" out of the way when you plot or sort.
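To illustrate the ordering behavior, here is a small sketch on a made-up factor (the object names and values are illustrative assumptions). It contrasts fct_other() with overwriting levels() directly in base R, where the catch-all level ends up wherever the first collapsed level happened to sit.

```r
library(forcats)

# Made-up factor whose first level ("x") is one we want to collapse
f <- factor(c("x", "a", "y", "b", "a"), levels = c("x", "a", "y", "b"))
keep <- c("a", "b")

# forcats: collapsed levels become "Other", which is moved to the end
via_forcats <- fct_other(f, keep = keep)
levels(via_forcats)   # "a" "b" "Other"

# Base R on a copy: duplicate names are merged, but "Other" stays in front
via_base <- f
levels(via_base)[!levels(via_base) %in% keep] <- "Other"
levels(via_base)      # "Other" "a" "b"

# fct_other() returned a new factor; f itself is unchanged
levels(f)
```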
fct_other() replaces a verbose levels(f)[!levels(f) %in% keep] <- "Other" assignment. The forcats version is one call, handles the level ordering, and never mutates the original object.
Examples by use case
Run these examples top to bottom. Each block builds on a shared session, so objects created earlier stay available in the ones that follow.
Keep only the levels you want
Pass a keep vector to whitelist levels. Any level not in the list is rewritten to "Other".
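A minimal sketch of the keep case, using a small made-up factor with levels a through e (the values are illustrative assumptions):

```r
library(forcats)

# Made-up factor with five levels: a, b, c, d, e
f <- factor(c("a", "b", "c", "d", "e", "a", "b", "a"))

# Whitelist a and b; every other level is rewritten to "Other"
kept <- fct_other(f, keep = c("a", "b"))
levels(kept)      # "a" "b" "Other"

# Tally the result to confirm what landed in the bucket
fct_count(kept)
```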
Levels c, d, and e collapse into Other, while a and b keep their original labels and order.
Drop specific levels into Other
Pass a drop vector to blacklist levels. This is the inverse of keep: the named levels go to "Other" and the rest stay untouched.
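A matching sketch for the drop case, again on a made-up five-level factor (the values are illustrative assumptions):

```r
library(forcats)

# Made-up factor with five levels: a, b, c, d, e
f <- factor(c("a", "b", "c", "d", "e", "c", "a"))

# Blacklist c and d; a, b, and e keep their labels and relative order
dropped <- fct_other(f, drop = c("c", "d"))
levels(dropped)   # "a" "b" "e" "Other"
fct_count(dropped)
```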
Use drop when you only need to remove a couple of nuisance levels and keep when most levels should disappear.
Rename the Other bucket
Set other_level to give the bucket a meaningful name. A label like "Other faiths" reads better on a chart axis than a bare "Other". The example below uses gss_cat, the survey dataset bundled with forcats.
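One way this could look on gss_cat$relig is sketched below. The keep set (Protestant, Catholic, None) is an assumption chosen for illustration, and relig_small is a hypothetical object name.

```r
library(forcats)

# Assumed keep set: three large groups in gss_cat$relig.
# Note that relig already has its own "Other" level; because it is
# not in keep, it is folded into "Other faiths" as well.
relig_small <- fct_other(
  gss_cat$relig,
  keep = c("Protestant", "Catholic", "None"),
  other_level = "Other faiths"
)

levels(relig_small)
fct_count(relig_small)
```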
All twelve minor religion levels merge into the single Other faiths level with 1,990 rows.
Use fct_other() inside a dplyr pipeline
Wrap fct_other() in mutate() to clean a column in place. This is the most common production use: collapse a high-cardinality factor right before a count() or a plot.
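A sketch of that pattern on gss_cat$partyid follows. The two kept levels are an assumption made for illustration, party_counts is a hypothetical name, and the native pipe requires R 4.1 or later.

```r
library(dplyr)
library(forcats)

# Assumed keep set: the two "Strong" party levels
party_counts <- gss_cat |>
  mutate(partyid = fct_other(
    partyid,
    keep = c("Strong republican", "Strong democrat")
  )) |>
  count(partyid)   # rows follow the factor's level order

party_counts
```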
The eight remaining party levels collapse into Other, leaving a clean three-level factor for charts.
fct_other() vs fct_lump and friends
fct_other() groups by name; the fct_lump() family groups by frequency. Reach for fct_other() when you can list the levels that matter. Reach for a fct_lump_* function when you want the data to decide.
| Function | Groups by | You specify |
|---|---|---|
| fct_other() | Levels you name | Exact names to keep or drop |
| fct_lump_n() | Frequency rank | How many levels survive (n) |
| fct_lump_prop() | Share of rows | Minimum proportion to survive (prop) |
| fct_lump_min() | Raw count | Minimum count to survive (min) |
| fct_collapse() | Manual mapping | A new group name per set of levels |
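The difference between grouping by name and grouping by frequency can be sketched on a made-up factor (the values are illustrative assumptions). Here d is rare but named explicitly, so fct_other() keeps it while fct_lump_n() does not.

```r
library(forcats)

# Made-up factor: a appears 3 times, b twice, c and d once each
f <- factor(c("a", "a", "a", "b", "b", "c", "d"))

# By name: keep a and d regardless of how often they occur
by_name <- fct_other(f, keep = c("a", "d"))
levels(by_name)   # "a" "d" "Other"

# By frequency: keep the 2 most common levels, whatever they are
by_freq <- fct_lump_n(f, n = 2)
levels(by_freq)   # "a" "b" "Other"
```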
When a business rule fixes the grouping, fct_other(keep = ...) encodes it exactly. fct_lump_n() would silently change the output the moment the data shifts.
Common pitfalls
1. Supplying both keep and drop. forcats rejects this immediately. A call that passes both arguments errors with a message such as "Must supply exactly one of keep and drop"; the exact wording can differ between forcats versions.
2. Misspelling a level name. A typo in keep is not an error. The misspelled level simply fails to match and gets swept into Other, so a wrong result ships silently. Always check spelling against levels(f).
3. Reusing an existing level name. If your data already has a level called "Other" and you set other_level = "Other", the real and collapsed values merge into one level. Pick a distinct other_level when an "Other" category already exists.
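The three pitfalls above can be reproduced on a made-up factor, as in the sketch below. The fruit levels and object names are illustrative assumptions, and the error text is captured rather than hard-coded because its wording depends on the installed forcats version.

```r
library(forcats)

# Made-up factor that already contains a real "Other" level
f <- factor(c("apple", "banana", "cherry", "Other"),
            levels = c("apple", "banana", "cherry", "Other"))

# Pitfall 1: keep and drop together raise an error
both_msg <- tryCatch(
  fct_other(f, keep = "apple", drop = "banana"),
  error = function(e) conditionMessage(e)
)
both_msg

# Pitfall 2: the typo "bananna" matches nothing, so banana is collapsed
wanted <- c("apple", "bananna")
unmatched <- setdiff(wanted, levels(f))   # names that match no level
unmatched                                 # "bananna"
typo <- fct_other(f, keep = wanted)
levels(typo)                              # "apple" "Other"

# Pitfall 3: the real "Other" and the collapsed values share one level
merged <- fct_other(f, keep = c("apple", "Other"))
levels(merged)                            # "apple" "Other"

# A distinct other_level keeps the two apart
distinct <- fct_other(f, keep = c("apple", "Other"),
                      other_level = "Collapsed")
levels(distinct)                          # "apple" "Other" "Collapsed"
```

Running a setdiff() check like the one above before calling fct_other() is one cheap way to catch misspelled names early.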
Try it yourself
Try it: Collapse gss_cat$marital so only Married and Never married survive, and label the rest Not in those two. Save the result to ex_marital.
Solution
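One possible solution, written directly from the exercise statement:

```r
library(forcats)

# Keep the two named levels; label everything else "Not in those two"
ex_marital <- fct_other(
  gss_cat$marital,
  keep = c("Married", "Never married"),
  other_level = "Not in those two"
)

levels(ex_marital)
fct_count(ex_marital)
```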
Explanation: keep whitelists the two levels of interest. The remaining four marital levels collapse into the custom other_level, which forcats appends last.
Related forcats functions
These functions pair well with fct_other() when you reshape categorical data:
- fct_lump_n() keeps the n most common levels automatically.
- fct_lump_prop() lumps levels below a share of the data.
- fct_collapse() merges named levels into several new groups.
- fct_recode() renames levels without collapsing them.
- fct_count() tallies levels so you can verify the result.
For the wider picture, see the guide to categorical data in R and the official forcats reference.
FAQ
What is the difference between fct_other() and fct_lump()? fct_other() collapses levels you name explicitly through keep or drop. The fct_lump() family collapses levels by frequency, such as the rarest ones or those below a proportion. Use fct_other() when a business rule fixes the grouping, and fct_lump_* when you want the data's distribution to drive it. Both return a factor with an Other level appended last.
Can fct_other() use both keep and drop at the same time? No. forcats requires exactly one of keep or drop per call and raises an error if you pass both or neither. The two arguments are complementary views of the same operation: keep whitelists levels, drop blacklists them. If you need both behaviors, run fct_other() twice or restructure the level list so a single argument covers it.
How do I rename the "Other" category in fct_other()? Set the other_level argument to any string, for example fct_other(f, keep = top, other_level = "All other"). The label appears as a normal factor level and sorts last by default. A descriptive label such as "Other regions" reads better than the generic "Other" on chart axes and in tables.
Does fct_other() work on character vectors? Yes. If you pass a character vector, forcats coerces it to a factor first, then applies the collapsing. The return value is always a factor, never a character vector. To keep code explicit, wrap the input in factor() yourself before calling fct_other().
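A short sketch of the character-vector case, using made-up color values (an illustrative assumption):

```r
library(forcats)

# Plain character vector, not a factor
x <- c("red", "green", "blue", "red")

out <- fct_other(x, keep = "red")
class(out)    # "factor"
levels(out)   # "red" "Other"

# Being explicit about the coercion gives the same result
out_explicit <- fct_other(factor(x), keep = "red")
identical(out, out_explicit)
```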
How do I move "Other" to a specific position? fct_other() always appends other_level last. To reposition it, chain a reordering function such as fct_relevel() afterward, for example fct_relevel(result, "Other", after = 0) to push it to the front. This is common when a plot legend should list "Other" first or last regardless of the data.
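A minimal sketch of that chaining, on a made-up factor (the values and object names are illustrative assumptions):

```r
library(forcats)

# Made-up factor with four levels
f <- factor(c("a", "b", "c", "d"))

# fct_other() puts "Other" last
result <- fct_other(f, keep = c("a", "b"))
levels(result)   # "a" "b" "Other"

# fct_relevel() with after = 0 moves it to the front
front <- fct_relevel(result, "Other", after = 0)
levels(front)    # "Other" "a" "b"
```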