dplyr case_when(): Replace Nested if_else with Clean Conditional Logic
case_when() evaluates conditions top to bottom and returns the value for the first TRUE match — like SQL's CASE WHEN. It replaces ugly nested ifelse() chains with clean, readable code.
The Problem: Nested ifelse()
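As a minimal sketch of the problem, assume a hypothetical `score` vector (not from any real dataset) that needs letter grades. With base `ifelse()`, each extra category adds another level of nesting:

```r
# Hypothetical scores, used only for illustration
score <- c(95, 82, 71, 64, 40)

# Each additional grade adds another nested ifelse() and another closing parenthesis
grade <- ifelse(score >= 90, "A",
           ifelse(score >= 80, "B",
             ifelse(score >= 70, "C",
               ifelse(score >= 60, "D", "F"))))

grade
# "A" "B" "C" "D" "F"
```

The logic is correct, but the closing parentheses pile up and the cutoffs are hard to scan.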
case_when: The Clean Way
Conditions are evaluated top to bottom. The first TRUE match wins.
TRUE at the end is the catch-all default, like else in if/else.
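A short sketch of this pattern, assuming a hypothetical `score` vector and illustrative grade cutoffs:

```r
library(dplyr)

# Hypothetical scores, used only for illustration
score <- c(95, 82, 71, 64, 40)

# One condition per line; the first condition that is TRUE decides the value
grade <- case_when(
  score >= 90 ~ "A",
  score >= 80 ~ "B",
  score >= 70 ~ "C",
  score >= 60 ~ "D",
  TRUE        ~ "F"   # catch-all for everything below 60
)

grade
# "A" "B" "C" "D" "F"
```

Because order matters, put the most specific conditions first. A score of 95 also satisfies `score >= 80`, but it never reaches that line.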
Multiple Columns in Conditions
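Each condition can combine several columns with `&` and `|`. The sketch below assumes the built-in `mtcars` dataset; the `profile` column and its thresholds are made up for illustration:

```r
library(dplyr)

cars <- mtcars %>%
  mutate(
    profile = case_when(
      mpg > 25 & hp < 100 ~ "efficient",  # high mileage and low horsepower
      mpg < 15 & hp > 200 ~ "muscle",     # low mileage and high horsepower
      cyl == 4            ~ "compact",    # remaining four-cylinder cars
      TRUE                ~ "other"
    )
  )

head(cars[, c("mpg", "hp", "cyl", "profile")])
```

Inside `mutate()`, the conditions refer to columns by bare name, so a single `case_when()` can mix as many columns as the rule needs.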
Handling NA
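A comparison such as `NA > 0` evaluates to `NA`, and `case_when()` treats that as "no match", so missing values fall through to the catch-all unless you test for them first. A small sketch with a hypothetical vector `x`:

```r
library(dplyr)

x <- c(5, -2, NA, 0)

# Check for NA first so missing values get their own label
checked <- case_when(
  is.na(x) ~ "missing",
  x > 0    ~ "positive",
  x < 0    ~ "negative",
  TRUE     ~ "zero"
)
checked
# "positive" "negative" "missing" "zero"

# Without the is.na() check, the NA falls through to the catch-all
unchecked <- case_when(
  x > 0 ~ "positive",
  x < 0 ~ "negative",
  TRUE  ~ "zero"
)
unchecked
# "positive" "negative" "zero" "zero"
```

Putting `is.na()` at the top is usually the safest habit, because a missing value silently labelled "zero" is hard to spot later.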
case_when vs ifelse vs if_else
| Feature | ifelse() | dplyr::if_else() | case_when() |
|---|---|---|---|
| Conditions | 1 | 1 | Many |
| Type safety | No | Yes (strict) | Yes |
| Readability (nested) | Poor | OK | Excellent |
| NA handling | Returns NA type | Strict | Explicit |
| Best for | Simple binary | Type-safe binary | Multi-level |
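
The type-safety row is the easiest one to see in action. The sketch below uses a hypothetical vector `x` and deliberately mixes a character and a numeric result; the error text returned by the handler is a placeholder, not dplyr's actual message:

```r
library(dplyr)

x <- c(-1, 0, 2)

# base ifelse(): mixed types are silently coerced to character
loose <- ifelse(x > 0, "positive", 0)
loose
# "0" "0" "positive"

# dplyr::if_else(): the same mix is rejected with an error
strict <- tryCatch(
  if_else(x > 0, "positive", 0),
  error = function(e) "error: incompatible types"
)
strict
# "error: incompatible types"

# case_when(): several conditions in one flat call
multi <- case_when(
  x > 0  ~ "positive",
  x == 0 ~ "zero",
  TRUE   ~ "negative"
)
multi
# "negative" "zero" "positive"
```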
Practical Examples
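One common use is binning a numeric column into labelled groups. The sketch below assumes a small made-up `people` table; the names, ages, and cutoffs are hypothetical:

```r
library(dplyr)

# Made-up data, used only for illustration
people <- tibble(
  name = c("Ana", "Ben", "Cy", "Dee"),
  age  = c(8, 17, 34, 70)
)

people <- people %>%
  mutate(
    age_group = case_when(
      age < 13             ~ "child",
      between(age, 13, 19) ~ "teen",    # between() is inclusive on both ends
      age < 65             ~ "adult",
      TRUE                 ~ "senior"
    )
  )

people$age_group
# "child" "teen" "adult" "senior"
```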
Practice Exercises
Exercise 1: Categorize Cars
Create a "decade" column based on mpg: above 30 is "Future", 20 to 30 is "Modern", 15 to just under 20 is "Standard", and below 15 is "Vintage".
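One possible solution, assuming the exercise refers to the built-in `mtcars` dataset and treating 20 to 30 inclusive as "Modern" and 15 up to (but not including) 20 as "Standard":

```r
library(dplyr)

cars <- mtcars %>%
  mutate(
    decade = case_when(
      mpg > 30  ~ "Future",
      mpg >= 20 ~ "Modern",    # reached only when mpg <= 30
      mpg >= 15 ~ "Standard",  # reached only when mpg < 20
      TRUE      ~ "Vintage"    # everything below 15
    )
  )

count(cars, decade)
```

Because the first TRUE match wins, each condition only needs its lower bound; the upper bound is implied by the lines above it.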
FAQ
What does TRUE ~ "default" mean?
TRUE is the catch-all — it matches any row not caught by earlier conditions. Like else in if/else. Always put it last. If you omit it, unmatched rows get NA.
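A quick sketch of the difference, using a hypothetical vector `x`:

```r
library(dplyr)

x <- c(-3, 0, 4)

# With a catch-all, every element gets a label
with_default <- case_when(
  x > 0 ~ "positive",
  x < 0 ~ "negative",
  TRUE  ~ "zero"
)
with_default
# "negative" "zero" "positive"

# Without it, the unmatched element (0) becomes NA
without_default <- case_when(
  x > 0 ~ "positive",
  x < 0 ~ "negative"
)
without_default
# "negative" NA "positive"
```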
Can I return different types in different conditions?
No. All right-hand-side values must resolve to a single type. case_when(x > 0 ~ "yes", TRUE ~ 0) fails because "yes" (character) and 0 (numeric) conflict. Use case_when(x > 0 ~ 1, TRUE ~ 0), or convert one side so both match, for example as.character(0).
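The sketch below reproduces the conflict on a hypothetical vector `x` and shows two ways around it. The string returned by the error handler is a placeholder, not dplyr's actual message:

```r
library(dplyr)

x <- c(-1, 2)

# Mixing character and numeric results raises an error
mixed <- tryCatch(
  case_when(x > 0 ~ "yes", TRUE ~ 0),
  error = function(e) "type error"
)
mixed
# "type error"

# Fix 1: make every result numeric
all_numeric <- case_when(x > 0 ~ 1, TRUE ~ 0)
all_numeric
# 0 1

# Fix 2: convert the odd one out so every result is character
all_character <- case_when(x > 0 ~ "yes", TRUE ~ as.character(0))
all_character
# "0" "yes"
```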
Is case_when vectorized?
Yes. It operates on entire vectors at once — no row-by-row loop. This makes it fast, even on millions of rows.
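As a rough illustration, the sketch below labels one million random values in a single call. The vector `x` and the thresholds are arbitrary, and no timing is claimed here:

```r
library(dplyr)

set.seed(1)
x <- runif(1e6, min = -1, max = 1)  # one million arbitrary values

# One call handles the whole vector; no for loop or sapply() needed
labels <- case_when(
  x > 0.5  ~ "high",
  x < -0.5 ~ "low",
  TRUE     ~ "mid"
)

length(labels)
# 1000000
```

Each condition is evaluated once over the full vector, which is why the result always has the same length as the input.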
What's Next?
- dplyr mutate & rename — the parent tutorial
- dplyr across — apply case_when logic across columns
- dplyr filter & select — filter before transforming