purrr discard() in R: Drop List Elements by Predicate
The purrr discard() function in R drops every list or vector element where a predicate function returns TRUE, keeping the rest. It is a concise way to strip NA values, empty strings, or any element failing a test.
    discard(x, is.na)          # drop NA elements
    discard(x, \(v) v < 0)     # drop by a condition
    discard(x, ~ .x == 0)      # formula shorthand
    discard(x, is.character)   # drop by type
    discard_at(x, "key")       # drop by name
    discard_at(x, c(2, 4))     # drop by position
    keep(x, \(v) v > 0)        # inverse: keep matches
Need explanation? Read on for examples and pitfalls.
What purrr discard() does
discard() removes the elements you do not want. It walks a list or vector, applies a predicate function to each element, and drops every element for which the predicate returns TRUE. Everything that returns FALSE survives. That makes it the natural tool for cleaning a collection: strip NA values, empty strings, or any element that fails a test.
Because discard() keeps the opposite of keep(), you reach for it whenever the rule for rejection is shorter to write than the rule for acceptance.
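A minimal sketch of the behaviour described above, using a made-up list called readings (the name and values are illustrative only):

```r
library(purrr)

# Illustrative data: a list of single values, two of them missing.
readings <- list(4.2, NA, 7.1, NA, 3.3)

# is.na() returns TRUE for the missing entries, so discard() drops them
# and keeps everything for which the predicate returned FALSE.
clean <- discard(readings, is.na)
str(clean)
```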
When is.na returns TRUE for an element, that element is dropped. Reading the predicate as "what to remove" prevents the most common logic error with this function.
discard() syntax and arguments
discard() takes two core arguments plus optional extras. The signature is discard(.x, .p, ...), and each part has a specific job.
- .x: the list or atomic vector to filter.
- .p: the predicate function applied to each element. It must return a single TRUE or FALSE.
- ...: extra arguments forwarded to .p on every call.
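The three arguments can be seen together in a short sketch. The helper above_limit and its limit parameter are hypothetical names invented for this illustration; the point is only that the extra argument travels through ... to the predicate.

```r
library(purrr)

# .x: the vector to filter (illustrative values).
scores <- c(3, 12, 7, 20)

# .p: a hypothetical two-argument predicate. `limit` is not a purrr argument;
# it is an extra argument that discard() forwards through `...`.
above_limit <- function(v, limit) v > limit

# Drops every score greater than 10.
kept <- discard(scores, above_limit, limit = 10)
kept
```

Capturing the threshold inside an anonymous function, as in \(v) v > 10, is an equally valid choice and keeps the call self-describing.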
You can supply the predicate in three styles. Pass a named function like is.na, an anonymous function with the \(x) shorthand, or a formula where .x stands for the current element.
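As a sketch, the three styles look like this on the same illustrative list (values and is_negative are names chosen for the example):

```r
library(purrr)

values <- list(5, -1, 0, 8, -3)

# Style 1: a named function.
is_negative <- function(v) v < 0
by_name <- discard(values, is_negative)

# Style 2: an anonymous function with the \(x) shorthand (R 4.1 or later).
by_lambda <- discard(values, \(v) v < 0)

# Style 3: a formula, where .x stands for the current element.
by_formula <- discard(values, ~ .x < 0)

# All three drop the negative numbers and give the same result.
identical(by_name, by_lambda) && identical(by_lambda, by_formula)
```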
The result preserves names and the input type. Discarding from a list returns a list; discarding from an atomic vector returns an atomic vector.
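A small sketch of that guarantee, using the same made-up temperatures stored once as a named atomic vector and once as a named list:

```r
library(purrr)

# Illustrative data in two container types.
temps_vec  <- c(mon = 21, tue = -4, wed = 18)
temps_list <- list(mon = 21, tue = -4, wed = 18)

# The same predicate drops the negative value from each container.
vec_out  <- discard(temps_vec,  \(t) t < 0)
list_out <- discard(temps_list, \(t) t < 0)

# The vector stays a vector, the list stays a list, and names survive.
str(vec_out)
str(list_out)
```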
discard() examples by use case
Real cleaning tasks fall into a few repeating shapes. These four examples cover the cases you will meet most often.
Drop empty strings from a list of text by testing string length:
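One way to write this, assuming an illustrative list called labels:

```r
library(purrr)

# Illustrative data: some entries are empty strings.
labels <- list("alpha", "", "beta", "", "gamma")

# nchar() is 0 for an empty string, so the predicate is TRUE there
# and discard() removes those entries.
non_empty <- discard(labels, \(s) nchar(s) == 0)
str(non_empty)
```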
The formula syntax does the same job with less typing. Inside a formula, .x is the element being tested:
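A sketch of the formula version on the same kind of illustrative data, here comparing each element to the empty string directly:

```r
library(purrr)

labels <- list("alpha", "", "beta", "", "gamma")

# Inside the formula, .x is the element currently being tested.
non_empty <- discard(labels, ~ .x == "")
str(non_empty)
```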
When you know the name of the element to remove, discard_at() skips the predicate entirely and targets names or positions directly:
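The sketch below assumes purrr 1.0.0 or later, where discard_at() is available. The config list and its field names are invented for the example.

```r
library(purrr)

# Illustrative named list.
config <- list(host = "localhost", port = 8080, password = "secret", debug = TRUE)

# Drop by name: no predicate is evaluated on the values.
no_secret <- discard_at(config, "password")
names(no_secret)

# Drop by position: removes the 2nd and 4th elements.
trimmed <- discard_at(config, c(2, 4))
names(trimmed)
```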
A data frame is a list of columns, so discard() works on it column by column. Here it drops every character column and leaves the numeric ones:
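A minimal sketch with a small made-up data frame (column names are illustrative):

```r
library(purrr)

# Illustrative data frame with one character column and two numeric columns.
df <- data.frame(
  id     = c("a", "b", "c"),
  height = c(1.6, 1.8, 1.7),
  weight = c(60, 82, 71),
  stringsAsFactors = FALSE
)

# is.character() is tested once per column; the character column is dropped.
numeric_only <- discard(df, is.character)
names(numeric_only)
```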
keep() vs discard(): two sides of the same filter
keep() and discard() are mirror images. Both apply a predicate to every element; they differ only in which result they retain. Pick the one whose rule is simpler to express.
| Function | Retains elements where the predicate is | Use when |
|---|---|---|
| keep() | TRUE | you can describe what to keep |
| discard() | FALSE | you can describe what to remove |
| compact() | length greater than 0 | you only want to drop NULL or empty elements |
| Filter() (base R) | TRUE | no purrr dependency is available |
compact() is a focused shortcut: it is equivalent to discard() with a length test, but reads better when removing empties is the only goal. Use discard() when the rejection rule is anything more specific than "empty".
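The relationships in the table can be checked with a short sketch. The data and variable names are illustrative, and the length test is one plausible way to spell out what compact() does, not its actual source.

```r
library(purrr)

# compact() versus discard() with a length test.
mixed <- list(1, NULL, character(0), "a", list())
via_compact <- compact(mixed)
via_discard <- discard(mixed, \(el) length(el) == 0)
identical(via_compact, via_discard)

# keep() versus base R Filter(): both retain elements where the predicate is TRUE.
nums <- list(3, -1, 4)
via_keep   <- keep(nums, \(v) v > 0)
via_filter <- Filter(\(v) v > 0, nums)
identical(via_keep, via_filter)
```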
In Python, the closest equivalent of discard() is a list comprehension that filters out matches, such as [v for v in x if not v < 0], or filterfalse() from the itertools module.
Common pitfalls
Most discard() errors trace back to the predicate. The function is strict: .p must return exactly one TRUE or one FALSE per element.
A predicate that returns a vector longer than one fails. Calling is.na on a multi-value element produces a logical vector, not a single flag:
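A sketch of the failure and one way around it. The samples list is made up, and the error is caught with tryCatch() so the snippet runs to the end; the exact error message depends on the purrr version.

```r
library(purrr)

# Illustrative data: two elements hold more than one value.
samples <- list(c(1, NA), c(2, 3), NA)

# is.na() returns one flag per value, so the two-value elements
# give the predicate a length-2 result and discard() stops with an error.
result <- tryCatch(
  discard(samples, is.na),
  error = function(e) "predicate error"
)
result

# anyNA() collapses each element to a single TRUE or FALSE.
complete <- discard(samples, anyNA)
str(complete)
```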
A predicate that returns NA fails the same way. Comparing an NA element with > yields NA, which is neither TRUE nor FALSE:
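The same pattern, sketched for the NA case with illustrative data. The guard with !is.na() is one possible fix; it treats a missing value as "not positive", which may or may not be the rule you want.

```r
library(purrr)

amounts <- list(5, NA, -2)

# NA > 0 evaluates to NA, which is not a valid predicate result.
result <- tryCatch(
  discard(amounts, \(v) v > 0),
  error = function(e) "predicate error"
)
result

# Guarding with !is.na() makes the predicate return FALSE for NA,
# so the missing value is kept and only the positive number is dropped.
guarded <- discard(amounts, \(v) !is.na(v) && v > 0)
str(guarded)
```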
When every element of a list matches the predicate, discard() gives back list(). Downstream code that assumes at least one value will break. Check with length() before using the result.
Try it yourself
Try it: From the list prices <- list(10, NA, 25, 0, NA, 8), use discard() to drop both the NA values and the zero. Save the result to ex_clean.
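One possible solution, reconstructed to match the explanation that follows (the original solution code is not shown here, so treat this as a sketch):

```r
library(purrr)

prices <- list(10, NA, 25, 0, NA, 8)

# is.na(x) is checked first; x == 0 only runs when x is not missing.
ex_clean <- discard(prices, \(x) is.na(x) || x == 0)
str(ex_clean)
```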
Explanation: The predicate returns TRUE for NA elements and for zero, so discard() drops both. The || operator short-circuits, so x == 0 is never tested on an NA value.
Related purrr functions
discard() sits in a small family of filtering verbs. Reach for a neighbour when the task shifts from rejection to selection or lookup.
- keep() keeps elements where the predicate is TRUE, the exact inverse of discard().
- compact() drops only NULL and zero-length elements.
- discard_at() and keep_at() filter by name or position instead of value.
- detect() returns the first element that matches a predicate.
- map() transforms elements once you have filtered the ones you want.
See the official purrr keep and discard reference for the full argument list.
FAQ
What is the difference between keep() and discard() in purrr?
Both functions test every element with a predicate. keep() retains the elements where the predicate returns TRUE, while discard() retains the elements where it returns FALSE. They are exact opposites, so discard(x, p) equals keep(x, negate(p)). Choose whichever lets you write the simpler rule: keep when describing what to retain is easier, discard when describing what to remove is easier.
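The equivalence can be checked with a short sketch; x and is_negative are illustrative names.

```r
library(purrr)

x <- list(3, -1, 0, 7, -5)
is_negative <- \(v) v < 0

# Dropping the negatives...
dropped <- discard(x, is_negative)

# ...is the same as keeping everything that is not negative.
kept <- keep(x, negate(is_negative))

identical(dropped, kept)
```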
How do I remove NA values from a list with discard()?
Pass is.na as the predicate: discard(my_list, is.na). The is.na function returns TRUE for each NA element, and discard() drops those, leaving only the non-missing values. This works when every element is a single value. If elements are multi-value vectors, use anyNA instead so the predicate still returns a single TRUE or FALSE.
Can discard() be used on a data frame?
Yes. A data frame is a list of columns, so discard() applies the predicate to each column. For example, discard(df, is.character) drops every character column and returns a data frame of the remaining columns. This is a quick way to keep only numeric columns before modelling or to strip identifier columns before a summary.
Why does discard() throw a "single TRUE or FALSE" error?
The predicate returned something other than one logical value. Common causes are a function like is.na applied to a multi-element vector, which returns a logical vector, or a comparison against an NA value, which returns NA. Fix it by collapsing the result with anyNA or any, or by guarding the comparison with !is.na(x) && so the predicate always returns a single TRUE or FALSE.