dplyr cumany() in R: Cumulative Any-True Across a Vector
The cumany() function in dplyr returns FALSE until the first TRUE in a logical vector, then TRUE for every position thereafter. It is the cumulative version of any() and the mirror of cumall().
```r
cumany(c(FALSE, FALSE, TRUE, FALSE))                  # F F T T
cumany(x > 100)                                       # mark from first big value onward
df |> filter(cumany(triggered))                       # rows from first trigger
df |> arrange(time) |> filter(cumany(start_event))
cumall(c(TRUE, FALSE))                                # mirror: T F (until first FALSE)
which(cumany(x))[1]                                   # find first TRUE position
```
Need explanation? Read on for examples and pitfalls.
What cumany() does in one sentence
cumany(x) returns FALSE for each position UNTIL the first TRUE is seen; from there onward, every position is TRUE. It is the cumulative version of any().
It is the mirror of cumall(), which stays TRUE until the first FALSE. Together they cover the "from the first match onward" and "until the first failure" idioms.
Syntax
cumany(x), where x is a logical vector. It returns a logical vector of the same length.
filter(cumany(condition)) keeps rows from the FIRST row where condition is TRUE through the END. It is "drop everything before the first match" rather than "keep everything where match is TRUE".
Five common patterns
1. From first big value onward
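A minimal sketch of this pattern, assuming a hypothetical numeric vector x and an arbitrary threshold of 100:

```r
library(dplyr)

# Hypothetical readings; 100 is an arbitrary threshold chosen for illustration
x <- c(12, 85, 140, 60, 210, 30)

# FALSE until the first value above 100, TRUE from that position on
from_big <- cumany(x > 100)
from_big
#> [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE

# Keep the first big value and everything after it
x[from_big]
#> [1] 140  60 210  30
```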
2. Filter from first triggered event
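A small sketch with a hypothetical event log (the column names step and triggered are made up for illustration) in which the trigger first fires on row 3. It also shows how the result differs from a plain filter(triggered):

```r
library(dplyr)

# Hypothetical event log; the trigger first fires on row 3
df <- tibble(
  step = 1:6,
  triggered = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Rows from the first trigger onward, including later FALSE rows
kept <- df |> filter(cumany(triggered))
kept$step
#> [1] 3 4 5 6

# For contrast: only the rows where the trigger is TRUE
only_true <- df |> filter(triggered)
only_true$step
#> [1] 3 5
```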
Once the trigger fires (row 3), every later row is kept regardless of subsequent trigger state.
3. Per-group cumany
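One way this could look, assuming a hypothetical per-user table with user, day, and purchased columns. Grouping first makes cumany() restart for each user:

```r
library(dplyr)

# Hypothetical per-user event log
events <- tibble(
  user = c("a", "a", "a", "b", "b", "b"),
  day = c(1, 2, 3, 1, 2, 3),
  purchased = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# Within each user, keep rows from that user's first purchase onward
from_first_purchase <- events |>
  arrange(user, day) |>
  group_by(user) |>
  filter(cumany(purchased)) |>
  ungroup()

from_first_purchase
# user "a" keeps days 2 and 3; user "b" keeps day 3 only
```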
4. Time-series "after first error" filter
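A sketch under the assumption of a hypothetical log table with a numeric time column (minutes) and a status column. The rows are deliberately out of order to show why arrange() comes first:

```r
library(dplyr)

# Hypothetical log, stored out of chronological order
logs <- tibble(
  time = c(3, 0, 2, 1, 4),
  status = c("ok", "ok", "error", "ok", "ok")
)

# Sort by time, then keep the first error and everything after it
post_incident <- logs |>
  arrange(time) |>
  filter(cumany(status == "error"))

post_incident$time
#> [1] 2 3 4
```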
Useful for post-incident analysis: ignore everything before the first failure.
5. Find position of first TRUE
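A short comparison on hypothetical vectors. The all-FALSE case is included because the approaches disagree there:

```r
library(dplyr)

x <- c(FALSE, FALSE, TRUE, FALSE, TRUE)

first_by_cumany <- which(cumany(x))[1]  # index of the first TRUE in the running result
first_by_which_max <- which.max(x)      # index of the first maximum (TRUE counts as 1)
first_by_cumany
#> [1] 3
first_by_which_max
#> [1] 3

# With no TRUE at all, the two approaches differ
none <- c(FALSE, FALSE)
which(cumany(none))[1]  # NA: there is no TRUE position
which.max(none)         # 1: the first maximum is the first FALSE
```

Checking any(x) first avoids misreading the which.max() result when no TRUE exists.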
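which.max on a logical vector returns the position of the first TRUE, provided at least one TRUE exists; on an all-FALSE vector it returns 1, so check any(x) first. It returns a single position rather than a vector like cumany, but the two are closely related.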
cumany() is "have we seen at least one TRUE so far?". Once the answer becomes yes, it stays yes for every subsequent position. This makes it perfect for "mark everything from event X onward" filters.
cumany() vs cumall() vs any() vs which.max()
Five ways to handle "first TRUE" semantics in R.
| Function | Returns | Best for |
|---|---|---|
| cumany(x) | Vector: FALSE -> TRUE at first TRUE | "From first match onward" |
| cumall(x) | Vector: TRUE -> FALSE at first FALSE | "Until first failure" |
| any(x) | Single boolean | "Is there at least one TRUE?" |
| which.max(x) | Single integer | Position of first TRUE |
| min(which(x)) | Single integer | Same; explicit |
When to use which:
- cumany to keep rows from a marker onward.
- cumall to keep rows up to a marker and drop everything from the first failure onward.
- any for a one-shot test.
- which.max to find the position only.
A practical workflow
The "post-event window" pattern is the cumany sweet spot.
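A minimal sketch of the pattern, assuming a hypothetical table with time and event columns and treating "purchase" as the event of interest. The steps_since_event column is an illustrative extra, not part of the pattern itself:

```r
library(dplyr)

# Hypothetical activity log, stored out of order
orders <- tibble(
  time = c(5, 1, 3, 2, 4),
  event = c("view", "view", "purchase", "view", "view")
)

post_event <- orders |>
  arrange(time) |>                                # order must be meaningful first
  filter(cumany(event == "purchase")) |>          # keep the event row and all later rows
  mutate(steps_since_event = row_number() - 1L)   # 0 at the event row

post_event$time
#> [1] 3 4 5
```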
This returns every row from the first event onward, in chronological order. Common in:
- Customer journey analysis: from first purchase onward
- Anomaly detection: from first outlier onward
- Cohort analysis: from first signup onward
The opposite-direction filter uses cumall:
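For instance, with a hypothetical table of checks (the column names time and passed are assumptions for illustration):

```r
library(dplyr)

# Hypothetical sequence of checks; the first failure is at time 3
checks <- tibble(
  time = 1:5,
  passed = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

# cumall() stays TRUE only while every value so far is TRUE
before_failure <- checks |>
  arrange(time) |>
  filter(cumall(passed))

before_failure$time
#> [1] 1 2
```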
This returns the rows before the first failure; the failing row itself is dropped.
Common pitfalls
Pitfall 1: order matters. cumany reads left-to-right. Always arrange() first if order is meaningful.
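Pitfall 2: NA propagation. cumany(c(FALSE, NA, TRUE)) returns c(FALSE, NA, TRUE). The result is NA at the NA position, and stays NA for any FALSE that follows it, until a TRUE is seen. Inside filter(), rows where the result is NA are dropped. Replace or pre-filter NAs if you want strict TRUE/FALSE results.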
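A quick check of the NA behaviour, with one possible workaround. Treating NA as FALSE via coalesce() is an assumption about what the data means, not a general rule:

```r
library(dplyr)

x <- c(FALSE, NA, TRUE)

# The NA position stays NA in the running result
with_na <- cumany(x)
with_na
#> [1] FALSE    NA  TRUE

# One option: treat missing values as FALSE before accumulating
no_na <- cumany(coalesce(x, FALSE))
no_na
#> [1] FALSE FALSE  TRUE

# filter() drops rows where the condition is NA, so only row 3 survives here
kept_rows <- tibble(i = 1:3, x = x) |> filter(cumany(x))
kept_rows$i
#> [1] 3
```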
filter(cumany(cond)) is NOT the same as filter(cond). filter(cond) keeps EVERY row where cond is TRUE; filter(cumany(cond)) keeps every row from the first TRUE onward (including subsequent FALSE rows).
Try it yourself
Try it: From a sequence of daily login records, keep only rows from the first day the user logged in onward (drop pre-signup zeros). Save to ex_active.
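One possible solution, using made-up login data (six days, first login on day 3) since the exercise data is not shown here. The names logins, day, and logged_in are assumptions:

```r
library(dplyr)

# Hypothetical daily login records: 1 = logged in, 0 = did not
logins <- tibble(
  day = 1:6,
  logged_in = c(0, 0, 1, 0, 1, 0)
)

# Keep rows from the first login day onward
ex_active <- logins |>
  arrange(day) |>
  filter(cumany(logged_in == 1))

ex_active$day
#> [1] 3 4 5 6
```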
Explanation: cumany(logged_in == 1) becomes TRUE at day 3 (first login) and stays TRUE for days 4-6 even though the user didn't log in every day.
Related dplyr functions
After mastering cumany, look at:
- cumall(): mirror; "until first FALSE"
- cummean(): running mean
- cumsum(), cumprod(), cummin(), cummax(): base R cumulatives
- lead() / lag(): shift values for transition detection
- rle(): run-length encoding
- slider package: rolling-window operations
For window-based cumulative computations (e.g., last N days), the slider package is more flexible than the cumulative family.
FAQ
What does cumany do in dplyr?
cumany(x) returns FALSE until the first TRUE in x, then TRUE for every later position. It is the running version of any().
What is the difference between cumany and cumall?
cumany is "any TRUE so far": once any TRUE appears, the running result is TRUE. cumall is "all TRUE so far": once any FALSE appears, the running result is FALSE. Mirrors of each other.
How do I keep rows from the first match onward?
df |> filter(cumany(condition)). Sort first if order matters. The filter keeps rows from the first TRUE position onward.
Does cumany handle NA?
NAs propagate: cumany(c(FALSE, NA, TRUE)) returns c(FALSE, NA, TRUE). The result is NA from the NA position until the first TRUE is seen. Replace or pre-filter NAs for strict TRUE/FALSE behaviour.
What is the difference between cumany and any?
any(x) returns ONE boolean: TRUE if any element of x is TRUE. cumany(x) returns a vector of the same length: FALSE until the first TRUE, then TRUE thereafter.