dplyr count() in R: Count Rows and Frequencies
The count() function in dplyr counts the rows in a data frame, optionally grouped by one or more columns. It is shorthand for group_by() + summarise(n = n()) and the most common way to build frequency tables.
    count(df)                        # total rows: returns n
    count(df, cyl)                   # rows per cyl
    count(df, cyl, gear)             # rows per (cyl, gear) combination
    count(df, cyl, sort = TRUE)      # sorted desc by count
    count(df, cyl, name = "n_cars")  # custom column name
    count(df, cyl, wt = hp)          # weighted (sum of hp per cyl)
    add_count(df, cyl)               # add count col without collapsing
Need explanation? Read on for examples and pitfalls.
What count() does in one sentence
count() returns the number of rows per unique combination of grouping columns. Without grouping columns, it returns the total row count as a single row. With column names, it returns one row per unique combination, with a column n holding the count.
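count(df, x) is sugar for summarise(df, n = n(), .by = x), with one difference in row order: count() sorts the result by the grouping values, while .by keeps groups in order of first appearance. The shortcut form is more readable for the common "frequency table" use case.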
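A minimal sketch of that equivalence, using the built-in mtcars data set as a stand-in for df (an assumption for illustration). The .by argument needs dplyr 1.1.0 or later, and its result is sorted here before comparing so the rows line up.

```r
library(dplyr)

# Shortcut form: one row per cyl value, counts in column n
by_count <- count(mtcars, cyl)

# Long form with group_by() + summarise()
by_group <- mtcars |>
  group_by(cyl) |>
  summarise(n = n())

# Long form with .by (dplyr >= 1.1.0); arrange() aligns the row order
by_dot_by <- mtcars |>
  summarise(n = n(), .by = cyl) |>
  arrange(cyl)

# All three hold the same counts
identical(by_count$n, by_group$n)
identical(by_count$n, by_dot_by$n)
```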
Syntax
count() takes a data frame plus optional grouping columns plus options. Add sort = TRUE to sort by count descending. Add wt = column for weighted counts.
The full signature:
count(x, ..., wt = NULL, sort = FALSE, name = NULL)
x is the data frame. ... are grouping columns. wt is an optional weighting column (sums values instead of counting rows). sort = TRUE orders the result by n descending. name overrides the default count column name n.
count() is the shortcut for the most common summarise pattern. count(df, x) produces the same counts as summarise(df, n = n(), .by = x). Use count() when the only summary you want is the row count; use summarise() when you want multiple statistics or non-count aggregations.
Six common patterns
1. Total row count
No grouping argument means the total count of rows in the data frame.
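As a small illustration, assuming the built-in mtcars data set as the example data:

```r
library(dplyr)

# No grouping columns: a one-row result holding the total row count
total <- count(mtcars)

total$n == nrow(mtcars)  # the count matches nrow()
```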
2. Count by one group
The result has one row per unique cyl value, with column n showing the count.
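A short sketch of this pattern, again assuming mtcars as the example data:

```r
library(dplyr)

# One row per distinct cyl value, with the row count in column n
cyl_counts <- count(mtcars, cyl)
cyl_counts
#>   cyl  n
#> 1   4 11
#> 2   6  7
#> 3   8 14
```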
3. Multi-column counts
One row per UNIQUE COMBINATION of cyl and gear.
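For example, assuming mtcars as the data, a two-column count might look like this. Only combinations that actually occur in the data get a row.

```r
library(dplyr)

# One row per (cyl, gear) pair present in the data
combo_counts <- count(mtcars, cyl, gear)

nrow(combo_counts)    # number of observed combinations
sum(combo_counts$n)   # adds back up to nrow(mtcars)
```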
4. Sort by count
sort = TRUE is shorthand for chaining arrange(desc(n)) after count.
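The two spellings can be compared side by side in a quick sketch (mtcars is assumed as the example data):

```r
library(dplyr)

# Flag form: largest group first
sorted_flag <- count(mtcars, cyl, sort = TRUE)

# Explicit form: count, then sort by n descending
sorted_chain <- count(mtcars, cyl) |>
  arrange(desc(n))

# Same groups in the same order
identical(sorted_flag$cyl, sorted_chain$cyl)
identical(sorted_flag$n, sorted_chain$n)
```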
5. Weighted count
wt = hp sums the hp column for each cyl group instead of counting rows. The column is still named n by default; use name = "total_hp" to rename.
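A sketch of the weighted form next to its long-hand equivalent, assuming mtcars as the data; the column name total_hp is just an illustrative choice:

```r
library(dplyr)

# Weighted count: sum of hp within each cyl group, renamed for clarity
hp_by_cyl <- count(mtcars, cyl, wt = hp, name = "total_hp")

# Long-hand equivalent with group_by() + summarise()
hp_long <- mtcars |>
  group_by(cyl) |>
  summarise(total_hp = sum(hp))

all.equal(hp_by_cyl$total_hp, hp_long$total_hp)
```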
6. add_count: add count column without collapsing
add_count() keeps every original row and adds a column n showing the size of each row's group. Useful for filtering ("keep only rows where group has at least 5 members").
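The filtering use case could be sketched as follows, assuming mtcars as the data and carb as the grouping column (both illustrative choices):

```r
library(dplyr)

# Every original row is kept; n holds the size of each row's carb group
annotated <- add_count(mtcars, carb)

# Keep only rows whose group has at least 5 members
big_groups <- annotated |>
  filter(n >= 5)

nrow(annotated)    # unchanged from nrow(mtcars)
nrow(big_groups)   # fewer rows: small carb groups are dropped
```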
count() collapses; add_count() preserves. count(df, x) reduces a 100-row data frame to maybe 10 rows (one per unique x). add_count(df, x) keeps all 100 rows but adds a column n showing each row's group size. Pick based on whether you want a frequency table or an annotated row-level data frame.
count() vs base R alternatives
Base R uses table() for frequency counts; dplyr uses count(). The major difference is output format: table() returns a named array; count() returns a data frame of the same type as its input (a tibble when the input is a tibble).
| Task | dplyr | Base R |
|---|---|---|
| Frequency by one col | count(df, x) | table(df$x) |
| Cross-tabulation | count(df, x, y) | table(df$x, df$y) |
| Sorted descending | count(df, x, sort=TRUE) | sort(table(df$x), decreasing=TRUE) |
| Weighted | count(df, x, wt=w) | xtabs(w ~ x, data=df) |
| Add count to rows | add_count(df, x) | (multi-step: ave + assign) |
| Output type | data frame (tibble if the input is one) | array (table object) |
When to use which:
- Use count() for any analysis that continues in dplyr (output is a regular data frame).
- Use table() for quick interactive exploration; the named-array format is compact in print.
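To make the difference in output type concrete, here is a small comparison, assuming mtcars as the example data:

```r
library(dplyr)

tab <- table(mtcars$cyl)    # named integer array of class "table"
cnt <- count(mtcars, cyl)   # data frame with columns cyl and n

class(tab)
class(cnt)

# The counts themselves agree
identical(as.vector(tab), cnt$n)

# A table can be converted, but the values come back as a factor column (Var1)
tab_df <- as.data.frame(tab)
names(tab_df)
```

The count() result can be piped straight into filter(), mutate(), or a join, whereas the table needs a conversion step first.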
Common pitfalls
Pitfall 1: confusing count() and n(). count(df, x) is a verb that returns a new data frame. n() is a context-only function that returns the size of the current group inside summarise() or mutate(). They are related but not interchangeable.
Pitfall 2: forgetting that count() collapses rows. After count(df, x), the original row-level data is gone; you have a frequency table. To preserve every row AND add count info, use add_count(df, x) instead.
The wt argument is for SUMMING, not for filtering. count(df, x, wt = w) returns the sum of w per group, not a count of rows where w > 0. To filter then count, use filter(df, w > 0) |> count(x).
Pitfall 3: NA in grouping column creates an NA row. count(df, x) with NAs in x returns one row with x = NA and the count of those NA rows. To exclude: filter(df, !is.na(x)) |> count(x).
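The NA behaviour is easy to see on a tiny data frame. The toy data below is hypothetical and exists only for this illustration:

```r
library(dplyr)

# Hypothetical toy data with one missing group label
df <- data.frame(x = c("a", "b", NA, "a"))

# NA is treated as its own group: rows for "a", "b", and NA
with_na <- count(df, x)

# Dropping NAs first leaves only the "a" and "b" rows
without_na <- df |>
  filter(!is.na(x)) |>
  count(x)

nrow(with_na)
nrow(without_na)
```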
Try it yourself
Try it: Count cars per gear value in mtcars, sorted descending by count. Save to ex_gears.
Explanation: count(gear, sort = TRUE) groups by gear, counts rows in each group, then sorts the result by n descending. The sort = TRUE flag is shorthand for chaining arrange(desc(n)) after a regular count.
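One possible solution, written as a sketch that matches the explanation above:

```r
library(dplyr)

# Count cars per gear value, largest group first
ex_gears <- mtcars |>
  count(gear, sort = TRUE)

ex_gears
```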
Related dplyr functions
After mastering count(), look at:
- n(): row count helper used inside summarise() and mutate()
- n_distinct(x): count unique values of x (not rows)
- tally(): count without grouping; nearly synonymous with count() called with no grouping columns
- add_count(): add count column to row-level data without collapsing
- add_tally(): add the tally column without collapsing
- summarise(.by = g, n = n(), other_stat = ...): when you need count plus other stats
For percentages and proportions, chain mutate(pct = n / sum(n)) after count.
FAQ
What is the difference between count and n() in dplyr?
count() is a verb returning a frequency table: count(df, x) returns one row per unique x value with column n for the count. n() is a HELPER used inside summarise() or mutate() that returns the size of the current group: summarise(df, num = n(), .by = x).
How do I count rows per group in dplyr?
count(df, group_col) returns one row per unique group with column n showing the count. For multiple group columns: count(df, col1, col2). To sort by count descending: add sort = TRUE.
What is the difference between count and tally in dplyr?
They are nearly identical. tally() does not group on its own; it counts whatever is currently grouped. count() is shorthand for group_by() |> tally() |> ungroup(). For most uses, just use count().
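A brief sketch of the relationship, assuming mtcars as the example data:

```r
library(dplyr)

# count() does the grouping itself
via_count <- count(mtcars, cyl)

# tally() relies on grouping that already exists
via_tally <- mtcars |>
  group_by(cyl) |>
  tally()

identical(via_count$n, via_tally$n)

# On ungrouped data, tally() counts every row
all_rows <- tally(mtcars)
all_rows$n
```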
How do I count distinct values in a column with dplyr?
Use n_distinct(col) inside summarise(): summarise(df, n_unique = n_distinct(col)). This is different from count(), which counts ROWS per group, not unique values.
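To illustrate the distinction, here is a sketch that assumes mtcars as the data and cyl as the column of interest:

```r
library(dplyr)

# Number of distinct cyl values in the whole data frame
unique_cyl <- summarise(mtcars, n_unique = n_distinct(cyl))

# count() answers a different question: how many rows each value has
rows_per_cyl <- count(mtcars, cyl)

unique_cyl$n_unique    # how many different values
nrow(rows_per_cyl)     # the same number: one row per distinct value
rows_per_cyl$n         # but the n column holds row counts
```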
How do I get a frequency table with percentages?
Chain mutate() after count(): count(df, x) |> mutate(pct = n / sum(n) * 100). The mutate() call adds a pct column showing each group's percentage of the total. Chain arrange(desc(pct)) for sorted output.
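Put together, the pattern might look like this (mtcars is assumed as the example data):

```r
library(dplyr)

# Frequency table with a percentage column, largest share first
freq <- count(mtcars, cyl) |>
  mutate(pct = n / sum(n) * 100) |>
  arrange(desc(pct))

freq
sum(freq$pct)   # percentages add up to 100
```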