janitor tabyl() in R: Frequency and Cross-Tab Tables

The tabyl() function in janitor builds tidy frequency tables and cross-tabulations from a data frame in one call. It returns a real data frame (not a table object) for one or two variables and a named list of data frames for three, and it chains with adorn_* helpers to add totals, percentages, and formatted labels.

By Selva Prabhakaran · Published May 22, 2026 · Last updated May 22, 2026

⚡ Quick Answer

tabyl(mtcars, cyl)                                         # 1-way frequency
tabyl(mtcars, cyl, gear)                                   # 2-way cross-tab
mtcars |> tabyl(cyl)                                       # pipe-friendly
tabyl(mtcars, cyl) |> adorn_totals()                       # add row total
tabyl(mtcars, cyl, gear) |> adorn_percentages("row")       # row %
tabyl(mtcars$cyl, show_na = FALSE)                         # vector input
tabyl(mtcars, cyl, gear, am)                               # 3-way: list

Need explanation? Read on for examples and pitfalls.

What tabyl() does in one sentence

tabyl() counts how often each level (or combination of levels) appears in a column and returns the result as a data frame ready for printing or piping. It is janitor's drop-in replacement for base::table(), with two upgrades: the output is tidy enough to feed to ggplot2 or kable, and the adorn_* family adds totals, percentages, and decorations without leaving the pipe.

The function shines for exploratory data analysis where you need a quick count of categories, a row-percent cross-tab, or a publication-ready frequency table in a report.

Syntax

tabyl() takes one to three column names from a data frame, or a single vector. The first variable forms rows; the second forms columns; the third splits the result into a named list of 2-way tables.

R: Load janitor and inspect mtcars

library(janitor)
library(dplyr)

mtcars |> tabyl(cyl)
#>  cyl  n percent
#>    4 11 0.34375
#>    6  7 0.21875
#>    8 14 0.43750

The full signature:

tabyl(dat, var1, var2, var3, show_na = TRUE, show_missing_levels = TRUE, ...)

dat is always required and can be a data frame or a vector; a data frame input also needs at least var1. show_na controls whether NA appears as its own row; show_missing_levels keeps unused factor levels in the output even when their count is zero.
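As a hedged illustration of show_missing_levels, the sketch below tabulates a made-up factor (the toy data frame and its grp column are hypothetical, not part of mtcars) that declares a level "c" with no observations.

```r
library(janitor)

# Hypothetical toy data: level "c" is declared but never observed
toy <- data.frame(grp = factor(c("a", "b", "a"), levels = c("a", "b", "c")))

# Default: the unused level is kept with a count of zero
kept <- tabyl(toy, grp)

# show_missing_levels = FALSE drops levels that never occur
dropped <- tabyl(toy, grp, show_missing_levels = FALSE)

nrow(kept)     # 3 rows: a, b, c
nrow(dropped)  # 2 rows: a, b
```

Keeping the empty level is usually what you want in reports, where a missing row can be mistaken for a category that does not exist.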

Tip

Reach for tabyl() instead of table() whenever you need a downstream pipeline. table() returns a table object (an array with dimnames) that is awkward to reshape; tabyl() returns a data frame you can pass straight to ggplot(), kable(), or gt(). That single property removes a lot of as.data.frame.matrix() glue from EDA scripts.
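A small sketch of that difference, using only mtcars: the counts are the same, but the container types are not. The variable names freq and base_tab are illustrative.

```r
library(janitor)

freq <- tabyl(mtcars, cyl)       # janitor: data frame with cyl, n, percent
base_tab <- table(mtcars$cyl)    # base R: object of class "table"

is.data.frame(freq)              # TRUE
is.data.frame(base_tab)          # FALSE
class(base_tab)

# Same counts either way
all(freq$n == as.vector(base_tab))
```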

Six common patterns

1. One-way frequency on a single column

R: Count one categorical variable

mtcars |> tabyl(gear)
#>  gear  n percent
#>     3 15 0.46875
#>     4 12 0.37500
#>     5  5 0.15625

The output has three columns: the variable, the count n, and the proportion percent (note: it is a proportion in [0, 1], not a formatted "%" string). Sort with arrange(desc(n)) if you want the largest categories first.
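The sorting step might look like the sketch below. It uses cyl rather than gear, because the gear counts already happen to be in descending order; the name sorted is illustrative.

```r
library(janitor)
library(dplyr)

# Largest category first; cyl counts are 11, 7, 14 before sorting
sorted <- mtcars |>
  tabyl(cyl) |>
  arrange(desc(n))

sorted$cyl  # 8, 4, 6
sorted$n    # 14, 11, 7
```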

2. Two-way cross-tabulation

R: Cross-tab two variables

mtcars |> tabyl(cyl, gear)
#>  cyl  3 4 5
#>    4  1 8 2
#>    6  2 4 1
#>    8 12 0 2

With two variables, tabyl() returns a wide data frame: rows are levels of cyl, columns are levels of gear, and cells hold the joint count. The first column is the row label, not row names, so you can pipe the result without losing the grouping variable.
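To make the shape concrete, here is a brief sketch that inspects the column names of the cross-tab. Because the gear levels become column names, they are not syntactic R names and need [[ ]] or backticks; the name ct is illustrative.

```r
library(janitor)

ct <- tabyl(mtcars, cyl, gear)

names(ct)   # "cyl" "3" "4" "5"

# Columns named after gear levels need [[ ]] or backticks
ct[["4"]]   # counts of 4-gear cars per cyl level: 8, 4, 0
ct$`4`      # same column
```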

3. Three-way table as a named list

R: Three-way split returns a list

res <- mtcars |> tabyl(cyl, gear, am)
names(res)
#> [1] "0" "1"
res[["0"]]
#>  cyl  3 4 5
#>    4  1 2 0
#>    6  2 2 0
#>    8 12 0 0

A third variable produces one 2-way table per level of that variable, returned as a named list. Index by level (res[["1"]]) or apply a function across all slices with purrr::map() or lapply().
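Applying a function across the slices could look like this sketch. The helper names slice_sizes and with_totals are illustrative, and the example assumes nothing beyond mtcars.

```r
library(janitor)

res <- tabyl(mtcars, cyl, gear, am)

# Number of cars in each am slice: sum every count column (all but the first)
slice_sizes <- sapply(res, function(tab) sum(as.matrix(tab[, -1])))
slice_sizes   # named vector: "0" = 19, "1" = 13

# Add a totals row to every slice with the same helper used on 2-way tables
with_totals <- lapply(res, adorn_totals, where = "row")
with_totals[["1"]]
```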

4. Add totals with adorn_totals()

R: Append a row total, column total, or both

mtcars |>
  tabyl(cyl, gear) |>
  adorn_totals(c("row", "col"))
#>    cyl  3  4 5 Total
#>      4  1  8 2    11
#>      6  2  4 1     7
#>      8 12  0 2    14
#>  Total 15 12 5    32

adorn_totals() accepts "row", "col", or c("row", "col") for both. The label of the total row defaults to "Total"; pass name = "Sum" to change it. Add totals while the table still holds raw counts, then layer percentages on top.
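A minimal sketch of the name argument, assuming a janitor version that supports it; the variable summed is illustrative.

```r
library(janitor)

summed <- mtcars |>
  tabyl(cyl, gear) |>
  adorn_totals("row", name = "Sum")

summed[[1]]     # row labels: 4, 6, 8, then "Sum"
summed[["3"]]   # 1, 2, 12, then the column sum 15
```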

5. Convert counts to percentages

R: Row percentages, formatted

mtcars |>
  tabyl(cyl, gear) |>
  adorn_percentages("row") |>
  adorn_pct_formatting(digits = 1)
#>  cyl     3     4     5
#>    4  9.1% 72.7% 18.2%
#>    6 28.6% 57.1% 14.3%
#>    8 85.7%  0.0% 14.3%

The chain replaces every count with its proportion within the chosen denominator ("row", "col", or "all"), then adorn_pct_formatting() rounds and appends the percent sign. Stop after adorn_percentages() if you want numeric proportions for further math; continue to adorn_pct_formatting() for display.
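The sketch below contrasts the two stopping points: numeric proportions that still support arithmetic, and formatted strings meant only for display. The names props, shown, and row_sums are illustrative.

```r
library(janitor)

props <- mtcars |>
  tabyl(cyl, gear) |>
  adorn_percentages("row")

shown <- props |>
  adorn_pct_formatting(digits = 1)

# Numeric proportions support arithmetic: each row sums to 1
row_sums <- rowSums(as.matrix(props[, -1]))

is.numeric(props[["4"]])     # TRUE: still usable in calculations
is.character(shown[["4"]])   # TRUE: strings such as "72.7%"
```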

6. Pair counts with percentages using adorn_ns()

R: Show counts alongside percentages

mtcars |>
  tabyl(cyl, gear) |>
  adorn_percentages("row") |>
  adorn_pct_formatting(digits = 1) |>
  adorn_ns()
#>  cyl          3         4         5
#>    4   9.1% (1) 72.7% (8) 18.2% (2)
#>    6  28.6% (2) 57.1% (4) 14.3% (1)
#>    8 85.7% (12)  0.0% (0) 14.3% (2)

adorn_ns() glues the raw count back into each cell in parentheses. This is a common format for academic tables and clinical reports, where reviewers want to see both the rate and the underlying sample size in a single cell.
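If a style guide wants the count first and the percentage in parentheses, adorn_ns() also takes a position argument. The sketch below assumes position = "front" behaves as documented in janitor; exact cell padding may vary, so no full printout is shown.

```r
library(janitor)

front <- mtcars |>
  tabyl(cyl, gear) |>
  adorn_percentages("row") |>
  adorn_pct_formatting(digits = 1) |>
  adorn_ns(position = "front")

# First cell of the 3-gear column: the count, then the row percentage in parentheses
front[["3"]][1]
```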

Key Insight

The adorn chain is order-sensitive. Run adorn_totals() BEFORE adorn_percentages() so the totals are sums of raw counts, not sums of proportions. Run adorn_pct_formatting() AFTER adorn_percentages() so the formatter sees decimals, not characters. Getting the order wrong can produce misleading numbers without any warning.
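A hedged sketch of what the wrong order does to a Total row. The names right and wrong are illustrative; both chains run without complaint, but only the first Total row is meaningful.

```r
library(janitor)

counts <- tabyl(mtcars, cyl, gear)

# Totals first: the Total row is built from counts, then converted to row shares
right <- counts |>
  adorn_totals("row") |>
  adorn_percentages("row")

# Percentages first: the Total row sums proportions from different rows
wrong <- counts |>
  adorn_percentages("row") |>
  adorn_totals("row")

right[["3"]][4]   # 15 / 32 = 0.46875, the share of 3-gear cars overall
wrong[["3"]][4]   # 1/11 + 2/7 + 12/14, a value above 1 with no useful meaning
```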

tabyl() vs table() vs dplyr::count()

Three idioms produce the same counts; the differences are output shape and chain ergonomics.

Task	`tabyl()`	`table()`	`dplyr::count()`
Output type	data frame	`table` object (array)	data frame or tibble (matches input)
1-way table	`tabyl(df, x)`	`table(df$x)`	`count(df, x)`
2-way cross-tab	`tabyl(df, x, y)`	`table(df$x, df$y)`	`count(df, x, y)` (long)
Add row totals	`adorn_totals("row")`	`addmargins(t, 1)`	manual `summarise`
Row percentages	`adorn_percentages("row")`	`prop.table(t, 1)`	manual `mutate(p = n/sum(n))`
Pipe-friendly	yes	partial	yes

When to use which:

Use tabyl() when the goal is a printed or formatted table (report, knitr, dashboard).
Use dplyr::count() when the result feeds another transformation (group_by, join, ggplot).
Use table() only when interoperating with base R functions like chisq.test() or prop.table() that expect a matrix.
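The three idioms can be compared side by side on mtcars. In this sketch the names wide, long, and base_tab are illustrative; note that count() omits the one empty combination (8 cylinders with 4 gears), while the other two show it as zero.

```r
library(janitor)
library(dplyr)

wide <- tabyl(mtcars, cyl, gear)              # wide data frame, zeros included
long <- count(mtcars, cyl, gear)              # long format, empty combinations omitted
base_tab <- table(mtcars$cyl, mtcars$gear)    # base R table object

nrow(long)      # 8 rows: 9 possible combinations minus one empty cell
sum(long$n)     # 32
sum(base_tab)   # 32

# The wide tabyl and the base table hold the same 3 x 3 grid of counts
all(as.vector(as.matrix(wide[, -1])) == as.vector(base_tab))
```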

Note

Coming from Python pandas? The equivalent of tabyl(df, x, y) is pd.crosstab(df.x, df.y). The decoration helpers map roughly to pd.crosstab(..., margins=True, normalize="index").round(3).mul(100).astype(str).add("%"), which is exactly the kind of one-liner janitor's adorn_* chain replaces with named verbs.

Common pitfalls

Pitfall 1: forgetting that percent is a proportion. The default percent column holds values in [0, 1] (for example 0.34375), not "34.4%". If a downstream tool needs strings with a percent sign, pipe the 1-way result straight into adorn_pct_formatting(); for a 2-way table, call adorn_percentages() first. Misreading the column as already formatted shows up as chart labels like 0.34 where 34% was expected.
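For a 1-way tabyl, the existing percent column can be formatted directly. A minimal sketch, with raw and shown as illustrative names:

```r
library(janitor)

raw <- tabyl(mtcars, cyl)
shown <- adorn_pct_formatting(raw, digits = 1)

raw$percent[1]     # 0.34375, a numeric proportion
shown$percent[1]   # "34.4%", a character string for display
```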

Pitfall 2: applying adorn_percentages without first deciding on a denominator. adorn_percentages() defaults to "row". If you wanted column percentages, you must pass "col"; for the grand-total denominator, pass "all". The wrong choice silently produces a plausible-looking table that answers a different question.
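To see how much the denominator matters, the sketch below follows one cell (4 cylinders, 4 gears, a count of 8) through all three options. The variable names are illustrative.

```r
library(janitor)

counts <- tabyl(mtcars, cyl, gear)

by_row <- adorn_percentages(counts, "row")   # share within each cyl level
by_col <- adorn_percentages(counts, "col")   # share within each gear level
by_all <- adorn_percentages(counts, "all")   # share of all 32 cars

# Same cell, three different answers
cell <- c(
  row = by_row[["4"]][1],   # 8 / 11
  col = by_col[["4"]][1],   # 8 / 12
  all = by_all[["4"]][1]    # 8 / 32
)
cell
```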

Warning

show_na = TRUE is the default and changes your totals. If cyl has 30 non-NA values and 2 NAs, the percent column on tabyl(df, cyl) divides by 32, not 30. When NAs are present, tabyl() also adds a valid_percent column computed over the non-missing values only. Pass show_na = FALSE when you want percentages out of the non-missing population, or filter out NAs before tabulating. Forgetting this is a frequent source of "my percents don't sum to what I expected" bug reports.
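The denominator shift is easy to see on a tiny vector. In this sketch, x is hypothetical data with one missing value out of four.

```r
library(janitor)

# Hypothetical vector: two 4s, one 6, one NA
x <- c(4, 4, 6, NA)

with_na <- tabyl(x)                      # NA gets its own row
without_na <- tabyl(x, show_na = FALSE)  # NA dropped before counting

with_na$percent      # 0.50 0.25 0.25  (denominator 4)
without_na$percent   # 2/3 and 1/3     (denominator 3)
names(with_na)       # includes "valid_percent" because NAs are present
```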

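Pitfall 3: assuming numeric levels stay numeric. In a 2-way tabyl, the levels of the second variable become column names, which are always character strings (for example "3", "4", "5"), and adorn_totals("row") turns a numeric first column into character so it can hold the "Total" label. If you arrange or filter the result expecting numeric ordering, check the column types with str() first, then sort levels explicitly with factor(cyl, levels = c("4","6","8")) before tabulating, or coerce after with as.numeric().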

Try it yourself

Try it: Take mtcars, build a 2-way tabyl() of gear by am, then add a Total column AND row percentages formatted to one decimal. Save the result to ex_tab.

R: Your turn: gear by am, with totals and row %

# Try it: build a tabyl with totals and row percentages
ex_tab <- mtcars |>
  tabyl(gear, am)  # your code here: pipe into the adorn_* steps
ex_tab
#> Expected: row percents per gear level + Total column

R: Solution

ex_tab <- mtcars |>
  tabyl(gear, am) |>
  adorn_totals("col") |>
  adorn_percentages("row") |>
  adorn_pct_formatting(digits = 1)
ex_tab
#>  gear      0      1  Total
#>     3 100.0%   0.0% 100.0%
#>     4  33.3%  66.7% 100.0%
#>     5   0.0% 100.0% 100.0%

Explanation: adorn_totals("col") runs on the raw counts, so the Total column holds each gear level's count and becomes 100.0% once row percentages are applied. Order matters most for a Total row: added after adorn_percentages("row"), it would sum the row proportions instead of the counts, producing nonsense.

Related janitor functions

After mastering tabyl(), look at:

adorn_totals(): append row, column, or both totals to a tabyl or any data frame
adorn_percentages(): convert counts to proportions using row, column, or grand-total denominators
adorn_pct_formatting(): round proportions and append the percent sign for display
adorn_ns(): glue raw counts in parentheses next to formatted percentages
adorn_title(): add a pretty top-row title to a 2-way tabyl for printed reports

For a fuller tour of the janitor package, see the janitor package guide. The package's official reference site is sfirke.github.io/janitor.

FAQ

What does janitor tabyl() do?

tabyl() produces a tidy frequency table from one or more columns of a data frame. With a single variable it returns counts and proportions; with two variables it returns a wide cross-tab; with three it returns a named list of 2-way tables, one per level of the third variable. Each table is a regular data frame, so it pipes cleanly into ggplot, kable, or further dplyr verbs.

How is tabyl() different from base R table()?

table() returns a table object (an array with dimnames), which is awkward to print, awkward to subset, and clumsy in a pipe. tabyl() returns a data frame with the grouping variable as a real column, ready for the rest of a tidyverse workflow. The numbers are identical; the shape and ergonomics differ.

How do I get percentages from tabyl()?

Pipe the result into adorn_percentages(), then optionally adorn_pct_formatting() for the percent sign. Use "row", "col", or "all" as the denominator argument. To show both percent and count, chain adorn_ns() after adorn_pct_formatting().

Can tabyl() handle missing values?

Yes. By default show_na = TRUE keeps NA as its own row in 1-way tables and as its own row/column in 2-way tables. The percent denominator includes the NA count, so set show_na = FALSE if you want proportions out of the non-missing population only.

Does tabyl() work on a vector?

Yes. tabyl(mtcars$cyl) produces the same 1-way table as tabyl(mtcars, cyl). The vector form is handy inside ad-hoc EDA where you have not yet assigned the data to a frame, or when working with a single column extracted from a list.
