Chi-Square Test Calculator
A chi-square test asks whether two categorical variables are related (like sex and smoking status), or whether observed counts match what you would expect by chance. Paste a 2D table or a row of observed counts to get chi-square, p-value, Cramer's V, standardized residuals, and a mosaic plot.
New to chi-square? A 4-minute primer
What it is. The chi-square test asks one question of a table of counts: are these counts farther from what we’d expect under independence (or under a hypothesised distribution) than chance can comfortably explain? If yes, you have evidence that the row variable and the column variable are not independent (or, for a single row of counts, that the hypothesised distribution does not fit); the residuals then show which cells carry the signal.
How to read it. Three flavours, one statistic. Independence takes a two-way table (e.g. treatment × outcome) and tests whether the two classifications are linked. Goodness-of-fit takes one row of counts and tests whether they match a hypothesised distribution (uniform dice, expected proportions). Homogeneity tests whether several samples share the same distribution; the math is identical to independence.
The recipe. For each cell compute the expected count under the null: E = row_total × col_total / N for independence, or E = N × p_hyp for goodness-of-fit. Sum the squared, scaled deviations: χ² = Σ (O − E)² / E. Compare against a chi-square distribution with df = (rows−1)(cols−1) for independence or df = k−1 for goodness-of-fit.
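The recipe above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the calculator's own implementation: the function names (`chi2_sf`, `chi2_independence`, `chi2_goodness_of_fit`) and the example counts are invented for the sketch, and the p-value uses a closed-form recurrence for the chi-square upper tail that only holds for integer df ≥ 1.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: Q(1, y) = exp(-y) for even df,
    # Q(1/2, y) = erfc(sqrt(y)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Step up with Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1).
    while a < df / 2:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1))
        a += 1.0
    return q


def chi2_independence(table):
    """Independence test: E = row_total * col_total / N, df = (rows-1)(cols-1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, chi2_sf(chi2, df), expected


def chi2_goodness_of_fit(observed, probs):
    """Goodness-of-fit test: E = N * p_hyp, df = k - 1."""
    n = sum(observed)
    expected = [n * p for p in probs]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df, chi2_sf(chi2, df), expected


# Hypothetical 2x2 table of counts (illustrative numbers only).
chi2, df, p, expected = chi2_independence([[10, 20], [30, 40]])
print(f"independence: chi2={chi2:.4f}, df={df}, p={p:.4f}")

# Hypothetical die rolls tested against a uniform distribution.
chi2_g, df_g, p_g, _ = chi2_goodness_of_fit([8, 12, 9, 11, 10, 10], [1 / 6] * 6)
print(f"goodness-of-fit: chi2={chi2_g:.4f}, df={df_g}, p={p_g:.4f}")
```

No continuity correction is applied here, so the independence result corresponds to R's `chisq.test(x, correct = FALSE)` rather than to R's default for 2×2 tables.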
Beyond the p-value. A small p tells you something is off; standardised residuals tell you where. Cells with |r| > 1.96 drive the result. Cramer’s V (independence) and Cohen’s w (goodness-of-fit) put a magnitude on the association so you can say not just “significant” but “weak / moderate / strong”.
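To make the two follow-up quantities concrete, here is a small Python sketch of standardised residuals and Cramer's V for a two-way table. It assumes the usual adjusted-residual formula, (O − E) / √(E · (1 − row_total/N) · (1 − col_total/N)), and V = √(χ² / (N · min(rows − 1, cols − 1))); the function names and the example counts are hypothetical, chosen only for illustration.

```python
import math


def _margins(table):
    """Row totals, column totals, and grand total of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


def standardized_residuals(table):
    """Adjusted residuals: (O - E) divided by an SE that uses both margins."""
    row_totals, col_totals, n = _margins(table)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((observed - e) / se)
        result.append(out_row)
    return result


def cramers_v(table):
    """Cramer's V from the uncorrected Pearson chi-square statistic."""
    row_totals, col_totals, n = _margins(table)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (observed - e) ** 2 / e
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical 2x2 table of counts (illustrative numbers only).
table = [[10, 20], [30, 40]]
residuals = standardized_residuals(table)
v = cramers_v(table)

# Cells whose residual exceeds 1.96 in absolute value are the ones to inspect.
flagged = [
    (i, j)
    for i, row in enumerate(residuals)
    for j, r in enumerate(row)
    if abs(r) > 1.96
]
print(residuals, v, flagged)
```

In a 2×2 table every adjusted residual has the same magnitude, √χ², so the residuals add little there; they become informative in larger tables, where only some cells cross the 1.96 threshold.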
Example: vaccinated vs unvaccinated, infected vs not. Are vaccination and infection independent?
Anatomy of the chi-square test
Critical value. Reject H0 at level α when χ² exceeds qchisq(1 − α, df).
Residuals. Pearson residuals (O − E)/√E are easy but biased. The standardised version (matches R’s chisq.test()$stdres) divides by an SE that accounts for the marginal totals; under H0 each cell is approximately N(0, 1). Cells with |r| > 1.96 drive the result.
Continuity correction. Yates’ correction is off by default (equivalent to chisq.test(x, correct = FALSE)); switch it on for small 2×2 tables, or use Fisher’s exact test directly.
Caveats: When this is the wrong tool
| If you have… | Use instead |
| --- | --- |
| Any expected count < 5 (especially in 2×2) | Switch to Fisher’s exact test - the chi-square approximation degrades when expected cells are sparse. The handoff button above does it in one click. |
| Paired binary outcomes (before/after on the same subject) | McNemar’s test - chi-square treats rows as independent, which paired data are not. Use mcnemar.test(). |
| Ordinal categories (low / med / high) where order matters | The Cochran-Armitage trend test or a polychoric correlation. Chi-square ignores ordering and wastes power. |
| Stratified 2×2 tables (e.g. across hospitals) | Cochran-Mantel-Haenszel test - pooling strata via plain chi-square risks Simpson’s paradox. |
| Continuous variables binned into categories | Don’t bin first - use a t-test, ANOVA, or correlation. Binning throws away information and chi-square’s p-value depends on the cut points. |
| 3+ way (multidimensional) tables | Log-linear models or hierarchical contingency analysis. Chi-square only handles two classifications cleanly. |
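The first caveat above (sparse expected counts) can be checked mechanically. The sketch below, in plain Python, computes the smallest expected count of a table and, for a 2×2 table, a two-sided Fisher's exact p-value by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. The function names and the tiny example table are hypothetical, and this is one common definition of the two-sided p-value, not necessarily what the handoff button runs.

```python
from math import comb


def min_expected_count(table):
    """Smallest expected count under independence: row_total * col_total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def pmf(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all four margins are held fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = pmf(a)
    # Small relative tolerance so floating-point ties with p_obs are counted.
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7))


# Hypothetical small 2x2 table (illustrative numbers only).
table = [[3, 1], [1, 3]]
smallest = min_expected_count(table)
if smallest < 5:
    p_exact = fisher_exact_2x2(table[0][0], table[0][1], table[1][0], table[1][1])
    print(f"min expected = {smallest}, Fisher exact p = {p_exact:.4f}")
```

The exact test conditions on the margins and needs no large-sample approximation, which is why it is the usual fallback when expected counts are small.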
- Chi-square test of independence in R - the full tutorial with R code and interpretation.
- Chi-square goodness-of-fit test in R - testing observed counts against a hypothesised distribution.
- Fisher’s exact test in R - the small-sample alternative when expected counts < 5.
- Categorical data in R - encoding, factors, and contingency tables.
- Confidence interval calculator - for inference on a single proportion or difference of proportions.
Numerical accuracy: χ², df, and p match R’s chisq.test() to 4 decimals; standardised residuals match chisq.test()$stdres; Cramer’s V matches rcompanion::cramerV().