Chi-Square Test Calculator
A chi-square test asks whether two categorical variables are related (like sex and smoking status), or whether observed counts match what you would expect by chance. Paste a 2D table or a row of observed counts to get chi-square, p-value, Cramer's V, standardized residuals, and a mosaic plot.
New to chi-square? The 4-minute primer:
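What it is. The chi-square test asks one question of a table of counts: are these counts farther from what we’d expect under independence (or under a hypothesised distribution) than chance can comfortably explain? If yes, that is evidence against the null hypothesis: in an independence test, the row variable and the column variable appear to be associated; in a goodness-of-fit test, the counts depart from the hypothesised distribution. Either way, some cells deviate from expectation by more than noise.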
How to read it. Three flavours, one statistic. Independence takes a two-way table (e.g. treatment × outcome) and tests whether the two classifications are linked. Goodness-of-fit takes one row of counts and tests whether they match a hypothesised distribution (uniform dice, expected proportions). Homogeneity tests whether several samples share the same distribution; the math is identical to independence.
The recipe. For each cell compute the expected count under the null: E = row_total × col_total / N for independence, or E = N × p_hyp for goodness-of-fit. Sum the squared, scaled deviations: χ² = Σ (O − E)² / E. Compare against a chi-square distribution with df = (rows−1)(cols−1) for independence or df = k−1 for goodness-of-fit.
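The recipe above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the calculator's own code: the function names (`chi2_sf`, `independence_test`, `goodness_of_fit`) and the example counts are made up for this example, and the p-value helper assumes an integer df of at least 1.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from a closed form (df = 2 or df = 1), then climb with the
    # incomplete-gamma recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def independence_test(table):
    """Chi-square test of independence for a two-way table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E = row_total * col_total / N for every cell
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, chi2_sf(chi2, df), expected

def goodness_of_fit(observed, probs):
    """Chi-square goodness-of-fit test against hypothesised proportions."""
    n = sum(observed)
    expected = [n * p for p in probs]  # E = N * p_hyp
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df, chi2_sf(chi2, df), expected

# Made-up counts, for illustration only.
chi2, df, p, expected = independence_test([[10, 20], [30, 40]])
print(round(chi2, 3), df, round(p, 3))  # chi2 is about 0.794 with df = 1

g_chi2, g_df, g_p, _ = goodness_of_fit([8, 12, 10, 9, 11, 10], [1 / 6] * 6)
print(round(g_chi2, 3), g_df, round(g_p, 3))  # chi2 is about 1.0 with df = 5
```

No continuity correction is applied here, so the independence result should be compared with R's `chisq.test(x, correct = FALSE)` rather than the default call.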
Beyond the p-value. A small p tells you something is off; standardised residuals tell you where. Under the null each standardised residual is approximately N(0, 1), so cells with |r| > 1.96 deviate at roughly the 5% level and are the ones driving the result. Cramer’s V (independence) and Cohen’s w (goodness-of-fit) put a magnitude on the association so you can say not just “significant” but “weak / moderate / strong”.
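As a rough illustration of these follow-up quantities, the Python sketch below computes standardised residuals, Cramer's V, and Cohen's w. It assumes the usual textbook definitions: residual = (O − E) / √(E (1 − row_total/N)(1 − col_total/N)), V = √(χ² / (N · min(rows−1, cols−1))), and w = √(χ² / N). The function names and counts are hypothetical, chosen only for this example.

```python
import math

def _totals(table):
    """Row totals, column totals, and grand total of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)

def standardized_residuals(table):
    """(O - E) divided by an SE that accounts for the marginal totals."""
    row_totals, col_totals, n = _totals(table)
    residuals = []
    for row, r in zip(table, row_totals):
        out = []
        for o, c in zip(row, col_totals):
            e = r * c / n
            se = math.sqrt(e * (1 - r / n) * (1 - c / n))
            out.append((o - e) / se)
        residuals.append(out)
    return residuals

def cramers_v(table):
    """Effect size for an independence test, between 0 and 1."""
    row_totals, col_totals, n = _totals(table)
    chi2 = sum((o - r * c / n) ** 2 / (r * c / n)
               for row, r in zip(table, row_totals)
               for o, c in zip(row, col_totals))
    k = min(len(row_totals) - 1, len(col_totals) - 1)
    return math.sqrt(chi2 / (n * k))

def cohens_w(observed, probs):
    """Effect size for a goodness-of-fit test."""
    n = sum(observed)
    chi2 = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))
    return math.sqrt(chi2 / n)

# Made-up counts, for illustration only.
table = [[10, 20], [30, 40]]
for row in standardized_residuals(table):
    print([round(r, 3) for r in row])
print(round(cramers_v(table), 4))
print(round(cohens_w([8, 12, 10, 9, 11, 10], [1 / 6] * 6), 4))
```

In a 2×2 table all four standardised residuals share the same absolute value, √χ², so they carry no more information than the test itself; they become useful in larger tables, where they point at the specific cells behind a significant result.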
Example: vaccinated vs unvaccinated, infected vs not. Are vaccination and infection independent?
The calculator checks whether your observed counts deviate from what independence (or your expected proportions) would predict.
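Anatomy of the chi-square test
Critical value. At significance level α, the critical value is qchisq(1 − α, df).
Residuals. Pearson residuals (O − E)/√E are easy to compute but biased: under H0 their variance is below 1. The standardised version (matches R’s chisq.test()$stdres) divides by an SE that accounts for the marginal totals; under H0 each cell is approximately N(0, 1). Cells with |r| > 1.96 drive the result.
Continuity correction. The uncorrected statistic corresponds to chisq.test(x, correct = FALSE); switch the correction on for small 2×2 tables, or use Fisher’s exact test directly.
Caveats: when this is the wrong tool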
If you have… → use instead:
- Any expected count < 5 (especially in 2×2) → Fisher’s exact test. The chi-square approximation degrades when expected cells are sparse. The handoff button above does it in one click.
- Paired binary outcomes (before/after on the same subject) → McNemar’s test. Chi-square treats observations as independent, which paired data are not. Use mcnemar.test().
- Ordinal categories (low / med / high) where order matters → the Cochran-Armitage trend test or a polychoric correlation. Chi-square ignores ordering and wastes power.
- Stratified 2×2 tables (e.g. across hospitals) → the Cochran-Mantel-Haenszel test. Pooling strata via plain chi-square risks Simpson’s paradox.
- Continuous variables binned into categories → don’t bin first; use a t-test, ANOVA, or correlation. Binning throws away information, and chi-square’s p-value depends on the cut points.
- 3+ way (multidimensional) tables → log-linear models or hierarchical contingency analysis. Chi-square only handles two classifications cleanly.
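To make the first caveat concrete, here is a small Python sketch that flags sparse expected counts and, for a 2×2 table, computes a two-sided Fisher's exact p-value from the hypergeometric distribution. It is a sketch under assumptions rather than the calculator's implementation: the function names and the example counts are hypothetical, and the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is one common convention.

```python
import math

def min_expected_count(table):
    """Smallest expected cell count under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    denom = math.comb(n, c1)

    def prob(k):
        # Probability that the top-left cell equals k, with all margins fixed.
        return math.comb(r1, k) * math.comb(r2, c1 - k) / denom

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Small relative tolerance so floating-point ties count as "equally likely".
    total = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# Made-up counts, for illustration only.
table = [[3, 1], [1, 3]]
if min_expected_count(table) < 5:
    print("sparse table, exact p =", round(fisher_exact_2x2(table), 4))
```

Because the exact test enumerates every table with the same margins, it needs no large-sample approximation, which is why it is the usual fallback when expected counts are small.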
Related tutorials and tools
- Chi-square test of independence in R - the full tutorial with R code and interpretation.
- Chi-square goodness-of-fit test in R - testing observed counts against a hypothesised distribution.
- Fisher’s exact test in R - the small-sample alternative when expected counts < 5.
- Categorical data in R - encoding, factors, and contingency tables.
- Confidence interval calculator - for inference on a single proportion or difference of proportions.
Numerical accuracy: χ², df, and p match R’s chisq.test() to 4 decimals; standardised residuals match chisq.test()$stdres; Cramer’s V matches rcompanion::cramerV().