r-statistics.co

Chi-Square Test Calculator

A chi-square test asks whether two categorical variables are related (like sex and smoking status), or whether observed counts match what you would expect by chance. Paste a 2D table or a row of observed counts to get chi-square, p-value, Cramer's V, standardized residuals, and a mosaic plot.

New to chi-square? Read the 4-min primer

What it is. The chi-square test asks one question of a table of counts: are these counts farther from what we’d expect under independence (or under a hypothesised distribution) than chance can comfortably explain? If yes, the data are evidence against the null: the row and column variables look associated (or the counts do not fit the hypothesised distribution), and some cells carry a real signal.

How to read it. Three flavours, one statistic. Independence takes a two-way table (e.g. treatment × outcome) and tests whether the two classifications are linked. Goodness-of-fit takes one row of counts and tests whether they match a hypothesised distribution (uniform dice, expected proportions). Homogeneity tests whether several samples share the same distribution; the math is identical to independence.

The recipe. For each cell compute the expected count under the null: E = row_total × col_total / N for independence, or E = N × p_hyp for goodness-of-fit. Sum the squared, scaled deviations: χ² = Σ (O − E)² / E. Compare against a chi-square distribution with df = (rows−1)(cols−1) for independence or df = k−1 for goodness-of-fit.
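The goodness-of-fit branch of the recipe can be sketched in a few lines of base R. The die-roll counts below are hypothetical, chosen only so the arithmetic is easy to follow; the variable names are illustrative and not taken from the calculator.

```r
# Hypothetical counts from 60 rolls of a six-sided die (illustrative only).
observed <- c(8, 12, 9, 11, 6, 14)
p_hyp    <- rep(1 / 6, 6)            # null: every face is equally likely

N        <- sum(observed)
expected <- N * p_hyp                # E = N x p_hyp
chi_sq   <- sum((observed - expected)^2 / expected)
df       <- length(observed) - 1     # df = k - 1
p_value  <- pchisq(chi_sq, df, lower.tail = FALSE)

# Cross-check the hand computation against base R.
fit <- chisq.test(observed, p = p_hyp)
c(by_hand = chi_sq, base_r = unname(fit$statistic))
```

The hand-computed statistic and the one from chisq.test() should agree; if they do not, the usual culprit is a p_hyp vector that does not sum to 1.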

Beyond the p-value. A small p tells you something is off; standardised residuals tell you where. Cells with |r| > 1.96 (the two-sided 5% cutoff for a standard normal) are the ones driving the result. Cramer’s V (independence) and Cohen’s w (goodness-of-fit) put a magnitude on the association so you can say not just “significant” but “weak / moderate / strong”.

3 modes · one tool · Independence · Goodness-of-fit · Homogeneity · Runs in your browser

Load a real-world example to try.

🧪 2×2 vaccine

Vaccinated vs unvaccinated, infected vs not. Are vaccination and infection independent?

Mosaic plot: cell area ∝ observed count; tint ∝ standardised residual.

Anatomy of the chi-square test
Expected counts (independence): E[i,j] = row_total[i] × col_total[j] / N
Step 1: build the null table. Under independence, the joint probability factors into row- and column-marginals. Multiply marginal probabilities, scale by the grand total, and you get the count you’d expect to see in each cell if the variables were unrelated.
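As a rough illustration of Step 1, the expected table is the outer product of the row and column totals divided by N. The 2×2 counts below are hypothetical (they are not the calculator's vaccine example), and the object names are assumptions made for this sketch.

```r
# Hypothetical 2x2 table (illustrative counts, not real data).
tab <- matrix(c(10, 40,
                30, 20),
              nrow = 2, byrow = TRUE,
              dimnames = list(group   = c("vaccinated", "unvaccinated"),
                              outcome = c("infected", "not infected")))

N <- sum(tab)

# E[i,j] = row_total[i] * col_total[j] / N
expected <- outer(rowSums(tab), colSums(tab)) / N
expected

# chisq.test() stores the same table in its $expected component.
base_r_expected <- chisq.test(tab)$expected
```

Because both row totals are 50 here, every row of the expected table is identical: under independence the infection rate would be the same in both groups.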
Pearson chi-square: χ² = Σ_{i,j} (O[i,j] − E[i,j])² / E[i,j], with df = (rows − 1) × (cols − 1)
Step 2: pool the deviations. Each cell contributes a squared, scaled deviation. Cells with a high expected count tolerate larger absolute deviations; small expected counts make a small absolute deviation contribute a lot. The sum is approximately χ² under H0; compare to qchisq(1 − α, df).
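Step 2 might look like this in base R: sum the scaled squared deviations, then compare against the critical value from qchisq() or, equivalently, convert to a p-value with pchisq(). The counts are hypothetical and alpha = 0.05 is an assumed significance level.

```r
# Hypothetical 2x2 table (illustrative counts only).
tab <- matrix(c(10, 40,
                30, 20),
              nrow = 2, byrow = TRUE)

expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

chi_sq <- sum((tab - expected)^2 / expected)   # Pearson statistic
df     <- (nrow(tab) - 1) * (ncol(tab) - 1)

alpha    <- 0.05                               # assumed significance level
critical <- qchisq(1 - alpha, df)              # reject H0 if chi_sq exceeds this
p_value  <- pchisq(chi_sq, df, lower.tail = FALSE)
reject   <- chi_sq > critical

# Cross-check; correct = FALSE turns off Yates' correction for 2x2 tables.
fit <- chisq.test(tab, correct = FALSE)
```

Comparing chi_sq to the critical value and comparing p_value to alpha are the same decision rule expressed two ways.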
Standardised residual: r[i,j] = (O[i,j] − E[i,j]) / sqrt(E[i,j] · (1 − row_total[i]/N) · (1 − col_total[j]/N))
Step 3: read the cells. Pearson residuals (O − E)/√E are easy to compute, but under H0 their variance is below 1, so they understate how unusual a cell is. The standardised version (matches R’s chisq.test()$stdres) divides by an SE that accounts for the marginal totals; under H0 each cell is approximately N(0, 1). Cells with |r| > 1.96 drive the result.
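A small sketch of Step 3, using a hypothetical 2×3 table so that some cells are flagged and others are not (in a 2×2 table every standardised residual has the same magnitude). The names row_prop, col_prop and flagged are illustrative.

```r
# Hypothetical 2x3 table: two treatments by three outcome levels.
tab <- matrix(c(30, 14,  6,
                10, 16, 24),
              nrow = 2, byrow = TRUE)

N        <- sum(tab)
row_prop <- rowSums(tab) / N         # row_total[i] / N
col_prop <- colSums(tab) / N         # col_total[j] / N
expected <- outer(rowSums(tab), colSums(tab)) / N

# Pearson residuals: simple, but with variance below 1 under H0.
pearson_res <- (tab - expected) / sqrt(expected)

# Standardised residuals: the SE also accounts for the marginal totals.
std_res <- (tab - expected) /
  sqrt(expected * outer(1 - row_prop, 1 - col_prop))

flagged <- abs(std_res) > 1.96       # cells that drive the result

# Cross-check against base R.
fit <- chisq.test(tab)
```

In this made-up table the first and third columns are flagged while the middle column is not, which is the kind of "where" information the overall p-value cannot give.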
Effect size: Cramer’s V = sqrt(χ² / (N · min(rows−1, cols−1))); Cohen’s w = sqrt(χ² / N)
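Step 4: size the effect. At large N even a trivial association produces a significant chi-square; effect size is what tells you whether the association is practically meaningful. Cramer’s V (independence) ranges from 0 to 1. Cohen’s w (goodness-of-fit) starts at 0 but has no fixed upper bound and can exceed 1 when the hypothesised proportions are very unequal. Conventional benchmarks for w: 0.1 small, 0.3 medium, 0.5 large; they carry over to V directly only when min(rows−1, cols−1) = 1.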
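Both effect sizes follow directly from the chi-square statistic, so a base-R sketch needs no extra packages. The table and the die-roll counts are hypothetical, and cramers_v and cohens_w are illustrative names.

```r
# Independence: hypothetical 2x2 table.
tab <- matrix(c(10, 40,
                30, 20),
              nrow = 2, byrow = TRUE)
fit    <- chisq.test(tab, correct = FALSE)
chi_sq <- unname(fit$statistic)

# Cramer's V = sqrt(chi_sq / (N * min(rows - 1, cols - 1)))
cramers_v <- sqrt(chi_sq / (sum(tab) * min(nrow(tab) - 1, ncol(tab) - 1)))

# Goodness-of-fit: hypothetical counts from 60 die rolls.
observed <- c(8, 12, 9, 11, 6, 14)
gof      <- chisq.test(observed, p = rep(1 / 6, 6))

# Cohen's w = sqrt(chi_sq / N)
cohens_w <- sqrt(unname(gof$statistic) / sum(observed))

round(c(cramers_v = cramers_v, cohens_w = cohens_w), 3)
```

Note that V is computed from the uncorrected statistic here; using a Yates-corrected chi-square would give a slightly smaller value.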
Yates’ correction (2×2 only): χ²_Y = Σ (|O − E| − 0.5)² / E
Step 5: optional correction. The chi-square approximation can be poor for 2×2 tables with small expected counts. Yates’ continuity correction subtracts 0.5 from each absolute deviation before squaring; it tends to be conservative. The calculator’s default is off, which corresponds to chisq.test(x, correct = FALSE) in R (R’s own default for 2×2 tables is correct = TRUE). Switch it on for small 2×2 tables, or use Fisher’s exact test directly.
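To see what the correction does numerically, the sketch below computes both versions by hand on a hypothetical 2×2 table and checks them against chisq.test(). One caveat: R caps the correction so it never exceeds |O − E|; the hand formula here omits that cap, which is harmless for this table because every deviation is larger than 0.5.

```r
# Hypothetical 2x2 table (illustrative counts only).
tab <- matrix(c(10, 40,
                30, 20),
              nrow = 2, byrow = TRUE)

expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

chi_sq_plain <- sum((tab - expected)^2 / expected)
chi_sq_yates <- sum((abs(tab - expected) - 0.5)^2 / expected)

# Base R: correct = TRUE applies Yates' correction to 2x2 tables.
corrected   <- chisq.test(tab, correct = TRUE)
uncorrected <- chisq.test(tab, correct = FALSE)

c(plain = chi_sq_plain, yates = chi_sq_yates)
```

The corrected statistic is always the smaller of the two, which is why the correction is described as conservative.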
Caveats: when this is the wrong tool
If you have… | Use instead
Any expected count < 5 (especially in 2×2) | Switch to Fisher’s exact test: the chi-square approximation degrades when expected cells are sparse. The handoff button above does it in one click.
Paired binary outcomes (before/after on the same subject) | McNemar’s test: chi-square treats observations as independent, which paired data are not. Use mcnemar.test().
Ordinal categories (low / med / high) where order matters | The Cochran-Armitage trend test or a polychoric correlation. Chi-square ignores ordering and wastes power.
Stratified 2×2 tables (e.g. across hospitals) | Cochran-Mantel-Haenszel test: pooling strata via plain chi-square risks Simpson’s paradox.
Continuous variables binned into categories | Don’t bin first: use a t-test, ANOVA, or correlation. Binning throws away information and chi-square’s p-value depends on the cut points.
3+ way (multidimensional) tables | Log-linear models or hierarchical contingency analysis. Chi-square only handles two classifications cleanly.
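Three of the alternatives in the table ship with base R's stats package. The sketch below shows one plausible call for each; all counts are hypothetical and the object names are illustrative.

```r
# Sparse 2x2 table: Fisher's exact test instead of chi-square.
sparse     <- matrix(c(3, 1,
                       1, 3),
                     nrow = 2, byrow = TRUE)
fisher_fit <- fisher.test(sparse)

# Paired binary outcomes: rows = before, columns = after.
# Only the off-diagonal (discordant) cells inform McNemar's test.
paired      <- matrix(c(20,  5,
                        15, 10),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(before = c("yes", "no"),
                                      after  = c("yes", "no")))
mcnemar_fit <- mcnemar.test(paired)

# Stratified 2x2 tables: a 2 x 2 x K array, one slice per stratum.
strata  <- array(c(10, 5, 6, 12,     # stratum 1, filled column by column
                    8, 4, 5,  9),    # stratum 2
                 dim = c(2, 2, 2))
cmh_fit <- mantelhaen.test(strata)

c(fisher = fisher_fit$p.value,
  mcnemar = mcnemar_fit$p.value,
  cmh = cmh_fit$p.value)
```

mcnemar.test() applies a continuity correction by default, so its statistic is (|b − c| − 1)² / (b + c) for discordant counts b and c.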

Numerical accuracy: χ², df, and p match R’s chisq.test() to 4 decimals; standardised residuals match chisq.test()$stdres; Cramer’s V matches rcompanion::cramerV().