
Power Analysis

Statistical power is the probability your study will detect a real effect of a given size; too little power and you waste a study. Pick a design (t-test, ANOVA, proportions, correlation, chi-square), supply three of the four key inputs (effect, alpha, power, n), and the calculator solves for the fourth.

New to power analysis? Read the 4-min primer below.

What power is. Statistical power is the probability of detecting a real effect of a given size at a chosen alpha. Power = 0.80 means that if the effect you are positing actually exists, you have an 80% chance of getting a significant result, and a 20% chance of missing it (a Type II error). Pick the design first, then ask what power you can afford.

The four-way relationship. Sample size n, effect size, alpha, and power form a closed system: pin three, the fourth is forced. Bigger effects need fewer subjects. Tighter alpha (0.01 vs 0.05) costs sample size. Higher power (0.90 vs 0.80) costs sample size. Smaller effects cost the most: halving d roughly quadruples n.
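To see that cost in concrete numbers, here is a minimal sketch with the pwr package (a two-sample t at alpha 0.05 and 80% power; the d values are illustrative):

```r
# Halving d roughly quadruples n (two-sample t, alpha = 0.05, power = 0.80)
library(pwr)

pwr.t.test(d = 0.50, sig.level = 0.05, power = 0.80)$n   # ~63.8 per group
pwr.t.test(d = 0.25, sig.level = 0.05, power = 0.80)$n   # ~252.1 per group, roughly 4x
```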

How to read the result. If you solved for n, the number you get is the smallest sample that reaches the target power. If you solved for power, you get the probability of detecting the effect at the n you can collect. If you solved for the minimum detectable effect, the number is the smallest effect you have a chance of catching given your n, alpha, and target power.
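As a sketch of the three solve modes using pwr::pwr.t.test (n = 40 and d = 0.5 are illustrative values), leave out the quantity you want solved:

```r
library(pwr)

pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80)    # solve for n
pwr.t.test(n = 40, d = 0.5, sig.level = 0.05)          # solve for power at a fixed n
pwr.t.test(n = 40, sig.level = 0.05, power = 0.80)     # solve for the minimum detectable d
```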

Picking the design. Continuous outcome, two groups: two-sample t-test on Cohen's d. Continuous outcome, paired or single-arm: one-sample / paired t. Binary outcome, two arms: two-proportion test on Cohen's h. Multi-arm continuous: one-way ANOVA on Cohen's f. Bivariate Pearson correlation: correlation power on r. Goodness-of-fit or contingency: chi-square on Cohen's w with correct df.
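For reference, each design above has a counterpart in the pwr package. A sketch, with Cohen-style placeholder effect sizes rather than values from any particular study:

```r
library(pwr)

pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80)                   # two-sample t on Cohen's d
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "paired")  # paired / one-sample t
pwr.2p.test(h = ES.h(0.60, 0.45), sig.level = 0.05, power = 0.80)     # two proportions on Cohen's h
pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.80)       # one-way ANOVA on Cohen's f
pwr.r.test(r = 0.30, sig.level = 0.05, power = 0.80)                  # Pearson correlation on r
pwr.chisq.test(w = 0.30, df = 3, sig.level = 0.05, power = 0.80)      # chi-square on Cohen's w
```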

8 designs · 3 solve modes · Cohen-style benchmarks · Runs in your browser

Try a real-world example: click one to load it.

📝 two-sample t (d=0.5)

A typical two-arm trial: continuous outcome, medium effect, 80% power, alpha 0.05.

[Example output: result panel, runnable R code to reproduce, interactive power curve, and inference summary]
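For this example, the reproduction in R would look roughly like the following sketch with the pwr package (the calculator's emitted code may differ in detail):

```r
# Two-arm trial: two-sample t, d = 0.5, alpha = 0.05, power = 0.80; solve for n
library(pwr)

pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
           type = "two.sample", alternative = "two.sided")
# n comes back as ~63.8 per group, i.e. 64 per arm after rounding up
```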

Anatomy of a power calculation
power = P( T > t_crit | H1 )
T ~ noncentral t( df, ncp )
ncp = d · sqrt( n1 · n2 / (n1 + n2) )
Noncentral t (one- and two-sample, paired, correlation). Under the alternative, the t-statistic follows a noncentral t with df from the design and a noncentrality parameter (ncp) that grows with sqrt(n) and the standardized effect. Power is the area of that noncentral distribution beyond the critical value of the central t. We use the Sankaran (1959) approximation, accurate to ~3 decimals for df ≥ 4 and faster than series methods.
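A minimal sketch of that calculation for a two-sided, two-sample test in base R (n = 64 per arm, d = 0.5, and alpha = 0.05 are illustrative; this uses R's built-in noncentral t rather than the approximation described above):

```r
n <- 64; d <- 0.5; alpha <- 0.05
df  <- 2 * n - 2
ncp <- d * sqrt(n * n / (n + n))             # = d * sqrt(n / 2) for equal arms
t_crit <- qt(1 - alpha / 2, df)
pt(t_crit, df, ncp, lower.tail = FALSE) +    # upper tail of the noncentral t
  pt(-t_crit, df, ncp)                       # lower-tail term, usually negligible
# ~0.80; compare pwr::pwr.t.test(n = 64, d = 0.5)
```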
power = P( F > F_crit | H1 )
F ~ noncentral F( df1, df2, ncp )
ncp_anova = f² · k · n   (one-way ANOVA)
ncp_reg = f² · (u + v + 1)   (regression)
Noncentral F (ANOVA, regression). The omnibus F under H1 is noncentral F. The Patnaik / Poisson-mixture series we use sums central F CDFs weighted by Poisson(ncp/2) probabilities; convergence is fast around the peak. df1 is the numerator (k-1 for ANOVA, u predictors of interest for regression), df2 is the denominator (k(n-1) or n-u-1).
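A sketch of the same calculation via R's built-in noncentral F (k = 4 groups, n = 20 per group, f = 0.25, and alpha = 0.05 are illustrative values):

```r
k <- 4; n <- 20; f <- 0.25; alpha <- 0.05
df1 <- k - 1                                  # numerator df
df2 <- k * (n - 1)                            # denominator df
ncp <- f^2 * k * n                            # noncentrality for one-way ANOVA
F_crit <- qf(1 - alpha, df1, df2)
1 - pf(F_crit, df1, df2, ncp)                 # compare pwr::pwr.anova.test(k = 4, n = 20, f = 0.25)
```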
h = 2 · ( arcsin sqrt(p2) - arcsin sqrt(p1) )
ncp_z = |h| · sqrt( n / 2 )   (two-prop, equal n per arm)
ncp_z = |h| · sqrt( n )   (one-prop)
power = 1 - Φ( z_crit - ncp_z )
Normal approximation for proportions. Cohen's h is the variance-stabilising arcsine transform applied to each proportion, then differenced. After the transform, the test statistic is approximately normal with unit variance, so power becomes a tail probability of a shifted standard normal. Matches pwr.2p.test and pwr.p.test. Falls apart when expected counts are tiny; switch to an exact test there.
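A sketch of the arcsine route in base R (p1 = 0.60, p2 = 0.45, and n = 150 per arm are illustrative):

```r
p1 <- 0.60; p2 <- 0.45; n <- 150; alpha <- 0.05
h <- 2 * (asin(sqrt(p1)) - asin(sqrt(p2)))    # same value as pwr::ES.h(p1, p2)
z_crit <- qnorm(1 - alpha / 2)
pnorm(abs(h) * sqrt(n / 2) - z_crit)          # two-sided power, ignoring the tiny opposite tail
# compare pwr::pwr.2p.test(h = h, n = 150)
```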
z_r = 0.5 · ln( (1+r) / (1-r) )   (Fisher z)
SE = 1 / sqrt( n - 3 )
ncp via t = r · sqrt( df ) / sqrt( 1 - r² ),  df = n - 2
Correlation power. The Fisher z transform is the textbook route, but for the actual test (H0: rho = 0) the t-statistic r · sqrt(n-2) / sqrt(1-r²) is more accurate, and its noncentral-t distribution gives the closed-form power. We use that direct formulation; results match pwr.r.test to 3+ decimals.
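A sketch of the direct noncentral-t formulation (r = 0.30 and n = 85 are illustrative; treat the result as approximate and compare it against pwr::pwr.r.test):

```r
r <- 0.30; n <- 85; alpha <- 0.05
df  <- n - 2
ncp <- r * sqrt(df) / sqrt(1 - r^2)           # noncentrality from the t-statistic form
t_crit <- qt(1 - alpha / 2, df)
pt(t_crit, df, ncp, lower.tail = FALSE)       # compare pwr::pwr.r.test(n = 85, r = 0.30)
```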
w = sqrt( sum( (p_obs - p_exp)² / p_exp ) )
ncp_chi = w² · n
power = 1 - F_chi( x_crit | df, ncp_chi )
Chi-square noncentrality. Cohen's w is the population effect size for chi-square: a normalized RMS deviation between observed and expected cell probabilities. Multiplied by n it becomes the noncentrality parameter of a chi-square with the design's df (k-1 for goodness-of-fit, (r-1)(c-1) for contingency). Power is the upper tail of that noncentral chi-square beyond the critical value.
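A sketch for a four-cell goodness-of-fit test in base R (the cell probabilities and n = 200 are illustrative):

```r
p_exp <- rep(0.25, 4)                          # expected cell probabilities under H0
p_obs <- c(0.35, 0.25, 0.22, 0.18)             # posited probabilities under H1
w   <- sqrt(sum((p_obs - p_exp)^2 / p_exp))    # Cohen's w
n   <- 200; df <- length(p_exp) - 1; alpha <- 0.05
ncp <- w^2 * n
x_crit <- qchisq(1 - alpha, df)
1 - pchisq(x_crit, df, ncp)                    # compare pwr::pwr.chisq.test(w = w, N = 200, df = 3)
```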
Caveats: when this is the wrong tool

If you have clustered or hierarchical data (students in classes, patients in clinics): the independence assumption fails. Inflate your n by the design effect 1 + (m-1)·ρ, where m is the cluster size and ρ is the intraclass correlation, or run a simulation against the mixed model you actually plan to fit (see the sketch after this list).

If you have a time-to-event / survival outcome: the number of events drives power, not the number of patients. Use Schoenfeld's formula on the log hazard ratio, or simulate against a Cox model with realistic censoring; a dedicated survival power tool is on the roadmap.

If you need a Bayesian a priori sample size: frequentist power asks "what's the long-run rejection rate?" A Bayesian sample-size question (precision of a posterior, Bayes-factor design) is a different paradigm; use simulation against the prior and likelihood you actually plan to use.

If you are sizing a pilot study: pilots are sized for feasibility (recruitment rate, dropout, instrument validity), not for hypothesis power. Cohen's tables don't apply; aim for ~12 per arm or follow a feasibility-specific guide.

If you have a complex model with no closed form (mixed effects, GLMM, nested ANOVA): the closed forms here assume a single fixed effect with a known sampling distribution. For mixed models or non-trivial covariate structures, use simulation: generate datasets under H1, fit the model, count rejections.
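For the clustered-data case, a sketch of the design-effect inflation mentioned above (m = 25 per cluster and ρ = 0.05 are illustrative values, not recommendations):

```r
n_flat <- 64                         # n per arm from an individual-level power calculation
m <- 25; rho <- 0.05                 # cluster size and intraclass correlation (assumed)
deff <- 1 + (m - 1) * rho            # design effect
ceiling(n_flat * deff)               # inflated n per arm for the clustered design: 141
```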
Further reading

Numerical methods: noncentral t via Sankaran (1959); noncentral F and chi-square via Poisson-weighted CDF series; proportions via Cohen's h and the standard normal approximation. Verified against the R pwr package to ~3 decimal places.