What does a Bayes Factor of 10 mean?

A Bayes Factor (BF10) of 10 means the data are 10 times more likely under the alternative hypothesis than under the null. On the Jeffreys scale this sits at the threshold of "strong" evidence for the alternative. BF10 between 1 and 3 is anecdotal, 3 to 10 is moderate, 10 to 30 is strong, 30 to 100 is very strong, and above 100 is decisive. Values below 1 favor the null, and the same bands apply to 1/BF10. Interpret cautiously when priors are weakly justified.
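As a rough illustration of the labelling, the sketch below maps a BF10 to a verbal band. The function name is hypothetical. The 10 to 30 and 30 to 100 bands, and the choice to assign a boundary value to the higher band, are assumptions based on the commonly used version of the Jeffreys scale, not a description of what the calculator does internally.

```python
def evidence_label(bf10: float) -> str:
    """Map BF10 to a verbal evidence label (thresholds 1, 3, 10, 30, 100)."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence either way"
    # Below 1 the data favor H0; grade the strength on 1/BF10 instead.
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1.0 / bf10
    bands = [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]
    for upper, label in bands:
        if strength < upper:  # assumption: a boundary value goes to the higher band
            return f"{label} evidence for {favoured}"
    return f"decisive evidence for {favoured}"


print(evidence_label(10))   # strong evidence for H1
print(evidence_label(0.2))  # moderate evidence for H0
```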

How is a Bayes Factor different from a p-value?

A p-value tells you how surprising your data would be if the null were true; it cannot quantify evidence FOR the null. A Bayes Factor compares two models directly and can support either side, including "the data favor the null." Use BF when you need to claim no effect, not just fail to reject.

Which prior should I use for a Bayes Factor t-test?

A Cauchy prior on the standardized effect size with scale 1/sqrt(2) (about 0.707) is the default for the BayesFactor R package and reflects medium-sized effects. Use a wider scale if you expect large effects, narrower if you expect small ones. The calculator shows how BF10 changes as you vary the prior.

Bayes Factor Calculator

A Bayes factor measures how much your data favor one hypothesis over another, for example "the evidence is 5 times stronger for an effect than for no effect." Enter a test statistic or summary stats to get the BF, an evidence label (anecdotal to decisive), and the corresponding frequentist p-value side by side.

6 modes · summary stats or t-statistic · JZS Cauchy · stretched-beta · Beta-Binomial · Runs in your browser

[Interactive plot: sensitivity to prior. Bayes factor vs Cauchy scale r, log-scaled y-axis]

Inference

We computed how strongly your data favor the alternative hypothesis over the null, expressed on the Jeffreys evidence scale.

The Bayes factor math, end to end

BF₁₀ = ∫ p(t | g, ν) π(g) dg / p(t | g=0, ν)
π(g) = inverse-gamma(1/2, r²/2)

JZS for the t-test (Rouder et al. 2009). Effect size δ sits on a Cauchy(0, r) prior, equivalently δ | g ~ N(0, g) with g ~ inverse-gamma(1/2, r²/2). Conditional on g, integrating δ out of the noncentral t likelihood leaves a scaled central t density; the numerator marginalises that density over g, and the denominator is the central t density at the observed t (the g = 0 case). We integrate on a log-g grid by Simpson’s rule over a wide range; for the t-test the integrand is unimodal, so the rule is accurate to many digits.
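The integral described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the calculator's own code. The function name, the grid limits (-30 to 30 on log g) and the step count are assumptions. The integrand uses the standard JZS form, in which the conditional density of t given g is a central t scaled by sqrt(1 + nᵉ g), and normalising constants shared by numerator and denominator are dropped.

```python
import math


def jzs_bf10(t, n_eff, df, r=math.sqrt(2) / 2, lo=-30.0, hi=30.0, steps=2000):
    """JZS Bayes factor for a t statistic via Simpson's rule on u = log(g)."""

    def log_integrand(u):
        g = math.exp(u)
        # inverse-gamma(1/2, r^2/2) density: r / sqrt(2*pi) * g^(-3/2) * exp(-r^2 / (2g))
        log_prior = math.log(r) - 0.5 * math.log(2 * math.pi) - 1.5 * u - r * r / (2 * g)
        # density kernel of t given g: central t scaled by sqrt(1 + n_eff * g)
        scale = 1.0 + n_eff * g
        log_lik = -0.5 * math.log(scale) - 0.5 * (df + 1) * math.log1p(t * t / (scale * df))
        return log_prior + log_lik + u  # "+ u" is the Jacobian of dg = g du

    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(log_integrand(lo + i * h))
    numerator = total * h / 3.0
    denominator = (1.0 + t * t / df) ** (-0.5 * (df + 1))  # same kernel at g = 0
    return numerator / denominator


# One-sample example: n = 20, so n_eff = 20 and df = 19
print(jzs_bf10(t=2.5, n_eff=20, df=19))
```

A convenient check: at t = 0 the ratio reduces to exp(nᵉ r²/2) · erfc(r √(nᵉ/2)), which the numerical result should reproduce closely.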

nᵉ = n (one-sample), n₁ n₂ / (n₁+n₂) (two-sample)
ν = n−1 (one-sample), n₁+n₂−2 (two-sample)

Effective n and df. The marginal likelihood depends on n through nᵉ (effective sample size) and on the t distribution’s df ν. Welch is approximated by using the pooled df and effective n; for moderate-to-large n the BF is robust to the unequal-variance correction. For the paired design, treat it as a one-sample t on the differences with n equal to the number of pairs.
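The bookkeeping above is small enough to sketch directly. The helper name is hypothetical, and the two-sample branch uses the pooled df, matching the approximation described for Welch.

```python
def t_test_design(n1, n2=None):
    """Return (effective n, degrees of freedom) for a one- or two-sample t-test."""
    if n2 is None:
        # One-sample or paired design: n1 is the number of observations or pairs.
        if n1 < 2:
            raise ValueError("need at least 2 observations")
        return float(n1), n1 - 1
    if n1 < 1 or n2 < 1 or n1 + n2 < 3:
        raise ValueError("need at least 3 observations in total")
    # Two-sample design: effective n and pooled df.
    return n1 * n2 / (n1 + n2), n1 + n2 - 2


print(t_test_design(20))      # (20.0, 19)
print(t_test_design(10, 30))  # (7.5, 38)
```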

P(H₁ | data) = BF₁₀ / (1 + BF₁₀) with 50/50 priors

Posterior probability. If you start out 50/50 between H₀ and H₁, the posterior probability of H₁ is BF₁₀ / (1 + BF₁₀). A BF of 10 puts you at 91% on H₁; a BF of 3 only 75%. With unequal priors, multiply BF by the prior odds to get posterior odds, then convert to a probability.
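The conversion from a Bayes factor to a posterior probability is plain odds arithmetic. A minimal sketch, with a hypothetical function name:

```python
def posterior_prob_h1(bf10, prior_prob_h1=0.5):
    """Posterior probability of H1 from BF10 and a prior probability of H1."""
    if not 0.0 < prior_prob_h1 < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    prior_odds = prior_prob_h1 / (1.0 - prior_prob_h1)
    posterior_odds = bf10 * prior_odds  # Bayes' rule in odds form
    return posterior_odds / (1.0 + posterior_odds)


print(posterior_prob_h1(3))                       # 0.75
print(round(posterior_prob_h1(10), 3))            # 0.909
print(round(posterior_prob_h1(10, prior_prob_h1=0.2), 3))  # sceptical prior
```

With a sceptical prior of 0.2 on H₁, the same BF of 10 yields posterior odds of 2.5, or a probability of about 0.71.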

Correlation: stretched-beta prior on ρ
BF₁₀ = ∫ p(r | ρ) π(ρ; κ) dρ / p(r | ρ=0)

Correlation BF (Wagenmakers 2016). The stretched-beta prior on the population correlation ρ assigns mass on (−1, 1) controlled by κ (mapped from the Cauchy scale: a wider scale gives a flatter prior). The correlation likelihood uses the exact density via incomplete beta. We integrate on ρ with adaptive Simpson’s rule.

Two-proportion: independent Beta(1,1) priors on p₁, p₂
BF₁₀ = m₁(x₁,n₁) m₁(x₂,n₂) / m₀(x₁+x₂, n₁+n₂)

Two-proportion BF. Under H₁ each group has its own beta-binomial marginal likelihood; under H₀ they share a common rate, integrated against a single Beta(1,1) prior. The closed form uses log-Beta functions, so it is fast and exact. A flat prior is a defensible default; tighter priors can be motivated by domain knowledge.
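The closed form can be sketched with log-Beta functions built from `math.lgamma`. The function names are hypothetical. The sketch assumes the marginal likelihoods are written without binomial coefficients: the same product C(n₁, x₁) · C(n₂, x₂) multiplies both hypotheses and cancels in the ratio.

```python
import math


def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def two_proportion_bf10(x1, n1, x2, n2, a=1.0, b=1.0):
    """BF10 for two proportions: separate Beta(a, b) rates vs one shared rate."""
    prior = log_beta(a, b)
    # H1: each group has its own rate, integrated against its own Beta(a, b) prior.
    log_m1 = (log_beta(x1 + a, n1 - x1 + b) - prior) + (log_beta(x2 + a, n2 - x2 + b) - prior)
    # H0: one common rate, integrated against a single Beta(a, b) prior.
    log_m0 = log_beta(x1 + x2 + a, (n1 - x1) + (n2 - x2) + b) - prior
    return math.exp(log_m1 - log_m0)


# 30/100 successes in group 1 vs 45/100 in group 2
print(two_proportion_bf10(30, 100, 45, 100))
```

Working in logs keeps the computation stable for large counts, where the Beta functions themselves would underflow.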

ANOVA / regression (Liang 2008):
R² = F · df1 / (F · df1 + df2)
BF₁₀ = ∫ (1+g)^((N−p−1)/2) (1 + g(1−R²))^(−(N−1)/2) π(g) dg
π(g) = inverse-gamma(1/2, N r² / 2)

ANOVA and linear regression BF. The same JZS / Zellner-Siow g-prior covers both fixed-effects ANOVA and linear regression; the only difference is bookkeeping for p (number of model parameters). The Bayes factor compares the full model against the intercept-only null. We integrate the unimodal integrand on log(g) by Simpson’s rule, exactly as for the t-test. With r = 0.5 (the "medium" fixed-effects scale in BayesFactor::anovaBF) the result tracks BayesFactor::anovaBF and regressionBF closely. Note that regressionBF labels √2/4, not 0.5, as its "medium" scale, so match the numeric r rather than the label when comparing.
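The g-prior integral for ANOVA and regression follows the same pattern as the t-test. The sketch below is an illustration under stated assumptions, not the calculator's implementation. The function names and grid limits are hypothetical, p is taken to be df1, and N is recovered as df1 + df2 + 1, which assumes a fixed-effects model with an intercept.

```python
import math


def r_squared_from_f(f_stat, df1, df2):
    """Recover R^2 from an F statistic and its degrees of freedom."""
    return f_stat * df1 / (f_stat * df1 + df2)


def g_prior_bf10(f_stat, df1, df2, r=0.5, lo=-30.0, hi=40.0, steps=2800):
    """BF10 of the full model vs intercept-only, integrating over g on a log grid."""
    n_obs = df1 + df2 + 1  # assumption: N = p + residual df + 1 (intercept)
    p = df1
    r2 = r_squared_from_f(f_stat, df1, df2)
    beta = n_obs * r * r / 2.0  # inverse-gamma(1/2, N r^2 / 2) rate parameter

    def log_integrand(u):
        g = math.exp(u)
        log_prior = 0.5 * math.log(beta / math.pi) - 1.5 * u - beta / g
        log_bf_given_g = (0.5 * (n_obs - p - 1) * math.log1p(g)
                          - 0.5 * (n_obs - 1) * math.log1p(g * (1.0 - r2)))
        return log_prior + log_bf_given_g + u  # "+ u" is the Jacobian of dg = g du

    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(log_integrand(lo + i * h))
    return total * h / 3.0


# One predictor, N = 20: F(1, 18) = 6.0
print(g_prior_bf10(6.0, df1=1, df2=18))
```

For a single predictor and F = 0 the integral reduces to exp(β) · erfc(√β) with β = N r²/2, which gives a quick numerical check.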

Caveats: when this is the wrong tool

If you have…: Use instead
A complex hierarchical or mixed model: The JZS prior is built for fixed-effect designs. For mixed effects, the BayesFactor::lmBF() family handles random terms with explicit variance priors; for anything bespoke, fit in Stan or brms and compute BFs by bridge sampling.
Tiny n (under 5 per group): Numerical integration is fine but the prior is doing most of the work. Report the prior, run the sensitivity plot, and treat the BF as descriptive rather than decisive.
One-sided hypotheses: This tool uses two-sided priors. For directional H₁ (effect > 0), truncate the Cauchy at zero and renormalise it (doubling the density on the retained half) before integrating; we plan to ship this as a v2 toggle.
A planning question (sample size): BFs are post-hoc evidence summaries. For prospective design, run a Bayes factor design analysis (BFDA) by simulation, or fall back on classical power. The Power analysis tool handles the latter.
Strong informative priors from the literature: JZS is a default, not a substitute for the right prior. If you know the field-level effect-size distribution, encode it directly with a normal or t prior on δ and integrate accordingly.
Categorical / count outcomes beyond 2x2: Use a Bayesian contingency-table model or a Poisson Bayes factor; the JZS construction does not generalise to counts without extra structure.