Bayes Factor Calculator
A Bayes factor measures how much your data favor one hypothesis over another: a BF of 5, for example, says the data are 5 times more likely under an effect than under no effect. Enter a test statistic or summary statistics to get the BF, an evidence label (anecdotal to decisive), and the corresponding frequentist p-value side by side.
New to Bayes factors? Read the 4-min primer
What a Bayes factor is. The Bayes factor BF₁₀ is the ratio of the data’s marginal likelihood under the alternative H₁ to its marginal likelihood under the null H₀. Read it as “the data are BF₁₀ times more likely under H₁ than under H₀.” A BF of 10 says the evidence favours H₁ ten-to-one; a BF of 0.1 says the evidence favours H₀ ten-to-one. There is no “significance” threshold: the BF is the evidence, full stop.
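Written out, the definition above reads as follows. The symbols are chosen here for illustration: D is the data, δ the standardized effect size, and π(δ) the prior on δ under H₁, with a point null δ = 0 assumed.

```latex
\mathrm{BF}_{10}
  = \frac{p(D \mid H_1)}{p(D \mid H_0)}
  = \frac{\int p(D \mid \delta)\, \pi(\delta)\, d\delta}{p(D \mid \delta = 0)},
\qquad
\mathrm{BF}_{01} = \frac{1}{\mathrm{BF}_{10}}.
```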
How to read the number. Jeffreys and Wagenmakers give a rough ladder: above 100 is extreme, 30 to 100 very strong, 10 to 30 strong, 3 to 10 moderate, 1 to 3 anecdotal. Below 1, take the reciprocal and read the same labels as evidence in favour of the null: BF₁₀ = 0.05 is BF₀₁ = 20, strong evidence for H₀. The big swing happens between 3 and 10; between 1/3 and 3, treat the evidence as weak no matter which side it leans.
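The ladder can be sketched as a small lookup. This is an illustration only: the function name `evidence_label` is hypothetical, and the handling of values that fall exactly on a boundary (3, 10, 30, 100) is an assumption, since the ladder above does not say which side they belong to.

```python
def evidence_label(bf10: float) -> str:
    """Map BF10 to a label on the ladder described above."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    # Fold values below 1 onto the same ladder by taking the reciprocal.
    favoured = "H1" if bf10 >= 1 else "H0"
    strength = bf10 if bf10 >= 1 else 1.0 / bf10
    if strength == 1:
        return "no evidence either way"
    # (upper bound, label) pairs; boundary values go to the stronger label,
    # which is an assumption of this sketch.
    ladder = [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]
    for upper, label in ladder:
        if strength < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


for bf in (0.02, 0.5, 2.0, 5.0, 150.0):
    print(bf, "->", evidence_label(bf))
```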
Picking a prior. The Bayes factor needs a prior on the effect under H₁. The JZS Cauchy is the standard default with scale 0.707 (medium). A wider scale of 1.0 or 1.414 expects bigger effects and so penalises small ones more harshly under H₁; a narrower 0.5 expects smaller effects and rewards them. Always check sensitivity by varying the scale; the plot below does this for you.
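To see how the scale moves the result, here is a minimal sketch of a JZS Bayes factor for a one-sample t-test, evaluated at several scales. It is not this tool's code. It assumes the usual mixture form of the Cauchy prior (δ given g is Normal(0, g), and g is inverse-gamma with shape 1/2 and scale r²/2) and integrates over log g with a fixed-grid Simpson rule. The function name `jzs_bf10`, the integration limits, and the example data are all hypothetical.

```python
import math


def jzs_bf10(t: float, n: int, r: float = 0.707, steps: int = 4000) -> float:
    """BF10 for a one-sample t-test under a Cauchy(0, r) prior on effect size.

    Integrates over u = log(g) with composite Simpson's rule on a fixed grid.
    """
    nu = n - 1  # degrees of freedom

    def integrand(u: float) -> float:
        g = math.exp(u)
        shrink = 1.0 + n * g
        likelihood = shrink ** -0.5 * (1.0 + t * t / (shrink * nu)) ** (-(nu + 1) / 2)
        # Inverse-gamma density of g times the Jacobian dg = g du.
        prior = r / math.sqrt(2 * math.pi) * g ** -0.5 * math.exp(-r * r / (2 * g))
        return likelihood * prior

    lo, hi = -30.0, 30.0  # assumed limits; the integrand is negligible outside
    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    marginal_h1 = total * h / 3
    marginal_h0 = (1.0 + t * t / nu) ** (-(nu + 1) / 2)
    return marginal_h1 / marginal_h0


t_obs, n_obs = 2.5, 30  # hypothetical data
for r in (0.5, 0.707, 1.0, 1.414):
    print(f"r = {r:<5}  BF10 = {jzs_bf10(t_obs, n_obs, r):.2f}")
```

If the printed values straddle an evidence boundary, the conclusion depends on the prior and the scale should be reported alongside the BF.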
BF versus p-value. A p-value asks “how surprising would data this extreme or more be if H₀ were true?”. A Bayes factor asks “which hypothesis predicted the data better?”. P-values reject; Bayes factors quantify. They can disagree, especially at large n: a tiny effect may be highly significant yet have a Bayes factor near 1 or even below it, because H₁ spreads its prior over larger effects and so predicts the tiny effect no better than H₀ does.
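The disagreement at large n is easiest to see in a simpler model than the one this tool uses. The sketch below swaps the JZS Cauchy for a Normal(0, τ²) prior on the effect size with known variance, which gives a closed-form Bayes factor. The function names and the choice τ = 1 are assumptions made for illustration.

```python
import math


def normal_prior_bf10(z: float, n: int, tau: float = 1.0) -> float:
    """BF10 for a z-test with known variance and a Normal(0, tau^2) prior on delta.

    Under H0, z ~ Normal(0, 1); under H1, marginally z ~ Normal(0, 1 + n * tau^2).
    """
    w = n * tau * tau
    return (1.0 + w) ** -0.5 * math.exp(0.5 * z * z * w / (1.0 + w))


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


z = 2.5  # held fixed, so the p-value is the same at every n
for n in (10, 100, 10_000, 1_000_000):
    print(f"n = {n:>9,}  p = {two_sided_p(z):.4f}  BF10 = {normal_prior_bf10(z, n):.3f}")
```

Holding z fixed while n grows means the observed effect shrinks toward zero: the p-value stays the same, while BF₁₀ falls steadily and ends up favouring the null.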
Recap
We computed how strongly your data favor the alternative hypothesis over the null, expressed on the Jeffreys evidence scale.
Read more: The Bayes factor math, end to end
Caveats: When this is the wrong tool
- If you have…: Use instead
- A complex hierarchical or mixed model: The JZS prior is built for fixed-effect designs. For mixed effects, the BayesFactor::lmBF() family handles random terms with explicit variance priors; for anything bespoke, fit in Stan or brms and compute BFs by bridge sampling.
- Tiny n (under 5 per group): Numerical integration is fine but the prior is doing most of the work. Report the prior, run the sensitivity plot, and treat the BF as descriptive rather than decisive.
- One-sided hypotheses: This tool uses two-sided priors. For a directional H₁ (effect > 0), truncate the Cauchy at zero and double the integral over the positive half, so that the truncated prior still integrates to 1; we plan to ship this as a v2 toggle.
- A planning question (sample size): BFs are post-hoc evidence summaries. For prospective design, run a Bayes factor design analysis (BFDA) by simulation, or fall back on classical power. The Power analysis tool handles the latter.
- Strong informative priors from the literature: JZS is a default, not a substitute for the right prior. If you know the field-level effect-size distribution, encode it directly with a normal or t prior on δ and integrate accordingly.
- Categorical / count outcomes beyond 2x2: Use a Bayesian contingency-table model or a Poisson Bayes factor; the JZS construction does not generalise to counts without extra structure.
- Hypothesis testing in R – the frequentist counterpart, with the same inputs framed as t / df / p.
- t-test calculator – the frequentist sibling tool: paste the same numbers and compare verdicts.
- Confidence intervals in R – what a BF gives you that a CI does not, and vice versa.
- Effect size in R – the underlying d, r, OR scale that the JZS prior places mass on.
- Power analysis tool – the prospective companion: pick n before you collect, then summarise with a BF after.
Math: JZS Cauchy prior with adaptive Simpson on log-g; central-t CDF via regularised incomplete beta; correlation BF via stretched-beta on rho; two-proportion BF via beta-binomial marginal likelihoods; sensitivity plot rebuilds across r in [0.1, 2].
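As an illustration of the two-proportion case, the sketch below compares beta-binomial marginal likelihoods for H₁ (each group has its own rate) against H₀ (one shared rate). The Beta(1, 1) default priors and the function names are assumptions; the priors this tool actually uses are not stated here.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def two_proportion_bf10(k1: int, n1: int, k2: int, n2: int,
                        a: float = 1.0, b: float = 1.0) -> float:
    """BF10 for H1: independent rates versus H0: one shared rate.

    Every rate gets a Beta(a, b) prior. The binomial coefficients are the same
    under both hypotheses and cancel, leaving only Beta functions.
    """
    log_m1 = (log_beta(a + k1, b + n1 - k1) + log_beta(a + k2, b + n2 - k2)
              - 2 * log_beta(a, b))
    log_m0 = log_beta(a + k1 + k2, b + n1 + n2 - k1 - k2) - log_beta(a, b)
    return math.exp(log_m1 - log_m0)


# Hypothetical counts: successes out of trials in each group.
print(two_proportion_bf10(50, 100, 50, 100))  # identical observed rates
print(two_proportion_bf10(10, 100, 60, 100))  # very different observed rates
```

Working in logs via `math.lgamma` keeps the Beta functions from underflowing at large counts.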