Rr‑statistics.co

Bootstrap CI Calculator

Bootstrapping resamples your data thousands of times to build a confidence interval for any statistic, even ones with no closed-form formula (medians, ratios, custom functions). Paste your numbers, pick a statistic, and get percentile, basic, or BCa intervals along with reproducible boot::boot() R code.

New to bootstrapping? Read the 4-min primer

What the bootstrap does. Resample your data with replacement many times, recompute the statistic on each resample, and use the spread of those values as a stand-in for the sampling distribution. The bootstrap turns "I have one sample" into "here is what the statistic would have looked like across many samples", without assuming a parametric form.

Three ways to get the CI from the resamples. The percentile CI sorts the bootstrap distribution and takes the α/2 and 1−α/2 quantiles. The basic CI (a.k.a. reverse percentile) reflects the distribution about the original estimate. The BCa method adjusts for bias and skewness using a jackknife-derived acceleration; it is the best-practice default for moderately skewed statistics.

Bias and SE. Bias is the mean of the bootstrap distribution minus the original statistic; if it is large relative to the SE the parametric CI is suspect. The bootstrap SE is the SD of the bootstrap distribution and is a drop-in replacement for the analytic standard error.

When the bootstrap struggles. It struggles with very small n (under 10), heavy-tailed data whose variance is infinite, and statistics driven by a single extreme order statistic (like the maximum). In those cases, increase B, prefer BCa over percentile, and treat the CI as approximate.

6 worked examples · raw data only · percentile · basic · BCa · Runs in your browser

Pick a real-world example to load.

The bootstrap math, end to end
x* = sample(x, n, replace = TRUE)
θ* = stat(x*)
repeat B times
Resampling. Draw n observations from the original sample with replacement, compute the statistic on the resample, and store it. After B repetitions you have the bootstrap distribution θ*₁, θ*₂, …, θ*_B. The seeded Mulberry32 RNG makes results reproducible across reloads.
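The loop above, as a minimal Python sketch (the calculator itself runs in the browser with a seeded Mulberry32 RNG; `random.Random`, the function name, and the demo data here are illustrative stand-ins):

```python
import random
from statistics import median

def bootstrap_distribution(x, stat, B=2000, seed=42):
    """Draw B resamples of x with replacement; return stat of each."""
    rng = random.Random(seed)  # seeded, so results reproduce across runs
    n = len(x)
    return [stat([x[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]

data = [4.1, 5.0, 3.8, 6.2, 5.5, 4.9, 7.1, 5.3]
thetas = bootstrap_distribution(data, median)  # the bootstrap distribution
```

Rerunning with the same seed reproduces `thetas` exactly, mirroring the reproducibility claim above.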
CI_perc = (θ*_(α/2 · B), θ*_((1-α/2) · B))
Percentile. Sort the bootstrap distribution; read off the α/2 and 1−α/2 quantiles. Simple, but ignores any bias in the bootstrap distribution.
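A percentile-CI sketch in Python (the helper name is illustrative, and quantile index conventions differ slightly between implementations):

```python
def percentile_ci(thetas, alpha=0.05):
    """Percentile CI: the empirical alpha/2 and 1-alpha/2 quantiles
    of the bootstrap distribution."""
    s = sorted(thetas)
    B = len(s)
    return s[int(alpha / 2 * B)], s[int((1 - alpha / 2) * B) - 1]

# stand-in bootstrap distribution: 1000 evenly spread values
lo, hi = percentile_ci(list(range(1000)))  # roughly the 2.5% and 97.5% points
```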
CI_basic = (2θ̂ − θ*_((1-α/2) · B), 2θ̂ − θ*_(α/2 · B))
Basic (reverse percentile). Reflects the percentile bounds about the original estimate. Equivalent to assuming the bootstrap distribution of θ* − θ̂ mirrors the sampling distribution of θ̂ − θ.
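The reflection step in Python, assuming the same sorted-quantile convention (names are illustrative):

```python
def basic_ci(theta_hat, thetas, alpha=0.05):
    """Basic (reverse-percentile) CI: reflect the percentile bounds
    about the original point estimate theta_hat."""
    s = sorted(thetas)
    B = len(s)
    lo_p = s[int(alpha / 2 * B)]            # percentile lower bound
    hi_p = s[int((1 - alpha / 2) * B) - 1]  # percentile upper bound
    return 2 * theta_hat - hi_p, 2 * theta_hat - lo_p

lo, hi = basic_ci(500, list(range(1000)))  # symmetric toy case
```

In this symmetric toy case the basic and percentile bounds nearly coincide; they diverge when the bootstrap distribution is skewed relative to θ̂.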
z₀ = Φ⁻¹(#{θ* < θ̂} / B)
a = Σ(θ̄ − θ_(−i))³ / [6 (Σ(θ̄ − θ_(−i))²)^(3/2)], where θ̄ is the mean of the jackknife estimates θ_(−i)
α_lo = Φ(z₀ + (z₀ + z_(α/2)) / (1 − a(z₀ + z_(α/2))))
α_hi = Φ(z₀ + (z₀ + z_(1−α/2)) / (1 − a(z₀ + z_(1−α/2))))
BCa (Efron 1987). z₀ is the bias correction; it shifts the percentiles to account for the bootstrap mean differing from the point estimate. The acceleration a comes from the jackknife: leave one observation out, recompute the statistic, and look at the third-versus-second moment of those n leave-one-out estimates. BCa is second-order accurate and almost always preferable to percentile when the bootstrap distribution is skewed.
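A self-contained BCa sketch in Python, using the stdlib `statistics.NormalDist` for Φ and Φ⁻¹ (function names and demo data are illustrative, not the calculator's implementation):

```python
import random
from statistics import NormalDist, mean

def bca_ci(x, stat, thetas, alpha=0.05):
    """BCa interval: bias-corrected, accelerated percentile bounds."""
    nd = NormalDist()
    theta_hat = stat(x)
    B = len(thetas)
    # z0: bias correction from the share of resample stats below theta_hat
    z0 = nd.inv_cdf(sum(t < theta_hat for t in thetas) / B)
    # a: acceleration from the jackknife (leave-one-out) estimates
    jack = [stat(x[:i] + x[i + 1:]) for i in range(len(x))]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0
    def adj(z):  # shift-and-stretch the nominal quantile level
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
    s = sorted(thetas)
    a_lo = adj(nd.inv_cdf(alpha / 2))
    a_hi = adj(nd.inv_cdf(1 - alpha / 2))
    return s[int(a_lo * B)], s[min(B - 1, int(a_hi * B))]

data = [2.3, 1.9, 3.1, 2.8, 4.0, 2.2, 3.5, 2.9, 3.3, 2.6]
rng = random.Random(7)
n = len(data)
thetas = [mean([data[rng.randrange(n)] for _ in range(n)]) for _ in range(2000)]
lo, hi = bca_ci(data, mean, thetas)  # adjusted 95% bounds for the mean
```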
bias = mean(θ*) − θ̂
SE = sd(θ*)
Bias and SE. Bias is the mean of the bootstrap distribution minus the original statistic; for an unbiased estimator it should be near zero. The bootstrap SE is the SD of the bootstrap distribution and replaces the parametric standard error in non-Gaussian settings.
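Both quantities in Python (names and the toy distribution are illustrative):

```python
from statistics import mean, stdev

def bootstrap_bias_se(theta_hat, thetas):
    """Bias: mean of the bootstrap distribution minus the original statistic.
    SE: sample standard deviation of the bootstrap distribution."""
    return mean(thetas) - theta_hat, stdev(thetas)

bias, se = bootstrap_bias_se(2.5, [1.0, 2.0, 3.0])  # toy bootstrap distribution
```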
Caveats When the bootstrap is the wrong tool
If you have…
Use instead
n < 10 with no model
The bootstrap needs enough information in the sample to mimic the population. With n < 10 the resamples are too repetitive. Try a fully parametric model with a likelihood-based CI.
A statistic depending only on the maximum or minimum
Order-statistic-driven statistics have non-smooth sampling distributions; the bootstrap can be inconsistent. Use the parametric extreme-value asymptotics or subsampling.
Time-series or autocorrelated data
The independent-resample assumption is broken. Use a block bootstrap (moving / circular / stationary) sized to the autocorrelation range.
A regression coefficient
The bootstrap CI is fine, but for clean inference use the model-based CI from the lm output interpreter or its glm sibling. For non-Gaussian residuals, a residual or wild bootstrap is preferred.
A simple mean of mildly normal data
The parametric t-CI is exact under normality, faster, and gives the same answer to 2 decimals. Use the Confidence Interval Calculator.
Heavy-tailed data with infinite variance (Cauchy-like)
The bootstrap will systematically underestimate the spread. Use a truncated-mean estimator or shift to a robust scale (MAD, Qₙ).
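The moving-block idea from the table above, sketched in Python (the helper name and parameters are illustrative; in a real analysis block_len should be chosen from the autocorrelation range):

```python
import random

def moving_block_resample(x, block_len, seed=0):
    """One moving-block bootstrap resample: concatenate randomly chosen
    contiguous blocks of length block_len, then trim to the series length."""
    rng = random.Random(seed)
    n = len(x)
    out = []
    while len(out) < n:
        start = rng.randrange(n - block_len + 1)  # uniform block start
        out.extend(x[start:start + block_len])    # keeps short-range order
    return out[:n]

series = list(range(24))
resample = moving_block_resample(series, block_len=4)
```

Within each block the original time order (and hence the short-range autocorrelation) is preserved; only the block boundaries break dependence.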
Further reading

Math: seeded Mulberry32 RNG; percentile and basic CIs from the sorted resample distribution; BCa via the Efron (1987) bias correction plus jackknife acceleration, α-adjusted with Wichura AS 241 inverse normal.
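For reference, a Python port of Mulberry32's 32-bit arithmetic (a sketch; the calculator runs the JavaScript original, so this mirrors it via explicit masking):

```python
def mulberry32(seed):
    """Python port of the Mulberry32 PRNG; & 0xFFFFFFFF emulates
    JavaScript's 32-bit integer wraparound."""
    state = seed & 0xFFFFFFFF
    def rand():
        nonlocal state
        state = (state + 0x6D2B79F5) & 0xFFFFFFFF
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & 0xFFFFFFFF
        t = (t ^ (t + (((t ^ (t >> 7)) * (t | 61)) & 0xFFFFFFFF))) & 0xFFFFFFFF
        return (t ^ (t >> 14)) / 4294967296  # uniform in [0, 1)
    return rand

r = mulberry32(2024)
draws = [r() for _ in range(5)]  # same seed, same sequence on every run
```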