r-statistics.co

Reprex Builder

A reprex is the minimal, self-contained R code someone needs to reproduce your problem on their machine. It is the first thing people ask for on Stack Overflow. Paste your messy code; the builder hoists library() calls, flags undefined data references, formats output, and produces a clean snippet ready to share.

New to reprex? Read the 4-min primer

What a reprex is. A REProducible EXample is a tiny, self-contained snippet of R that anyone can paste into a fresh session and run end-to-end without setup. The point is to isolate the bug or question so a helper can focus on it instead of guessing what your data, packages, or working directory look like.

Why it matters when you ask for help. On Stack Overflow, GitHub issues, and R mailing lists, questions that lack a reprex are routinely closed or passed over. A good reprex turns a vague "this doesn't work" into a one-paste reproduction that someone halfway around the world can confirm in 30 seconds. You also tend to solve your own problem midway through writing one.

The four principles. A reprex should be minimal (cut everything not needed to trigger the issue), self-contained (no external files, no working-directory assumptions), reproducible (same output on every run; set a seed if anything is random), and clean (load the libraries you actually use, drop the ones you don't).

How this builder helps. Paste your code, toggle the options for session info, comment-style output, seed, and Markdown wrapping, then read the lints. The cleaned snippet on the right is what you copy into the question; the lint cards on the left explain what was changed and what still needs your attention.
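To make the comment-style output and Markdown wrapping options concrete, here is a minimal base-R sketch. The helpers `comment_output()` and `markdown_wrap()` are hypothetical names made up for illustration; they are not the builder's own code, which may work differently.

```r
# Hypothetical helper (not the builder's code): evaluate a snippet and return
# its source followed by the printed result, each output line prefixed "#> ".
comment_output <- function(code) {
  value <- eval(parse(text = code), envir = new.env(parent = globalenv()))
  c(code, paste0("#> ", capture.output(print(value))))
}

# Hypothetical helper: wrap lines in a Markdown code fence tagged as R.
markdown_wrap <- function(lines) {
  fence <- strrep("`", 3)  # built at run time so this example nests cleanly
  c(paste0(fence, "r"), lines, fence)
}

snippet <- comment_output("mean(c(1, 2, 6))")
writeLines(markdown_wrap(snippet))
```

The printed result is the source line, then `#> [1] 3`, enclosed in an R-tagged fence. Because the output lines are comments, a helper can paste the whole block back into a session and run it unchanged.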

Reprex builder · Lints · dput suggestions · Markdown wrap · Runs in your browser

Anatomy of a good reprex
Hoist library calls
    library(pkg1)
    library(pkg2)
    … (rest of code)
Libraries first. A reader needs to know which packages must be installed before they can paste your snippet. Move every library() and require() to the top, even if your real script loads them later. Drop any that aren't actually used by the failing line.
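One way to picture the hoisting step is the rough sketch below. It assumes each library() or require() call sits alone on its own line, and `hoist_libraries()` is a hypothetical name, not the builder's actual implementation. It does not check whether a package is really used, so pruning unused libraries is still up to you.

```r
# Hypothetical helper, not the builder's code: move lines that consist solely
# of a library()/require() call to the top and drop exact duplicates.
hoist_libraries <- function(lines) {
  is_lib <- grepl("^\\s*(library|require)\\s*\\([^()]*\\)\\s*$", lines)
  c(unique(trimws(lines[is_lib])), lines[!is_lib])
}

messy <- c(
  "df <- data.frame(x = 1:3)",
  "library(stats)",
  "fit <- lm(x ~ 1, data = df)",
  "library(utils)",
  "library(stats)"
)

writeLines(hoist_libraries(messy))
```

The two distinct library() lines come out first, followed by the remaining code in its original order.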
Inline the data
    df <- structure(list(x = c(…), y = c(…)), class = "data.frame", row.names = c(NA, -n))
Self-contained data. Replace any read.csv("…") call or reference to a saved object with the pasted output of dput(head(df, 6)). Six rows are usually enough to trigger anything that isn't a sample-size issue. For a sample-size issue, set the seed and simulate the data instead.
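As an illustration, the sketch below builds a stand-in data frame (the column names `id`, `score`, and `group` are invented for the example), prints the dput() text you would paste into a question, and checks that the text rebuilds the same object.

```r
# Stand-in for your real data; in practice this is whatever you read from a file.
df <- data.frame(
  id    = 1:8,
  score = c(1.5, 2, 3.25, 4, 5.5, 6, 7.75, 8),
  group = c("a", "b", "a", "b", "a", "b", "a", "b")
)

small <- head(df, 6)

# Print a constructor call for the first six rows; this text goes in the reprex.
dput(small)

# Sanity check: the printed text rebuilds an equivalent data frame.
rebuilt <- eval(parse(text = capture.output(dput(small))))
stopifnot(isTRUE(all.equal(rebuilt, small)))
```

In the question itself you would write `df <- ` followed by the pasted structure(...) text, so the helper never needs your file.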
Pin randomness
    set.seed(42)
    … (random call here)
Reproducible randomness. Anything that draws random numbers (sample, rnorm, runif, kmeans, randomForest, MCMC) needs a seed before it. Without one, the helper won't see the same numbers you do, and "the model gives a different result" becomes "the model works fine for me."
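A quick way to convince yourself the seed is doing its job is sketched here with rnorm(). The seed value 42 is arbitrary.

```r
set.seed(42)
first <- rnorm(3)

# Resetting the same seed replays exactly the same draws.
set.seed(42)
second <- rnorm(3)
identical(first, second)
#> [1] TRUE

# Without resetting, the generator simply continues, so the numbers differ.
third <- rnorm(3)
identical(first, third)
#> [1] FALSE
```

Place set.seed() immediately before the first random call in the reprex so the helper's session starts from the same generator state as yours.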
Show the failure
    fit <- lm(…)
    summary(fit)
    #> Error in …: object 'badname' not found
Include the error verbatim. Paste the full error message as a comment line right where it appears. The exact text is searchable; a paraphrase is not. If your code throws a warning rather than an error, capture that too. Don't strip the message because it looks ugly.
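If you want to grab the message text programmatically rather than copy it from the console, something along these lines works in base R. The data frame and the misspelled variable `badname` are made up for the example. Note that conditionMessage() returns only the message, without the "Error in …" call prefix the console shows.

```r
df <- data.frame(y = c(1, 2, 3), x = c(2, 4, 7))

# Capture the error message from a failing model call.
err_msg <- tryCatch(
  {
    lm(y ~ badname, data = df)
    NA_character_  # reached only if no error is thrown
  },
  error = function(e) conditionMessage(e)
)
writeLines(paste("#> Error:", err_msg))

# Warnings can be captured the same way.
warn_msg <- tryCatch(
  as.integer("x"),
  warning = function(w) conditionMessage(w)
)
writeLines(paste("#> Warning:", warn_msg))
```

For the reprex itself, pasting the console text verbatim is still the simplest route; this pattern is mainly useful when the message scrolls past in a long log.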
Optional: session info
    sessionInfo()
When to include it. Add sessionInfo() only when the bug is plausibly version-dependent (a recent package update, a platform-specific issue, or you're filing a GitHub issue against a package). Otherwise it's noise; the helper rarely needs to know your exact tibble patch version to answer a dplyr question.
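When the version does matter, a short sketch like this collects the usual details. The package queried here, stats, is only a placeholder for whichever package you suspect.

```r
# R version as a single line, handy for the question title or first sentence.
r_version <- R.version.string
print(r_version)

# Version of one specific package ("stats" is a placeholder).
pkg_version <- packageVersion("stats")
print(pkg_version)

# Full details: platform, locale, attached and loaded packages.
info <- sessionInfo()
print(info)
```

Put the output at the very end of the reprex so it does not push the failing code out of view.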
Caveats: when this is the wrong tool
If you have: A bug that needs your full pipeline to reproduce
Use instead: Strip the pipeline by hand. A reprex builder cannot guess which intermediate step is load-bearing. Bisect: cut half the steps; does it still fail? Repeat until you find the smallest snippet that fails.

If you have: A platform-specific or installation issue
Use instead: The reprex is the wrong shape. File a question with your sessionInfo(), the install command you ran, the exact error from the install log, and your OS / R / Rtools version. Code transformation won't help.

If you have: Sensitive or proprietary data
Use instead: Don't paste real data anywhere. Build a fake dataset that has the same structure (column names, types, value ranges) but synthetic content, then use that in the reprex. The bug usually depends on shape and types, not on the literal numbers.

If you have: A bug in interactive output (RStudio panes, Shiny widgets)
Use instead: A code reprex covers half of it. Add a screenshot or screen recording of the misbehaving UI, plus the exact click sequence. Shiny issues need a self-contained app.R with a fake dataset.

If you have: An issue that only appears at scale
Use instead: Set a seed and simulate the smallest dataset that still triggers the issue (e.g. "fails when n = 1e5"). If the issue is genuinely about runtime, profile with profvis and share the profile, not the data.
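For the sensitive-data and at-scale cases, a seeded synthetic dataset might look like the sketch below. Every column name, type, and range here is invented for illustration; swap in the structure of your real data, and raise `n` only as far as needed to trigger the issue.

```r
set.seed(42)
n <- 20  # increase (e.g. to 1e5) only if the bug needs scale

# Synthetic stand-in that mirrors column names, types, and value ranges,
# but contains no real records.
fake <- data.frame(
  customer_id = sprintf("C%03d", seq_len(n)),
  signup_date = as.Date("2024-01-01") + sample(0:364, n, replace = TRUE),
  spend       = round(runif(n, min = 0, max = 500), 2),
  segment     = factor(
    sample(c("basic", "plus", "pro"), n, replace = TRUE),
    levels = c("basic", "plus", "pro")
  ),
  stringsAsFactors = FALSE
)

str(fake)
```

Because the seed is fixed, everyone who runs the snippet gets the same fake rows, which keeps the reprex reproducible as well as safe to share.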
Implementation notes: the parser is a lightweight regex tokenizer (not a full R AST). It handles common cases (library calls, assignments, file reads, plot calls, dataset references) but won't catch every edge case. Always eyeball the output before pasting it into a question.
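To show why regex-based linting needs a human check, here is a toy lint in the same spirit. The function `flag_file_reads()` and its pattern are assumptions made up for this example, not the builder's actual rules.

```r
# Hypothetical lint, not the builder's code: flag lines that appear to read
# an external file, using a plain regular expression.
flag_file_reads <- function(lines) {
  pattern <- "\\b(read\\.csv|read\\.table|readRDS|load|read_csv)\\s*\\("
  hits <- grep(pattern, lines, perl = TRUE)
  data.frame(line = hits, code = trimws(lines[hits]), stringsAsFactors = FALSE)
}

code <- c(
  "library(stats)",
  "sales <- read.csv(\"sales.csv\")",
  "fit <- lm(revenue ~ month, data = sales)",
  "# old: read.csv(\"backup.csv\")"
)

flag_file_reads(code)
```

The lint flags line 2, which is a real file read, but also line 4, which is only a comment. A regex cannot tell the two apart the way a full parser could, which is exactly the kind of edge case worth eyeballing.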