R vs Stata: The Economist's Complete Guide to Choosing Your Tool
Stata has been the workhorse of empirical economics for decades. R is gaining ground with its free license, visualization power, and expanding econometrics ecosystem. This guide compares both tools through an economist's lens — with a focus on the methods you actually use.
If you're an economics PhD student choosing your primary tool, or a working economist considering whether to add R to your toolkit, this comparison covers everything that matters: econometric capabilities, workflow, cost, and what the job market rewards.
Cost and Access
| Factor | R | Stata |
|---|---|---|
| Student license | Free | $48/year (IC), $98/year (SE) |
| Perpetual license | Free | $295 (IC), $595 (SE), $1,195 (MP) |
| Annual license | Free | $125 (IC), $255 (SE), $545 (MP) |
| After graduation | Still free | Need new license if not perpetual |
| Coauthors can verify | Yes (always) | Only if they have Stata |
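Stata editions differ mainly in dataset limits and speed: IC (renamed BE in Stata 17) has the tightest limit on the number of variables, SE is the standard edition, and MP adds multiprocessor support. R has no edition tiers or feature restrictions.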
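Stata advantage: A perpetual SE license at $595 is reasonable for a single researcher. Core econometric methods ship with the base installation, although community-contributed commands such as reghdfe and esttab still need a one-time ssc install.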
R advantage: Free means no barriers. Every coauthor, reviewer, and student can run your code without purchasing anything.
Econometric Methods Comparison
This is what matters most. Here's how the two compare for the methods economists use daily.
| Method | Stata | R Package |
|---|---|---|
| OLS regression | `regress` | `lm()` |
| Robust standard errors | `vce(robust)` | `sandwich` + `lmtest` |
| Clustered SEs | `vce(cluster id)` | `sandwich::vcovCL()`, `fixest` |
| Fixed effects | `xtreg, fe` / `reghdfe` | `fixest::feols()`, `plm` |
| Instrumental variables | `ivregress 2sls` | `fixest::feols()`, `ivreg` |
| Difference-in-differences | `diff` / `did_multiplegt` | `did`, `fixest`, `did2s` |
| Regression discontinuity | `rdrobust` | `rdrobust`, `rddensity` |
| Panel data | `xtreg`, `xtabond` | `plm`, `fixest` |
| Quantile regression | `qreg` | `quantreg` |
| Survival/duration models | `stcox`, `streg` | `survival` |
| Synthetic control | `synth` | `Synth`, `tidysynth`, `augsynth` |
| Propensity score matching | `psmatch2`, `teffects` | `MatchIt`, `cobalt` |
| Event studies | `eventdd` | `fixest`, `did2s` |
| Spatial econometrics | Limited | `spdep`, `spatialreg` |
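To make the robust and clustered rows of the table concrete, here is a minimal sketch of the `sandwich` + `lmtest` route. It assumes both packages are installed; the simulated data and the `firm` grouping variable are hypothetical and exist only for illustration.

```r
# Sketch: robust and clustered standard errors for an OLS fit.
# Assumes the sandwich and lmtest packages are installed.
library(sandwich)
library(lmtest)

set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
# Hypothetical data: the error variance grows with |x1| (heteroskedasticity)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 1 + abs(x1))
df <- data.frame(y, x1, x2, firm = rep(1:50, each = 10))

fit <- lm(y ~ x1 + x2, data = df)

# HC1 applies the n / (n - k) correction used by Stata's vce(robust)
robust_vcov <- vcovHC(fit, type = "HC1")
robust_tab  <- coeftest(fit, vcov. = robust_vcov)
print(robust_tab)

# Analogue of vce(cluster firm): cluster-robust covariance by firm
cluster_vcov <- vcovCL(fit, cluster = df$firm)
print(coeftest(fit, vcov. = cluster_vcov))
```

The point estimates are identical across the two tables; only the standard errors, t statistics, and p-values change with the choice of covariance estimator.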
The fixest Game-Changer
The fixest package by Laurent Bergé has transformed R econometrics. A single function, feols(), handles multi-way fixed effects, instrumental variables, and clustered standard errors, and in the benchmarks below it runs faster than Stata's reghdfe.
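As a rough illustration, the sketch below fits a two-way fixed effects model with firm-clustered standard errors, roughly what `reghdfe y x, absorb(firm year) vce(cluster firm)` does in Stata. The simulated panel and its variable names (`firm`, `year`, `x`, `y`) are assumptions made up for this example.

```r
# Sketch: two-way fixed effects with clustered SEs in fixest.
# Assumes the fixest package is installed.
library(fixest)

set.seed(1)
n_firms <- 200
n_years <- 10
panel   <- expand.grid(firm = 1:n_firms, year = 1:n_years)

# Hypothetical firm and year effects; x is correlated with the firm effect
firm_fe <- rnorm(n_firms)[panel$firm]
year_fe <- rnorm(n_years)[panel$year]
panel$x <- rnorm(nrow(panel)) + 0.5 * firm_fe
panel$y <- 1.5 * panel$x + firm_fe + year_fe + rnorm(nrow(panel))

# Fixed effects go after the |; cluster takes a one-sided formula
fe_fit <- feols(y ~ x | firm + year, data = panel, cluster = ~firm)
print(coef(fe_fit))
```

Instrumental variables use a third formula part, along the lines of `feols(y ~ controls | firm + year | endog ~ instrument, data = panel)`, where `controls`, `endog`, and `instrument` are placeholder names.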
Workflow and Reproducibility
Stata workflow: Write .do files, produce .log files. Well-established in economics. Most empirical economists know how to read a .do file.
R workflow: Write .R scripts or R Markdown/.qmd documents that combine code, output, and text.
| Aspect | Stata | R |
|---|---|---|
| Script files | .do files | .R files |
| Reproducibility | .do + .log files | R Markdown, Quarto (code + output + text) |
| Version control | Works, but less common | Git/GitHub deeply integrated |
| Package management | Built-in `ssc install` | `renv` for project-specific libraries |
| Tables to LaTeX | `esttab`, `outreg2` | `modelsummary`, `fixest::etable()`, `stargazer` |
| Figures | Adequate but limited | ggplot2 (publication-quality) |
Regression Tables
Both tools can export regression tables to LaTeX, but R's options are more flexible.
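One possible workflow, sketched with `fixest::etable()`: fit several specifications, view them side by side in the console, then produce LaTeX source. The built-in `mtcars` data is only a stand-in for a real dataset, and the specifications are arbitrary.

```r
# Sketch: a multi-column regression table with fixest::etable().
# Assumes the fixest package is installed.
library(fixest)

# mtcars ships with base R and is used here purely as placeholder data
m1 <- feols(mpg ~ wt, data = mtcars, vcov = "hetero")
m2 <- feols(mpg ~ wt + hp, data = mtcars, vcov = "hetero")
m3 <- feols(mpg ~ wt + hp | cyl, data = mtcars, vcov = "hetero")

# Console view: one column per model
print(etable(m1, m2, m3))

# LaTeX source for the same table, returned as character lines
tex_lines <- etable(m1, m2, m3, tex = TRUE)
```

Passing a `file` argument to `etable()` writes the table to disk instead, which is the rough counterpart of `esttab using table.tex` in Stata.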
Data Visualization
This is R's biggest advantage over Stata. Economics papers increasingly require high-quality figures, and ggplot2 is unmatched.
Stata's graphics have improved, but customizing them remains tedious compared to ggplot2's layered grammar.
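A short sketch of that layered grammar: a coefficient plot built from an ordinary `lm()` fit. The model, the `mtcars` data, and the output file name are placeholders, and the intervals use a simple normal approximation.

```r
# Sketch: a coefficient plot built layer by layer with ggplot2.
# Assumes the ggplot2 package is installed.
library(ggplot2)

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
est <- coef(summary(fit))

coefs <- data.frame(
  term     = rownames(est),
  estimate = est[, "Estimate"],
  se       = est[, "Std. Error"]
)
coefs <- coefs[coefs$term != "(Intercept)", ]

p <- ggplot(coefs, aes(x = estimate, y = term)) +
  geom_vline(xintercept = 0, linetype = "dashed") +          # reference line at zero
  geom_pointrange(aes(xmin = estimate - 1.96 * se,           # approximate 95% interval
                      xmax = estimate + 1.96 * se)) +
  labs(x = "Estimate (approx. 95% CI)", y = NULL) +
  theme_minimal()

# ggsave("coefplot.pdf", p, width = 6, height = 3)  # hypothetical file name
```

Each `+` adds one layer or setting, so changing the theme, labels, or interval style means editing a single line rather than rebuilding the graph.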
Performance Benchmarks
For large datasets, speed matters. Here's how they compare for common operations:
| Operation | Stata (reghdfe) | R (fixest) | Winner |
|---|---|---|---|
| OLS, 1M obs, 2 FE | ~5 sec | ~1 sec | R |
| OLS, 10M obs, 2 FE | ~60 sec | ~8 sec | R |
| IV, 1M obs, 1 FE | ~8 sec | ~2 sec | R |
| Clustered SEs | Fast | Fast | Tie |
In the timings above, fixest is roughly 4-8x faster than reghdfe for high-dimensional fixed effects regressions, though exact ratios depend on hardware, data, and package versions. This matters when you're running hundreds of specifications.
What Top Economics Departments Use
The trend is clear: R adoption is growing rapidly in economics, while Stata remains the default.
Stata strongholds:
- Most applied microeconomics research
- Labor economics, development economics, public finance
- The majority of published empirical economics papers
R adoption areas:
- Macroeconomics and structural estimation
- Causal inference research (many new methods are R-first)
- Data visualization for publications
- Researchers who also do machine learning
- Central banks (ECB, Bank of England, Federal Reserve)
Both: Many economists use both. They do data cleaning and quick regressions in Stata, then produce figures in R.
The Case for Stata
Choose Stata if you:
- Work in an economics department where everyone uses Stata
- Need to collaborate with coauthors who only know Stata
- Want every core econometric method available without installing packages
- Value Stata's official documentation (it's genuinely excellent)
- Do primarily applied microeconomics
The Case for R
Choose R if you:
- Want free software that works everywhere
- Need publication-quality graphics
- Work with large datasets where performance matters
- Want to use modern causal inference methods (many are R-first)
- Plan to do machine learning alongside econometrics
- Want reproducible research workflows with R Markdown/Quarto
- Work at a central bank, think tank, or tech company
Migration Tips: Stata to R
| Stata Command | R Equivalent |
|---|---|
| `use "data.dta"` | `haven::read_dta("data.dta")` |
| `gen x = ...` | `dplyr::mutate(x = ...)` |
| `keep if ...` | `dplyr::filter(...)` |
| `collapse (mean) y, by(group)` | `dplyr::group_by(group) \|> dplyr::summarise(y = mean(y))` |
| `merge 1:1` | `dplyr::full_join()` (Stata keeps unmatched rows from both files by default; `dplyr::left_join()` mirrors `keep(master match)`) |
| `reshape long/wide` | `tidyr::pivot_longer()` / `tidyr::pivot_wider()` |
| `regress y x1 x2` | `lm(y ~ x1 + x2, data = df)` |
| `esttab` | `modelsummary()` or `fixest::etable()` |
| `graph export` | `ggsave()` |
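Several of these translations chain together naturally. The sketch below strings a few of them into one pipeline, with the Stata command each step replaces shown in a comment. The `wages` and `regions` data frames are invented for the example, and the native `|>` pipe assumes R 4.1 or later.

```r
# Sketch: a typical Stata cleaning sequence rewritten with dplyr.
# Assumes the dplyr package is installed and R >= 4.1 for the |> pipe.
library(dplyr)

# Hypothetical toy data
wages <- data.frame(
  id     = 1:6,
  region = c("N", "N", "S", "S", "S", "N"),
  wage   = c(20, 25, 18, 22, 30, 0)
)
regions <- data.frame(
  region      = c("N", "S"),
  region_name = c("North", "South")
)

region_means <- wages |>
  filter(wage > 0) |>                          # keep if wage > 0
  mutate(log_wage = log(wage)) |>              # gen log_wage = log(wage)
  group_by(region) |>                          # collapse (mean) ..., by(region)
  summarise(mean_log_wage = mean(log_wage), n = n()) |>
  left_join(regions, by = "region")            # merge ..., keep(master match)

print(region_means)
```

Unlike Stata, nothing here overwrites the data in memory: `wages` is untouched and the result lives in a new object, so intermediate steps can be inspected or rerun freely.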
FAQ
Q: Can R read Stata .dta files? A: Yes. haven::read_dta() reads .dta files written by both older and recent Stata versions, preserving variable labels and value labels. The labelled package helps work with labeled data in R.
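A quick way to check this on your own machine is a round trip through a temporary file, sketched below. The data frame and the label text are made up for the example.

```r
# Sketch: write a .dta file and read it back, keeping the variable label.
# Assumes the haven package is installed.
library(haven)

df <- data.frame(id = 1:3, wage = c(20.5, 18, 30))
attr(df$wage, "label") <- "Hourly wage"   # becomes the Stata variable label

path <- tempfile(fileext = ".dta")
write_dta(df, path)

back <- read_dta(path)
print(attr(back$wage, "label"))
```

The label travels as a column attribute, which is how haven represents Stata's variable labels on the R side.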
Q: Is fixest really faster than reghdfe? A: Usually, yes. In the timings above, fixest is roughly 4-8x faster for high-dimensional fixed effects. The same estimation engine also covers instrumental variables, Poisson regression, and negative binomial models.
Q: My advisor uses Stata. Should I still learn R? A: Learn both. Use Stata for collaboration with your advisor, and learn R for visualization, reproducibility, and career flexibility. The investment in R will pay off whether you stay in academia or move to industry.
What's Next
- Is R Worth Learning in 2026? -- The broader case for R across all fields
- R vs Python -- The other major language comparison
- R Interview Questions -- Prepare for data science interviews with R