readr vs read.csv vs fread: Which Data Import Function Is Fastest?
R has three popular CSV readers: base R's read.csv(), readr's read_csv(), and data.table's fread(). For files under 1 MB, the choice barely matters. For files of 100 MB or more, fread() can be 5–40x faster than read.csv().
The benchmarks are clear: fread() wins on raw speed, read_csv() wins on tidyverse integration, and read.csv() wins on zero dependencies. This guide gives you the numbers and tells you which to pick.
Feature Comparison
| Feature | read.csv() | read_csv() | fread() |
|---|---|---|---|
| Package | base R | readr | data.table |
| Speed (1 GB file) | ~60 sec | ~10 sec | ~3 sec |
| Output type | data.frame | tibble | data.table |
| Strings → factor | Yes before R 4.0.0, No since | No | No |
| Auto-detect delimiter | No | No | Yes |
| Auto-detect encoding | No (fileEncoding=) | No (locale(encoding=)) | No (encoding=) |
| Column type guessing | Basic | Smart | Smart |
| Progress bar | No | Yes | Yes |
| Select columns on read | No | col_select= | select= |
| Skip/limit rows | skip=, nrows= | skip=, n_max= | skip=, nrows= |
| Custom NA strings | na.strings= | na= | na.strings= |
| Parallel reading | No | Yes (readr 2.0+, num_threads=) | Yes (nThread=) |
| Memory efficiency | Low | Medium | High |
| Dependencies | None | readr | data.table |
Speed Benchmark
On large files (100 MB – 4 GB), benchmarks consistently show fread() at 5–40x faster than read.csv() and 2–8x faster than read_csv(). The gap grows with file size because fread() uses parallel parsing and memory-mapped I/O.
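To check these ratios on your own machine, a minimal timing sketch might look like the following. The synthetic data, the row count, and the single-run timing are assumptions made for illustration; the readr call assumes readr 2.0 or later (for show_col_types=), and the readr and data.table timings are skipped if those packages are not installed.

```r
# Build a synthetic CSV; n is an arbitrary choice, raise it for larger files
set.seed(42)
n <- 200000
df <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  group = sample(c("a", "b", "c"), n, replace = TRUE)
)
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# Time a single read with each function (elapsed wall-clock seconds)
timings <- c(read.csv = system.time(read.csv(path))[["elapsed"]])

if (requireNamespace("readr", quietly = TRUE)) {
  timings["read_csv"] <- system.time(
    readr::read_csv(path, show_col_types = FALSE, progress = FALSE)
  )[["elapsed"]]
}

if (requireNamespace("data.table", quietly = TRUE)) {
  timings["fread"] <- system.time(
    data.table::fread(path, showProgress = FALSE)
  )[["elapsed"]]
}

print(timings)
unlink(path)  # remove the temporary file
```

A single run on a small file is noisy. For a fairer comparison, repeat each read several times and use a file close to the size you actually work with.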
Syntax Side by Side
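The sketch below reads the same file with all three functions, using the arguments from the feature table above. The file contents and column names are made up for illustration, and the readr call assumes readr 2.0 or later.

```r
# Small example file so each call below actually runs
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ann,90", "2,Bob,NA", "3,Cy,75"), path)

# Base R: returns a data.frame; limit rows with nrows=
df_base <- read.csv(path, na.strings = "NA", nrows = 2)

# readr: returns a tibble; pick columns with col_select=, limit rows with n_max=
if (requireNamespace("readr", quietly = TRUE)) {
  df_readr <- readr::read_csv(
    path,
    col_select = c(id, score),
    na = "NA",
    n_max = 2,
    show_col_types = FALSE
  )
}

# data.table: returns a data.table; pick columns with select=, limit rows with nrows=
if (requireNamespace("data.table", quietly = TRUE)) {
  df_dt <- data.table::fread(
    path,
    select = c("id", "score"),
    na.strings = "NA",
    nrows = 2
  )
}
```

The argument names differ slightly (na.strings= vs na=, nrows= vs n_max=), which is the main thing to watch when switching between readers.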
When to Use Each
| Scenario | Best choice | Why |
|---|---|---|
| Zero dependencies needed | read.csv() | Always available, no install |
| Tidyverse workflow | read_csv() | Returns tibble, clean messages, pipe-friendly |
| Large files (100 MB+) | fread() | Fastest, parallel parsing, memory efficient |
| Unknown delimiter | fread() | Auto-detects delimiter (comma, tab, pipe, etc.) |
| CRAN package development | read.csv() | No external dependency |
| Teaching beginners | read_csv() | Helpful messages, consistent API |
| Quick data exploration | fread() | Auto-detects everything, minimal arguments |
Writing: write_csv vs write.csv vs fwrite
Speed differences apply to writing too.
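As a rough sketch, the three writers can be called as follows. The data frame and the temporary file paths are placeholders chosen for illustration, and the readr and data.table calls are skipped if those packages are not installed.

```r
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

# Base R: writes row names unless row.names = FALSE
out_base <- tempfile(fileext = ".csv")
write.csv(df, out_base, row.names = FALSE)

# readr: never writes row names
if (requireNamespace("readr", quietly = TRUE)) {
  out_readr <- tempfile(fileext = ".csv")
  readr::write_csv(df, out_readr)
}

# data.table: multi-threaded writer
if (requireNamespace("data.table", quietly = TRUE)) {
  out_dt <- tempfile(fileext = ".csv")
  data.table::fwrite(df, out_dt)
}

# Read the base R output back to confirm the round trip
check <- read.csv(out_base)
```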
Practice Exercises
Exercise 1: Compare Outputs
Read the same CSV with read.csv and read_csv. Compare the class and column types.
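One possible solution, sketched with a small made-up file (the column names and values are assumptions, not part of the exercise). The readr part assumes readr 2.0 or later and is skipped if readr is not installed.

```r
# Create a small CSV with an integer, a string, and a date-like column
path <- tempfile(fileext = ".csv")
writeLines(c("id,city,joined", "1,Oslo,2021-03-01", "2,Lima,2022-07-15"), path)

# Base R
base_df <- read.csv(path)
class(base_df)          # a plain data.frame
sapply(base_df, class)  # read.csv() does not parse dates, so 'joined' is not a Date

# readr
if (requireNamespace("readr", quietly = TRUE)) {
  tidy_df <- readr::read_csv(path, show_col_types = FALSE)
  class(tidy_df)          # a tibble, which also inherits from data.frame
  sapply(tidy_df, class)  # read_csv() guesses a Date type for ISO 8601 dates
}
```

The main differences to look for are the object class (data.frame vs tibble) and how each reader treats the date-like column.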
Summary
| Function | Speed (vs. read.csv()) | Best for |
|---|---|---|
| read.csv() | Slowest | Zero-dependency scripts |
| read_csv() | 3–5x faster | Tidyverse workflows |
| fread() | 5–40x faster | Large files, speed-critical code |
Use **read_csv()** as your everyday default. Switch to **fread()** for files over 100 MB. Use **read.csv()** only when you can't install any packages.
FAQ
Can I use fread() output with dplyr?
Yes. fread() returns a data.table which inherits from data.frame, so dplyr verbs work directly. For full tidyverse compatibility, convert with as_tibble(). Or use the dtplyr package for dplyr syntax with data.table speed.
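A minimal sketch of that workflow, assuming data.table and dplyr are installed (the file contents and the score threshold are invented for illustration):

```r
# Small example file
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,90", "2,55", "3,75"), path)

high <- NULL  # stays NULL if the packages are not available
if (requireNamespace("data.table", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  dt <- data.table::fread(path)            # data.table, inherits from data.frame
  high <- dplyr::filter(dt, score > 60)    # dplyr verbs accept it directly
  high_tbl <- tibble::as_tibble(high)      # optional: convert to a tibble
}
```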
Why is fread() so much faster?
Three reasons: (1) parallel C-level parsing across multiple threads, (2) memory-mapped file reading that avoids copying data, and (3) intelligent sampling to determine column types without scanning the entire file.
Does read_csv handle compressed files?
Yes. read_csv("data.csv.gz") automatically decompresses gzip, bzip2, and xz files. fread() reads .gz and .bz2 files directly as well, provided the R.utils package is installed. With base read.csv(), pass a connection such as gzfile("data.csv.gz").
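The following sketch writes a small gzip-compressed CSV with base R and reads it back. The file name and contents are placeholders; the readr and data.table calls are skipped when the required packages (including R.utils for fread()) are not installed.

```r
# Write a small gzip-compressed CSV using a base R connection
path_gz <- tempfile(fileext = ".csv.gz")
con <- gzfile(path_gz, "w")
writeLines(c("id,score", "1,90", "2,75"), con)
close(con)

# Base R: read through a gzfile() connection
df_gz <- read.csv(gzfile(path_gz))

# readr: decompresses based on the file extension
if (requireNamespace("readr", quietly = TRUE)) {
  tbl_gz <- readr::read_csv(path_gz, show_col_types = FALSE)
}

# data.table: reading .gz directly relies on the R.utils package
if (requireNamespace("data.table", quietly = TRUE) &&
    requireNamespace("R.utils", quietly = TRUE)) {
  dt_gz <- data.table::fread(path_gz)
}
```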
What's Next?
- Importing Data in R — the parent tutorial covering all formats
- Apache Arrow in R — for even faster I/O with Parquet files
- Pipe Operator — chain data import with transformation