readr read_fwf() in R: Read Fixed-Width Files
The readr read_fwf() function reads a fixed-width file into a tibble, where each column is fixed by character position instead of a delimiter. You describe the layout once with a helper like fwf_widths() and read_fwf() parses every row to match.
read_fwf("data.txt", fwf_empty("data.txt")) # guess column edges
read_fwf("data.txt", fwf_widths(c(3, 5, 8))) # set field widths
read_fwf("data.txt", fwf_widths(c(3, 5), c("id", "name"))) # widths plus names
read_fwf("data.txt", fwf_positions(c(1, 4), c(3, 11))) # start and end columns
read_fwf("data.txt", fwf_cols(id = 3, name = 8)) # named widths
read_fwf("data.txt", fwf_cols(id = c(1, 3), name = c(4, 11))) # named ranges
read_fwf("data.txt", fwf_widths(c(3, 5)), skip = 2) # skip junk header lines
Need explanation? Read on for examples and pitfalls.
What read_fwf() does
read_fwf() reads a fixed-width file into a tibble. A fixed-width file has no separator between fields. Each column always starts and ends at the same character position on every line, so the layout itself defines the structure. You give read_fwf() a file path, a URL, or literal text, plus a column specification that names those positions.
The column specification comes from one of four helper functions: fwf_empty(), fwf_widths(), fwf_positions(), and fwf_cols(). They differ only in how you describe the layout. Once read_fwf() knows where each field sits, it slices every row, guesses each column type, and returns a tibble.
Syntax and key arguments
The call always pairs a file with a column specification. The col_positions argument is what makes read_fwf() different from the delimited readers; it carries the layout.
The skip, na, and col_types arguments behave exactly as they do in read_csv(). The only new idea is col_positions, and the four fwf_* helpers below all produce a valid value for it.
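As a rough sketch of how these arguments combine, the call below reads a hypothetical export with two junk header lines. The sample text, the column names, and the widths are assumptions made up for illustration, and I() for literal data assumes readr 2.0 or later.

```r
library(readr)

# Hypothetical sample: two junk header lines, then rows laid out as
# id (3 chars), name (5 chars), age (2 chars).
rows <- sprintf("%-3s%-5s%2s",
                c("001", "002", "003"),
                c("Ann", "Bo", "Cy"),
                c("30", "NA", "41"))
txt <- paste(c("EXPORT v1", "---------", rows), collapse = "\n")

people <- read_fwf(
  I(txt),                                   # literal data instead of a path
  col_positions = fwf_widths(c(3, 5, 2), c("id", "name", "age")),
  col_types = "ccd",                        # id and name as text, age as double
  skip = 2,                                 # drop the two header lines
  na = c("", "NA")                          # strings treated as missing
)
people
```

Only col_positions is specific to fixed-width input; the other three arguments are written the same way they would be for read_csv().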
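The pandas equivalent of read_fwf("data.txt", fwf_widths(c(3, 5))) is pandas.read_fwf("data.txt", widths=[3, 5]). The pandas colspecs argument maps to readr's fwf_positions(), with the caveat that colspecs uses 0-based, half-open intervals while fwf_positions() is 1-based and inclusive.
read_fwf() examples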
Start with a layout you know. This file has three fields: a 7-character name, a 2-character age, and a 9-character city, with no separators. fwf_widths() takes those widths and the column names.
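A minimal sketch of that call follows. The three rows are hypothetical sample data, padded with sprintf() so each line is exactly 18 characters, and I() assumes readr 2.0 or later.

```r
library(readr)

# Hypothetical rows padded to the stated layout:
# name = 7 chars, age = 2 chars, city = 9 chars (18 per line).
txt <- paste(
  sprintf("%-7s%2d%-9s",
          c("Alice", "Bob", "Carol"),
          c(30L, 25L, 41L),
          c("New York", "Boston", "Chicago")),
  collapse = "\n"
)

people <- read_fwf(
  I(txt),
  fwf_widths(c(7, 2, 9), c("name", "age", "city"))
)
people
```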
Wrapping the string in I() tells read_fwf() the value is data, not a file path. The widths c(7, 2, 9) cover columns 1 to 7, 8 to 9, and 10 to 18.
Describe the same file with start and end positions. fwf_positions() takes a vector of start columns and a vector of end columns. Both are 1-based and inclusive.
This is the natural choice when a data dictionary lists each field by its byte range, which is common in legacy mainframe and government extracts.
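The same layout expressed as ranges might look like the sketch below. The sample rows are hypothetical, and the start and end vectors are simply the 7, 2, and 9 character widths converted to inclusive positions.

```r
library(readr)

# Hypothetical rows: name in columns 1-7, age in 8-9, city in 10-18.
txt <- paste(
  sprintf("%-7s%2d%-9s",
          c("Alice", "Bob", "Carol"),
          c(30L, 25L, 41L),
          c("New York", "Boston", "Chicago")),
  collapse = "\n"
)

spec <- fwf_positions(
  start = c(1, 8, 10),
  end = c(7, 9, 18),
  col_names = c("name", "age", "city")
)

people <- read_fwf(I(txt), spec)
people
```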
Name the columns inline with fwf_cols(). Pass each column as name = width, and read_fwf() builds the positions for you. It is the most readable helper when widths and names belong together.
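A short sketch of both fwf_cols() forms, using hypothetical sample rows: a single number per name is read as a width, and a pair of numbers is read as an inclusive start and end.

```r
library(readr)

# Hypothetical rows: name = 7 chars, age = 2 chars, city = 9 chars.
txt <- paste(
  sprintf("%-7s%2d%-9s",
          c("Alice", "Bob", "Carol"),
          c(30L, 25L, 41L),
          c("New York", "Boston", "Chicago")),
  collapse = "\n"
)

# Named widths: positions are derived by accumulating the widths.
by_width <- read_fwf(I(txt), fwf_cols(name = 7, age = 2, city = 9))

# Named ranges: each column gives its own start and end.
by_range <- read_fwf(
  I(txt),
  fwf_cols(name = c(1, 7), age = c(8, 9), city = c(10, 18))
)

by_width
by_range
```

Both calls describe the same layout, so the two tibbles should hold the same values.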
Let readr guess the edges with fwf_empty(). When neighbouring columns are separated by at least one character position that is blank on every row, fwf_empty() finds the boundaries automatically. You only supply the column names.
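A minimal sketch of that guess, assuming a hypothetical three-row file. It is written to a temporary path so the same file can be handed to both fwf_empty() and read_fwf().

```r
library(readr)

# Hypothetical file in which every field is separated by a space.
path <- tempfile(fileext = ".txt")
writeLines(c("001 Alice 30",
             "002 Bob   25",
             "003 Carol 41"), path)

# fwf_empty() scans the file for blank character positions;
# read_fwf() then slices each row at those boundaries.
people <- read_fwf(path, fwf_empty(path, col_names = c("id", "name", "age")))
people
```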
Check how the id column came back: if "001" is parsed as a number it becomes 1, and the leading zeros are lost. The pitfalls section below shows how to keep them.
Use fwf_positions() only when a data dictionary already lists byte ranges.
Defining columns: the four fwf_* helpers
Every read_fwf() call needs a column specification, and the helper you pick depends on what you know. All four return the same kind of object, so they are interchangeable once built.
| Helper | You provide | Best when |
|---|---|---|
| fwf_empty() | the file, plus column names | columns are separated by whitespace |
| fwf_widths() | a width for each field | you know how wide each column is |
| fwf_positions() | start and end of each field | a data dictionary lists byte ranges |
| fwf_cols() | named widths or named ranges | you want names and positions together |
Use fwf_empty() for a quick first look at a clean file. Switch to fwf_widths() or fwf_cols() once you have the real layout, because an explicit spec never guesses wrong and documents the format for the next reader.
Common pitfalls
Leading zeros disappear. Identifier columns like "001" look numeric, so they can end up parsed as numbers, which drops the zeros. Do not rely on type guessing for identifiers: force the column to text with col_types.
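One way to pin the type, sketched with hypothetical sample rows and a made-up two-column layout (a 3-character id and a 6-character name):

```r
library(readr)

# Hypothetical rows: id = 3 chars, name = 6 chars.
txt <- paste(
  sprintf("%-3s%-6s", c("001", "002", "003"), c("Alice", "Bob", "Carol")),
  collapse = "\n"
)

spec <- fwf_widths(c(3, 6), c("id", "name"))

# Declare id as character so its type never depends on guessing.
people <- read_fwf(I(txt), spec, col_types = cols(id = col_character()))
people$id
```

Columns not named in cols() are still guessed, so only the identifier needs to be listed.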
fwf_empty() merges touching columns. fwf_empty() only finds a boundary where every row has a space. If two fields ever touch, such as a name running straight into an age, the guess merges them. Use fwf_widths() or fwf_positions() for files with no gaps.
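The sketch below illustrates the merge on a hypothetical file where the age runs straight into the city. The NA in fwf_widths() marks the last field as ragged, so it runs to the end of each line.

```r
library(readr)

# Hypothetical file: no space between age and city.
path <- tempfile(fileext = ".txt")
writeLines(c("Alice  30New York",
             "Bob    25Boston",
             "Carol  41Chicago"), path)

# fwf_empty() finds only the gap after the name,
# so age and city arrive together in one column.
guessed <- read_fwf(path, fwf_empty(path))
ncol(guessed)

# Explicit widths split the touching fields.
explicit <- read_fwf(path, fwf_widths(c(7, 2, NA), c("name", "age", "city")))
ncol(explicit)
```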
Off-by-one positions. fwf_positions() uses inclusive 1-based columns. A field spanning the first seven characters is start = 1, end = 7, not end = 8. A single-column slip shifts every field after it, so the data still reads without an error but lands in the wrong column.
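To see the silent shift, the sketch below reads the same hypothetical rows twice: once with the correct ranges and once with the name field ending one column too late.

```r
library(readr)

# Hypothetical rows: name in columns 1-7, age in 8-9, city in 10-18.
txt <- paste(
  sprintf("%-7s%2d%-9s",
          c("Alice", "Bob", "Carol"),
          c(30L, 25L, 41L),
          c("New York", "Boston", "Chicago")),
  collapse = "\n"
)
col_names <- c("name", "age", "city")

# Correct: name occupies columns 1 to 7 inclusive.
correct <- read_fwf(I(txt), fwf_positions(c(1, 8, 10), c(7, 9, 18), col_names))

# Off by one: ending name at 8 pushes age and city one column to the right.
shifted <- read_fwf(I(txt), fwf_positions(c(1, 9, 11), c(8, 10, 18), col_names))

correct$age
shifted$age  # still reads, but the values are no longer clean ages
```

Comparing a few parsed values against the raw lines is a cheap check after writing any position-based spec.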
Try it yourself
Try it: Use fwf_widths() to read the fixed-width string below into ex_data, with a 6-character name column and a 3-character age column. Then save the mean age to ex_mean.
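The exercise string is not reproduced here, so the sketch below substitutes a hypothetical three-row string with the same layout. It shows one possible solution under that assumption.

```r
library(readr)

# Hypothetical stand-in for the exercise string:
# name = 6 chars, age = 3 chars (right-aligned).
ex_txt <- paste(
  sprintf("%-6s%3d", c("Ann", "Robert", "Li"), c(34L, 28L, 46L)),
  collapse = "\n"
)

ex_data <- read_fwf(I(ex_txt), fwf_widths(c(6, 3), c("name", "age")))
ex_mean <- mean(ex_data$age)
ex_mean
```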
Explanation: fwf_widths(c(6, 3), ...) slices columns 1 to 6 as name and 7 to 9 as age. read_fwf() guesses age as a double, so mean() works directly on ex_data$age.
Related readr functions
read_fwf() handles the one format with no separator; reach for a sibling when the file has one.
- read_table(): read whitespace-separated files where columns are ragged.
- read_delim(): read files with any single-character delimiter.
- read_csv(): read standard comma-separated files.
- fwf_cols(): build a named column specification inline.
- read_lines(): read raw lines when no fixed layout fits.
For the full argument reference, see the readr read_fwf documentation on tidyverse.org.
FAQ
What is a fixed-width file?
A fixed-width file is a plain text file where every field occupies the same character positions on every line, with no delimiter between fields. A name might always sit in columns 1 to 20 and an age in columns 21 to 23. The layout itself, not a separator, defines the columns, so you need a column specification to read it correctly.
How do I read a fixed-width file in R?
Call read_fwf() with the file and a column specification. If the columns are separated by whitespace, read_fwf("data.txt", fwf_empty("data.txt")) guesses the edges. When you know the widths, read_fwf("data.txt", fwf_widths(c(10, 3), c("name", "age"))) is exact and self-documenting.
What is the difference between read_fwf() and read_table()?
read_fwf() reads files where columns sit at fixed character positions, even when fields touch with no gap. read_table() reads files where columns are separated by one or more spaces and may be ragged. Use read_fwf() when the layout is positional, and read_table() when whitespace reliably separates every field.
How do I keep leading zeros when reading a fixed-width file?
Pass col_types so the column reads as text. For example, read_fwf(file, spec, col_types = cols(id = col_character())) keeps an identifier like "007" intact. Without it, the column's type is left to guessing, and any parse to a number stores 7 and drops the zeros.
Can read_fwf() guess column positions automatically?
Yes, through fwf_empty(), which scans the file for columns of all-space characters and treats them as boundaries. It works well on clean files with clear gaps. It fails when two fields touch, because there is no space to mark the edge, so an explicit fwf_widths() spec is safer for production code.