R Data Types: Which Type Is Your Variable? (And Why It Matters)

R has six basic (atomic) data types: numeric, integer, character, logical, complex, and raw. Every atomic value in R has exactly one of these types, and the type determines which operations you can perform on it: arithmetic on numbers, joining text, or filtering with TRUE/FALSE.

Data types sound abstract until they cause bugs. You try to add two numbers and R throws an error because one is secretly a character string. You read a CSV file and your "numeric" column turns out to be character because of one rogue entry. Understanding types prevents these problems.

This tutorial shows you every R data type with examples you can run, how to check and convert types, and the hidden coercion rules R uses when you mix types together.

Introduction

A data type tells R what kind of value a variable holds. Just as you can't add "hello" + 5 in real life, R can't add a character string to a number. Types are R's way of keeping track of what operations make sense.

Here's a quick overview of all six types — then we'll explore each one in detail:

Type	Example	What it stores	How common
numeric (double)	`3.14`, `42`	Decimal numbers	Very common
integer	`42L`	Whole numbers	Common
character	`"hello"`	Text strings	Very common
logical	`TRUE`, `FALSE`	Boolean values	Very common
complex	`3+2i`	Complex numbers	Rare
raw	`charToRaw("A")`	Raw bytes	Very rare

You'll use the first four types daily. Complex and raw are specialized — we'll cover them briefly for completeness.

Numeric: The Default Number Type

When you type a number in R, it's numeric (also called "double" because it uses double-precision floating point internally). This is R's default type for all numbers — even ones without decimal points:

    # All of these are numeric
    x <- 42
    pi_approx <- 3.14159
    negative <- -7.5
    big_number <- 1.5e6  # Scientific notation: 1,500,000

    cat("x:", x, "- type:", class(x), "\n")
    cat("pi:", pi_approx, "- type:", class(pi_approx), "\n")
    cat("negative:", negative, "- type:", class(negative), "\n")
    cat("big:", big_number, "- type:", class(big_number), "\n")

    # Surprise: even 42 (no decimal) is numeric, not integer
    cat("\nis.numeric(42):", is.numeric(42), "\n")
    cat("is.integer(42):", is.integer(42), "\n")

Notice that 42 — a whole number — is still numeric, not integer. This catches many beginners off guard. In R, you must explicitly request an integer with the L suffix (covered next).

Numeric precision

R's numeric type uses 64-bit double-precision floating point, which gives you about 15-16 significant digits. For almost all data analysis, this is more than enough. But floating-point arithmetic can produce tiny rounding errors:

    # Floating-point surprise
    cat("0.1 + 0.2 == 0.3:", 0.1 + 0.2 == 0.3, "\n")
    cat("0.1 + 0.2:", 0.1 + 0.2, "\n")
    cat("Actual value:", sprintf("%.20f", 0.1 + 0.2), "\n")

    # Use isTRUE(all.equal()) for safe numeric comparison
    cat("isTRUE(all.equal(0.1 + 0.2, 0.3)):", isTRUE(all.equal(0.1 + 0.2, 0.3)), "\n")

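0.1 + 0.2 is not exactly 0.3 because binary floating point cannot represent most decimal fractions exactly. This isn't an R bug: it happens in every language that uses IEEE 754 doubles. When comparing floating-point numbers, use isTRUE(all.equal(a, b)) instead of ==. The isTRUE() wrapper matters because all.equal() returns a description of the difference, not FALSE, when the values differ.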
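
As a rough sketch of the comparison advice above, the snippet below contrasts exact equality with a tolerance-based check. The variable names and the 1e-9 tolerance are illustrative choices, not values prescribed by R.

```r
# Compare doubles with a tolerance instead of exact equality
a <- 0.1 + 0.2
b <- 0.3

exact <- a == b                     # FALSE: the stored binary values differ
approx <- isTRUE(all.equal(a, b))   # TRUE: equal within the default tolerance

# When values differ, all.equal() returns a character description, not FALSE,
# so it should be wrapped in isTRUE() before being used in if()
far_apart <- all.equal(1, 1.1)
is_far_equal <- isTRUE(far_apart)

# A hand-rolled check with an explicit tolerance (1e-9 is an arbitrary choice)
within_tol <- abs(a - b) < 1e-9

cat("exact:", exact, "\n")
cat("approx:", approx, "\n")
cat("all.equal(1, 1.1) returned:", far_apart, "\n")
cat("within_tol:", within_tol, "\n")
```

In an if() condition, the isTRUE(all.equal(...)) form is usually the safer choice, because a bare all.equal() result can be a string.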

Integer: Whole Numbers with L

Integers are whole numbers. In R, you create them by adding an L suffix:

    # Creating integers
    count <- 42L
    year <- 2026L
    zero <- 0L

    cat("count:", count, "- type:", class(count), "\n")
    cat("year:", year, "- type:", class(year), "\n")

    # Without L, it's numeric (double), not integer
    x <- 42   # numeric
    y <- 42L  # integer
    cat("\n42 type:", class(x), "\n")
    cat("42L type:", class(y), "\n")
    cat("Are they equal?", x == y, "\n")  # TRUE: same value, different storage

When does integer vs numeric matter?

For most data analysis, it doesn't. R silently converts between them when needed. Integers matter when:

Memory efficiency: integers use 4 bytes, doubles use 8 bytes. For vectors with millions of elements, this adds up.
API requirements: some R functions or packages expect integers (e.g., sequence indices).
Reading external data: read.csv() stores a column that contains only whole numbers as integer by default.

    # Integer vs numeric memory usage
    int_vec <- 1:1000000              # 1 million integers (the : operator already returns integers)
    dbl_vec <- as.numeric(1:1000000)  # 1 million doubles

    # Convert the object_size results to plain numbers before doing arithmetic
    int_bytes <- as.numeric(object.size(int_vec))
    dbl_bytes <- as.numeric(object.size(dbl_vec))

    cat("Integer vector:", int_bytes, "bytes\n")
    cat("Numeric vector:", dbl_bytes, "bytes\n")
    cat("Integers use", round(int_bytes / dbl_bytes * 100), "% of the memory\n")
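
The point about external data can be checked without a file on disk. The sketch below feeds read.csv() a small inline CSV through its text argument; the column names and values are made up for illustration.

```r
# Inline CSV text stands in for a file on disk (hypothetical data)
csv_text <- "id,score,weight
1,90,55.5
2,85,61.0
3,78,70.25"

df <- read.csv(text = csv_text)

# id and score contain only whole numbers, so read.csv() picks integer;
# weight has decimals, so it becomes numeric (double)
col_classes <- sapply(df, class)
print(col_classes)
```

Other readers may choose differently, so it is worth checking the column types after any import rather than assuming them.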

Character: Text Strings

Character values hold text. Enclose them in single or double quotes — both work identically:

    # Creating character values
    name <- "Alice"
    greeting <- 'Hello, World!'
    empty <- ""
    number_as_text <- "42"  # This is text, NOT a number

    cat("name:", name, "- type:", class(name), "\n")
    cat("greeting:", greeting, "\n")
    cat("empty string length:", nchar(empty), "\n")
    cat("number_as_text:", number_as_text, "- type:", class(number_as_text), "\n")

    # You CANNOT do math with character strings
    # This would error: number_as_text + 1
    cat("\nCan we add 1 to '42'? No! It's text, not a number.\n")
    cat("Convert first:", as.numeric(number_as_text) + 1, "\n")

This is the most common source of type errors in R: a column that looks numeric but is actually character. One non-numeric entry (like "N/A" or "$100") in a CSV column turns the entire column into character.
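
Here is a small sketch of that failure mode and one possible repair. The CSV content is invented, and stripping "$" and "," with gsub() is only one cleaning strategy among several.

```r
# Hypothetical CSV where one price carries a currency symbol
csv_text <- "item,price
pen,1.50
book,$12.00
bag,30"

# stringsAsFactors = FALSE keeps text as character on older R versions too
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class_before <- class(df$price)   # the single "$12.00" makes the column character

# Strip the characters that block conversion, then convert
cleaned <- gsub("[$,]", "", df$price)
df$price <- as.numeric(cleaned)
class_after <- class(df$price)

cat("Before:", class_before, "\n")
cat("After:", class_after, "\n")
cat("Total:", sum(df$price), "\n")
```

Cleaning the text first and converting second keeps the conversion free of NA values, which would otherwise hide the bad entry.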

Useful character functions

    # String length
    cat("nchar('hello'):", nchar("hello"), "\n")

    # Combine strings
    cat("paste:", paste("Hello", "World"), "\n")
    cat("paste0:", paste0("Item", 1:3), "\n")

    # Change case
    cat("toupper:", toupper("hello"), "\n")
    cat("tolower:", tolower("HELLO"), "\n")

    # Substring
    cat("substr:", substr("RStudio", 1, 3), "\n")

    # Check if text contains a pattern
    cat("grepl:", grepl("World", "Hello World"), "\n")

    # Split text: strsplit() returns a list, so take its first element before printing
    cat("strsplit:", strsplit("a,b,c", ",")[[1]], "\n")

Logical: TRUE and FALSE

Logical values represent yes/no, true/false, on/off. R uses TRUE and FALSE (all uppercase, no quotes):

    # Creating logical values
    is_active <- TRUE
    is_empty <- FALSE

    cat("is_active:", is_active, "- type:", class(is_active), "\n")
    cat("is_empty:", is_empty, "- type:", class(is_empty), "\n")

    # Logical values come from comparisons
    x <- 10
    cat("\nx > 5:", x > 5, "\n")
    cat("x == 10:", x == 10, "\n")
    cat("x < 0:", x < 0, "\n")

    # Shortcuts: T and F work but are NOT recommended
    cat("\nT:", T, "\n")  # Works but dangerous: T can be overwritten
    # T <- 42  # This would break T! TRUE cannot be overwritten.

Warning: R allows T and F as shortcuts for TRUE and FALSE. Don't use them. Someone (or you) might accidentally create a variable called T, breaking all code that relies on it. Always spell out TRUE and FALSE.
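
To make the warning concrete, the sketch below overwrites T inside a throwaway function (named demo here purely for illustration) so the damage stays local, then shows that the same assignment to TRUE is rejected.

```r
# T and F are ordinary variables that happen to hold TRUE and FALSE
flag_before <- T

# Overwrite T inside a function so the change stays local to this demo
demo <- function() {
  T <- 0      # legal: T is just a name
  isTRUE(T)   # FALSE, because T now holds 0
}
flag_inside <- demo()

# TRUE is a reserved word: assigning to it is an error
assign_result <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) "error"
)

cat("T before:", flag_before, "\n")
cat("isTRUE(T) after T <- 0:", flag_inside, "\n")
cat("TRUE <- 0 gives:", assign_result, "\n")
```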

Logical values as numbers

R treats TRUE as 1 and FALSE as 0. This is incredibly useful:

    # TRUE = 1, FALSE = 0
    scores <- c(88, 72, 95, 61, 83, 77, 90)
    passing <- 75

    passed <- scores >= passing
    cat("Passed:", passed, "\n")
    cat("Number who passed:", sum(passed), "\n")       # sum() counts TRUEs
    cat("Proportion who passed:", mean(passed), "\n")  # mean() gives proportion
    cat("Percentage:", mean(passed) * 100, "%\n")

This is one of R's most elegant features. sum(logical_vector) counts how many TRUE values there are. mean(logical_vector) gives you the proportion. You'll use this pattern constantly in data analysis.
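
The same counting pattern works for any condition that yields a logical vector. As an illustration with made-up data, the sketch below counts missing values and values above a threshold.

```r
# Hypothetical ages with two missing entries
ages <- c(34, NA, 51, 29, NA, 62, 45)

n_missing <- sum(is.na(ages))               # how many values are missing
share_missing <- mean(is.na(ages))          # proportion missing
n_over_40 <- sum(ages > 40, na.rm = TRUE)   # comparisons with NA give NA, so drop them

cat("Missing:", n_missing, "\n")
cat("Share missing:", round(share_missing, 3), "\n")
cat("Over 40:", n_over_40, "\n")
```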

Complex: Imaginary Numbers

Complex numbers have a real and imaginary part. R uses the i suffix for the imaginary component:

    # Creating complex numbers
    z1 <- 3 + 2i
    z2 <- 1 - 4i

    cat("z1:", z1, "- type:", class(z1), "\n")
    cat("z2:", z2, "\n")
    cat("Sum:", z1 + z2, "\n")
    cat("Product:", z1 * z2, "\n")
    cat("Real part of z1:", Re(z1), "\n")
    cat("Imaginary part of z1:", Im(z1), "\n")
    cat("Modulus (magnitude):", Mod(z1), "\n")

Unless you work in engineering, physics, or signal processing, you'll rarely use complex numbers in R. They exist for completeness.

Raw: Bytes

The raw type stores raw bytes. It's used for low-level data handling — binary file I/O, encryption, or network protocols:

    # Creating raw values
    r <- charToRaw("Hello")
    cat("Raw bytes:", r, "\n")
    cat("Back to text:", rawToChar(r), "\n")
    cat("Type:", class(r), "\n")

You will almost never use raw in normal data analysis. It's mentioned here only because it's one of R's six atomic types.

Checking Types: class(), typeof(), is.*()

R gives you three ways to check a value's type. Here's when to use each:

    x <- 42L
    y <- 3.14
    z <- "hello"
    w <- TRUE

    # class(): the most useful for daily work
    cat("class(42L):", class(x), "\n")
    cat("class(3.14):", class(y), "\n")
    cat("class('hello'):", class(z), "\n")
    cat("class(TRUE):", class(w), "\n")
    cat("\n")

    # typeof(): the internal storage type (more technical)
    cat("typeof(42L):", typeof(x), "\n")
    cat("typeof(3.14):", typeof(y), "\n")
    cat("typeof('hello'):", typeof(z), "\n")
    cat("typeof(TRUE):", typeof(w), "\n")
    cat("\n")

    # is.*(): ask a yes/no question about the type
    cat("is.numeric(42L):", is.numeric(x), "\n")  # TRUE: integers are numeric
    cat("is.integer(42L):", is.integer(x), "\n")  # TRUE
    cat("is.character('hello'):", is.character(z), "\n")
    cat("is.logical(TRUE):", is.logical(w), "\n")

Function	Returns	Use when
`class(x)`	Type name as string	You want to know the type
`typeof(x)`	Internal storage type	You need the underlying storage (e.g. double vs integer)
`is.numeric(x)`	TRUE/FALSE	You want to check before doing math
`is.character(x)`	TRUE/FALSE	You want to check before text operations

Tip: is.numeric() returns TRUE for both numeric AND integer values. Use is.integer() or is.double() if you need to distinguish between them.
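
The tip can be verified with a short side-by-side check. The is_whole() helper below is a hypothetical convenience function, not part of base R; it tests the value rather than the storage type.

```r
x_int <- 42L
x_dbl <- 42

# Storage-type checks for an integer and a double holding the same value
checks <- c(
  int_is_numeric = is.numeric(x_int),
  int_is_integer = is.integer(x_int),
  int_is_double  = is.double(x_int),
  dbl_is_numeric = is.numeric(x_dbl),
  dbl_is_integer = is.integer(x_dbl),
  dbl_is_double  = is.double(x_dbl)
)
print(checks)

# is.integer() tests storage, not value. A hypothetical helper that asks
# "does this hold a whole number?" regardless of how it is stored:
is_whole <- function(x) x == round(x)

cat("is_whole(42):", is_whole(42), "\n")
cat("is_whole(42.5):", is_whole(42.5), "\n")
```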

Converting Types: as.*()

Sometimes you need to convert a value from one type to another. R provides as.*() functions for this:

    # Character to numeric
    price_text <- "29.99"
    price_num <- as.numeric(price_text)
    cat("Text to number:", price_num, "- type:", class(price_num), "\n")
    cat("Now we can do math:", price_num * 1.08, "(with 8% tax)\n\n")

    # Numeric to character
    age <- 30
    age_text <- as.character(age)
    cat("Number to text:", age_text, "- type:", class(age_text), "\n\n")

    # Numeric to integer (and back)
    x <- as.integer(3.7)  # Truncates, doesn't round!
    cat("as.integer(3.7):", x, "(truncated, not rounded)\n")
    cat("as.double(42L):", as.double(42L), "\n\n")

    # Logical to numeric
    cat("as.numeric(TRUE):", as.numeric(TRUE), "\n")
    cat("as.numeric(FALSE):", as.numeric(FALSE), "\n\n")

    # What happens with invalid conversion?
    bad <- as.numeric("hello")
    cat("as.numeric('hello'):", bad, "(NA = missing/impossible)\n")

Important notes:

as.integer() truncates (drops the decimal), it does NOT round. as.integer(3.9) gives 3, not 4.
Invalid conversions produce NA (R's "missing value") with a warning. This is how R tells you it couldn't convert.
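
Both notes can be seen in a few lines. The sketch below compares truncation with round() and floor() on some arbitrary sample values, and captures the coercion warning with tryCatch() so it can be inspected instead of just printed.

```r
values <- c(3.9, -3.9, 2.6, 0.4)

truncated <- as.integer(values)         # drops the decimals, moving toward zero
rounded   <- as.integer(round(values))  # round first if you want the nearest integer
floored   <- floor(values)              # always moves down, which differs for negatives

cat("truncated:", truncated, "\n")
cat("rounded:  ", rounded, "\n")
cat("floored:  ", floored, "\n")

# An invalid conversion raises a warning; tryCatch() lets you react to it
warning_text <- tryCatch(
  as.numeric("hello"),
  warning = function(w) conditionMessage(w)
)
cat("Captured warning:", warning_text, "\n")
```

Catching the warning is one way to stop early when a conversion fails, rather than letting NA values flow silently into later calculations.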

Type Coercion: R's Automatic Conversions

When you mix types in a vector, R automatically converts everything to the most flexible type. This is called coercion, and it follows a strict hierarchy:

logical → integer → numeric → complex → character

The type on the right "wins" — it's more flexible and can represent the types to its left.

    # Mixing logical and numeric → numeric
    mixed1 <- c(TRUE, FALSE, 42)
    cat("c(TRUE, FALSE, 42):", mixed1, "\n")
    cat("Type:", class(mixed1), "\n\n")

    # Mixing numeric and character → character
    mixed2 <- c(1, 2, "three")
    cat("c(1, 2, 'three'):", mixed2, "\n")
    cat("Type:", class(mixed2), "\n\n")

    # Mixing logical and character → character
    mixed3 <- c(TRUE, "hello")
    cat("c(TRUE, 'hello'):", mixed3, "\n")
    cat("Type:", class(mixed3), "\n\n")

    # The coercion hierarchy in action
    mixed4 <- c(TRUE, 42L, 3.14, "text")
    cat("c(TRUE, 42L, 3.14, 'text'):", mixed4, "\n")
    cat("Type:", class(mixed4), "- everything became text!\n")

This is why one character value in a numeric vector turns everything into characters. It's the #1 type gotcha in R. When reading CSV files, a single "N/A" text entry in a column of numbers forces the entire column to character type.

How to debug coercion problems

    # A common real-world problem
    data_column <- c(100, 200, "N/A", 400, 500)
    cat("Type:", class(data_column), "\n")
    cat("Values:", data_column, "\n")
    cat("Sum attempt:", sum(as.numeric(data_column)), "\n")  # NA because of "N/A"

    # The fix: suppress the warning and handle NAs
    numeric_column <- suppressWarnings(as.numeric(data_column))
    cat("\nConverted:", numeric_column, "\n")
    cat("Sum (ignoring NAs):", sum(numeric_column, na.rm = TRUE), "\n")
    cat("Mean (ignoring NAs):", mean(numeric_column, na.rm = TRUE), "\n")

The na.rm = TRUE argument tells R to ignore NA values when computing. You'll use this a lot.
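
When the bad entries come from a CSV file, it is often possible to handle them at read time instead. The sketch below assumes the placeholder text is known in advance ("N/A" here) and passes it to read.csv() through the na.strings argument; the data is invented.

```r
# Hypothetical CSV where a missing amount was typed as "N/A"
csv_text <- "id,amount
1,100
2,200
3,N/A
4,400"

# Default read: "N/A" is ordinary text, so the whole column becomes character
raw_df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Declaring the placeholder as missing lets the column stay numeric
clean_df <- read.csv(text = csv_text, na.strings = c("N/A", "NA", ""))

cat("Default read:", class(raw_df$amount), "\n")
cat("With na.strings:", class(clean_df$amount), "\n")
cat("Sum (ignoring NAs):", sum(clean_df$amount, na.rm = TRUE), "\n")
```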

Special Values: NA, NULL, NaN, Inf

R has four special values that aren't regular data types but show up constantly:

    # NA: "Not Available" (missing data)
    cat("NA:", NA, "- type:", class(NA), "\n")
    cat("Is NA:", is.na(NA), "\n")
    cat("5 + NA:", 5 + NA, "\n")  # Anything + NA = NA

    # NULL: "nothing" (empty, doesn't exist)
    cat("\nNULL:", NULL, "\n")
    cat("Length of NULL:", length(NULL), "\n")
    cat("Is NULL:", is.null(NULL), "\n")

    # NaN: "Not a Number" (undefined math result)
    cat("\n0/0:", 0/0, "\n")
    cat("Is NaN:", is.nan(0/0), "\n")

    # Inf: Infinity
    cat("\n1/0:", 1/0, "\n")
    cat("-1/0:", -1/0, "\n")
    cat("Is Inf:", is.infinite(1/0), "\n")

Value	Meaning	Created by	Check with
`NA`	Missing data	Missing CSV cells, failed conversion	`is.na()`
`NULL`	Empty/nothing	Empty function returns, deleted elements	`is.null()`
`NaN`	Not a number	`0/0`, `sqrt(-1)`	`is.nan()`
`Inf`	Infinity	`1/0`, `exp(1000)`	`is.infinite()`

Key difference: NA means "there's a value but we don't know it." NULL means "there's no value at all." This distinction matters when writing functions and handling missing data.
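
One way to see the difference is to put each value inside a vector: NA occupies a slot, while NULL vanishes. The sketch below also shows the typed NA constants and how NA and NaN relate.

```r
# NA keeps its position in a vector; NULL is dropped entirely
with_na <- c(1, NA, 3)
with_null <- c(1, NULL, 3)

cat("length(c(1, NA, 3)):", length(with_na), "\n")
cat("length(c(1, NULL, 3)):", length(with_null), "\n")

# NA is logical by default, but each atomic type has its own typed NA
na_classes <- c(
  class(NA), class(NA_integer_), class(NA_real_), class(NA_character_)
)
cat("NA classes:", na_classes, "\n")

# is.na() is TRUE for NaN as well, but is.nan() is FALSE for NA
cat("is.na(NaN):", is.na(NaN), "\n")
cat("is.nan(NA):", is.nan(NA), "\n")
```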

Practice Exercises

Exercise 1: Type Detective

Predict the type of each value before running the code:

    # Exercise: Predict the type of each value, then run to check

    # What type is each of these?
    a <- 100
    b <- 100L
    c <- "100"
    d <- TRUE
    e <- 3 + 0i
    f <- c(1, 2, "3")

    # Write your predictions as comments, then uncomment the cat() lines:
    # cat("a:", class(a), "\n")  # Your prediction: ?
    # cat("b:", class(b), "\n")  # Your prediction: ?
    # cat("c:", class(c), "\n")  # Your prediction: ?
    # cat("d:", class(d), "\n")  # Your prediction: ?
    # cat("e:", class(e), "\n")  # Your prediction: ?
    # cat("f:", class(f), "\n")  # Your prediction: ?

    # Solution
    a <- 100           # numeric (not integer: no L suffix)
    b <- 100L          # integer (L suffix makes it integer)
    c <- "100"         # character (quotes make it text)
    d <- TRUE          # logical
    e <- 3 + 0i        # complex (any use of i makes it complex)
    f <- c(1, 2, "3")  # character (coercion: one string → all strings)

    cat("a:", class(a), "\n")  # numeric
    cat("b:", class(b), "\n")  # integer
    cat("c:", class(c), "\n")  # character
    cat("d:", class(d), "\n")  # logical
    cat("e:", class(e), "\n")  # complex
    cat("f:", class(f), "\n")  # character: the tricky one!

Explanation: f is the tricky one — even though 1 and 2 are numeric, the "3" is character, so R coerces everything to character: c("1", "2", "3").

Exercise 2: Fix the Type Bug

This code has a type error. Find and fix it:

    # Exercise: This code should calculate the total price but has a bug
    prices <- c("19.99", "5.50", "12.00", "8.75")
    quantities <- c(2, 1, 3, 4)

    # This will fail: fix it!
    # total <- sum(prices * quantities)
    # cat("Total:", total, "\n")

    # Hint: Check what type 'prices' is. Then convert it.
    # Write your fix below:

    # Solution
    prices <- c("19.99", "5.50", "12.00", "8.75")
    quantities <- c(2, 1, 3, 4)

    # The bug: prices is character, not numeric
    cat("prices type:", class(prices), "\n")

    # Fix: convert prices to numeric first
    prices_num <- as.numeric(prices)
    total <- sum(prices_num * quantities)
    cat("Total: $", total, "\n")

    # Itemized (seq_along() is safe even if the vector is empty)
    for (i in seq_along(prices_num)) {
      cat(sprintf("  $%.2f x %d = $%.2f\n",
                  prices_num[i], quantities[i], prices_num[i] * quantities[i]))
    }

Explanation: prices is a character vector because of the quotes. You can't multiply text by numbers. as.numeric(prices) converts the text to numbers, and then the math works.

Exercise 3: Coercion Challenge

Predict what R will produce for each coercion scenario:

    # Exercise: Predict the result and type for each
    # Then uncomment and run to check

    # 1. What happens when you add TRUE + TRUE + FALSE?
    # cat(TRUE + TRUE + FALSE, "\n")

    # 2. What type is c(1L, 2.5)?
    # cat(class(c(1L, 2.5)), "\n")

    # 3. What does as.integer(TRUE) return?
    # cat(as.integer(TRUE), "\n")

    # 4. What does as.logical(0) return?
    # cat(as.logical(0), "\n")

    # 5. What does as.logical("yes") return?
    # cat(as.logical("yes"), "\n")

    # Write your predictions, then run to verify:

    # Solution

    # 1. TRUE + TRUE + FALSE = 2 (TRUE=1, FALSE=0, so 1+1+0=2)
    cat("TRUE + TRUE + FALSE:", TRUE + TRUE + FALSE, "\n")

    # 2. c(1L, 2.5) → numeric (integer coerced to double)
    cat("c(1L, 2.5) type:", class(c(1L, 2.5)), "\n")

    # 3. as.integer(TRUE) = 1
    cat("as.integer(TRUE):", as.integer(TRUE), "\n")

    # 4. as.logical(0) = FALSE (0 is FALSE, any nonzero is TRUE)
    cat("as.logical(0):", as.logical(0), "\n")

    # 5. as.logical("yes") = NA (only strings such as "TRUE"/"true"/"T"
    #    and "FALSE"/"false"/"F" are recognized)
    cat("as.logical('yes'):", as.logical("yes"), "\n")
    cat("as.logical('TRUE'):", as.logical("TRUE"), "\n")

Explanation: #5 surprises many people. as.logical() accepts only a fixed set of strings: "TRUE", "true", "True", and "T" become TRUE, and "FALSE", "false", "False", and "F" become FALSE. Any other string, including "yes", "no", "1", and "0", produces NA.
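
If your data really does contain "yes"/"no" text, you need your own mapping. The sketch below first shows which strings as.logical() accepts, then defines yes_no_to_logical(), a hypothetical helper whose accepted spellings are an arbitrary choice for illustration.

```r
# Which strings does as.logical() understand?
inputs <- c("TRUE", "true", "T", "False", "yes", "1", "tRUE")
converted <- as.logical(inputs)
print(setNames(converted, inputs))

# Hypothetical helper: map common yes/no spellings to logical values.
# Anything not listed stays NA.
yes_no_to_logical <- function(x) {
  x <- tolower(trimws(x))
  out <- rep(NA, length(x))
  out[x %in% c("yes", "y", "1", "true")] <- TRUE
  out[x %in% c("no", "n", "0", "false")] <- FALSE
  out
}

answers <- yes_no_to_logical(c("Yes", "no", " Y ", "maybe"))
cat("Mapped answers:", answers, "\n")
```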

Summary

Type	Example	Check	Convert	Notes
numeric	`3.14`	`is.numeric()`	`as.numeric()`	Default for all numbers
integer	`42L`	`is.integer()`	`as.integer()`	Needs L suffix; truncates when converting
character	`"text"`	`is.character()`	`as.character()`	Quotes required; "wins" in coercion
logical	`TRUE`	`is.logical()`	`as.logical()`	TRUE=1, FALSE=0 in math
complex	`3+2i`	`is.complex()`	`as.complex()`	Rare; for imaginary numbers
raw	`charToRaw("A")`	`is.raw()`	`as.raw()`	Very rare; for byte data

Coercion hierarchy: logical → integer → numeric → complex → character

The #1 type gotcha: One character value in a numeric vector converts everything to character.

FAQ

What's the difference between class() and typeof()?

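class() returns the high-level type (numeric, integer, character, etc.), the one you use in daily work. typeof() returns the internal C-level storage type (double, integer, character, logical). For plain vectors the two differ only for doubles, where class() says "numeric" and typeof() says "double". For objects such as factors and data frames, class() reports the object's class while typeof() reports how it is stored. For most purposes, use class().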
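
The two functions disagree most visibly on objects built on top of the atomic types. The sketch below tabulates both for a few sample objects chosen only for illustration; class(obj)[1] is used because some objects report more than one class.

```r
# A few sample objects (arbitrary examples)
objects <- list(
  a_double     = 3.14,
  a_date       = as.Date("2026-01-01"),
  a_matrix     = matrix(1:4, nrow = 2),
  a_data_frame = data.frame(x = 1:2)
)

# Side-by-side view of what class() and typeof() report
comparison <- data.frame(
  class  = sapply(objects, function(obj) class(obj)[1]),
  typeof = sapply(objects, typeof),
  stringsAsFactors = FALSE
)
print(comparison)
```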

Why does R say my numbers are "double"?

"Double" means "double-precision floating point" — it's the internal storage format for numeric values. class() reports "numeric" which is more user-friendly, but typeof() reports "double". They're the same thing.

How do I check the type of a data frame column?

Use class(df$column_name) or sapply(df, class) to check all columns at once. The str() function also shows types: str(df).
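
As a quick illustration, the sketch below builds a small data frame with invented columns and inspects it with each of the three approaches.

```r
# A small example data frame (hypothetical columns)
df <- data.frame(
  name   = c("Alice", "Bob"),
  age    = c(30L, 25L),
  height = c(1.65, 1.80),
  member = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# One column
cat("class(df$age):", class(df$age), "\n")

# All columns at once
col_types <- sapply(df, class)
print(col_types)

# Compact overview of types plus the first few values
str(df)
```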

Can I change a column's type in a data frame?

Yes. Use df$column <- as.numeric(df$column) to convert a single column. For multiple columns, use dplyr::mutate(df, dplyr::across(col1:col3, as.numeric)); the dplyr:: prefix on across() is needed unless dplyr is attached with library(dplyr).
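
If you would rather not depend on dplyr, base R can convert several columns in one step. The sketch below is one possible approach using lapply(); the data frame and column names are made up.

```r
# Survey answers that were read in as text (hypothetical data)
df <- data.frame(
  q1    = c("1", "2"),
  q2    = c("3.5", "4"),
  label = c("a", "b"),
  stringsAsFactors = FALSE
)

# Convert only the listed columns; the others are left untouched
cols <- c("q1", "q2")
df[cols] <- lapply(df[cols], as.numeric)

col_types <- sapply(df, class)
print(col_types)
```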

What's a factor? Is it a data type?

A factor is a special data structure (not an atomic type) for categorical data like "Male"/"Female" or "Low"/"Medium"/"High". Factors store integers internally but display as text labels. We cover factors in a later tutorial.
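
A brief sketch of what "integers internally, labels on display" means in practice, using invented example values. It also shows a related pitfall: converting a factor of number-like labels straight to numeric returns the internal codes, not the labels.

```r
# An ordered set of categories (example labels)
sizes <- factor(c("Low", "High", "Medium", "Low"),
                levels = c("Low", "Medium", "High"))

cat("class:", class(sizes), "\n")
cat("typeof:", typeof(sizes), "\n")
cat("labels:", as.character(sizes), "\n")
cat("codes:", as.integer(sizes), "\n")

# Pitfall: as.numeric() on a factor gives the codes, not the labels
f <- factor(c("10", "20", "10"))
codes <- as.numeric(f)
values <- as.numeric(as.character(f))  # go through character to get the labels

cat("as.numeric(f):", codes, "\n")
cat("as.numeric(as.character(f)):", values, "\n")
```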

What's Next?

Now that you understand R's data types, you're ready to learn about data structures — how R organizes multiple values:

R Vectors — the most fundamental data structure, where all elements must be the same type
R Data Frames — tabular data with rows and columns (each column can be a different type)
R Lists — the most flexible structure, holding any combination of types

Each tutorial includes interactive code blocks for hands-on practice.

r-statistics.co by Selva Prabhakaran
