R's Four Special Values: NA, NULL, NaN, Inf — What Each One Actually Means
R has four special values that aren't regular data: NA (missing), NULL (nothing), NaN (undefined math), and Inf (infinity). They behave differently, and confusing them is one of the most common sources of R bugs.
Every R programmer hits this wall: you run mean(x) and get NA. Or a function returns NULL when you expected a number. Or a division produces NaN and silently corrupts your analysis. Understanding these four values — and how they differ — saves hours of debugging.
Introduction
Here's the quick version:
| Value | Meaning | Example | Analogy |
|-------|---------|---------|---------|
| NA | Missing data | Empty cell in a spreadsheet | "I don't know the answer" |
| NULL | Nothing exists | Variable never created | "There's no question" |
| NaN | Not a Number | 0/0 | "The question makes no sense" |
| Inf | Infinity | 1/0 | "The answer is infinitely large" |
Now let's explore each one in depth.
NA: Missing Data
NA is by far the most common special value. It means "this value exists but is unknown" — like an empty cell in a spreadsheet. A survey respondent who skipped a question has an NA for that field.
# NA in action
x <- c(10, 20, NA, 40, 50)
cat("Vector:", x, "\n")
cat("Length:", length(x), "\n") # NA counts as an element!
# The NA trap: arithmetic and summary functions return NA when any input is NA
cat("Sum:", sum(x), "\n") # NA!
cat("Mean:", mean(x), "\n") # NA!
# The fix: na.rm = TRUE
cat("Sum (na.rm):", sum(x, na.rm = TRUE), "\n") # 120
cat("Mean (na.rm):", mean(x, na.rm = TRUE), "\n") # 30
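Not every function spells its missing-value option na.rm. As a rough sketch (the vectors x, y, and answers are made up for illustration and are not part of the examples above), cor() takes a use argument and table() takes useNA:

```r
# Illustrative data (an assumption, not from the article's examples)
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, 8, 10)

# cor() has no na.rm argument; it uses `use` instead
r_default  <- cor(x, y)                        # NA, because x contains an NA
r_complete <- cor(x, y, use = "complete.obs")  # drops the incomplete pair first
cat("cor (default):", r_default, "\n")
cat("cor (complete.obs):", r_complete, "\n")

# table() drops NA from its counts unless asked to include it
answers <- c("yes", "no", NA, "yes")
print(table(answers))                   # NA is not shown
print(table(answers, useNA = "ifany"))  # NA gets its own count
```

When a function returns NA unexpectedly, its help page usually names the relevant argument.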
Why NA is contagious
NA means "unknown." If you add 5 to an unknown value, the result is still unknown — so 5 + NA is NA. This makes logical sense but catches beginners off guard:
# NA is contagious — it "infects" any calculation
cat("5 + NA:", 5 + NA, "\n")
cat("NA > 3:", NA > 3, "\n")
cat("NA == NA:", NA == NA, "\n") # Even this is NA!
# Why is NA == NA not TRUE?
# Think: "Is the unknown value equal to the unknown value?"
# We don't know — both are unknown! So the answer is: unknown (NA)
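The contagion has a few exceptions: when the result would be the same no matter what the unknown value is, R returns that result instead of NA. A short sketch of the best-known cases:

```r
# Logical operators return a definite answer when the unknown cannot change it
cat("NA & FALSE:", NA & FALSE, "\n")  # FALSE: false whatever the unknown is
cat("NA | TRUE:", NA | TRUE, "\n")    # TRUE: true whatever the unknown is
cat("NA & TRUE:", NA & TRUE, "\n")    # NA: the answer depends on the unknown

# Anything raised to the power 0 is 1, so the unknown does not matter
cat("NA^0:", NA^0, "\n")              # 1
```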
Detecting NA
x <- c(10, NA, 30, NA, 50)
# is.na() — the correct way to check for NA
cat("is.na:", is.na(x), "\n")
# WRONG: x == NA doesn't work (always returns NA)
cat("x == NA:", x == NA, "\n") # All NA!
# Count and locate NAs
cat("NA count:", sum(is.na(x)), "\n")
cat("NA positions:", which(is.na(x)), "\n")
cat("Non-NA count:", sum(!is.na(x)), "\n")
Critical rule: Never use x == NA to check for NA. It always returns NA. Always use is.na(x).
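Two related habits are worth a quick sketch: anyNA() answers "is anything missing?" in one call, and an if condition needs an is.na() guard because if cannot evaluate an NA condition. The variable name value is only for illustration.

```r
x <- c(10, NA, 30, NA, 50)

# anyNA() is a shortcut for any(is.na(x))
cat("anyNA:", anyNA(x), "\n")

# `if (value > 20)` would stop with an error when value is NA,
# so test for missingness first
value <- x[2]
if (is.na(value)) {
  cat("value is missing\n")
} else if (value > 20) {
  cat("value is large\n")
} else {
  cat("value is small\n")
}
```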
Handling NA in data frames
# Real-world scenario: survey data with missing values
survey <- data.frame(
  name = c("Alice", "Bob", "Carol", "David", "Eve"),
  age = c(25, NA, 42, 31, NA),
  score = c(88, 72, NA, 95, 81)
)
cat("Missing values per column:\n")
print(colSums(is.na(survey)))
# Complete cases — rows with no NAs
complete <- survey[complete.cases(survey), ]
cat("\nComplete cases:\n")
print(complete)
# Replace NAs with a value (imputation)
survey$age[is.na(survey$age)] <- median(survey$age, na.rm = TRUE)
cat("\nAfter imputing median age:\n")
print(survey)
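Dropping incomplete rows and skipping NAs per column are different trade-offs. The sketch below rebuilds the same survey data so it runs on its own; cleaned and col_means are illustrative names.

```r
survey <- data.frame(
  name  = c("Alice", "Bob", "Carol", "David", "Eve"),
  age   = c(25, NA, 42, 31, NA),
  score = c(88, 72, NA, 95, 81)
)

# na.omit() drops every row that contains at least one NA
cleaned <- na.omit(survey)
cat("Rows kept by na.omit():", nrow(cleaned), "of", nrow(survey), "\n")

# Per-column means skip NAs within each column separately,
# so each column uses all of its own non-missing values
col_means <- sapply(survey[c("age", "score")], mean, na.rm = TRUE)
print(col_means)
```

Row-wise deletion keeps only two of the five respondents here, while the per-column score mean still uses four of them.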
Typed NAs
NA has different types for different vector types:
# Default NA is logical
cat("class(NA):", class(NA), "\n")
# Typed NAs for specific vector types
cat("class(NA_real_):", class(NA_real_), "\n") # numeric NA
cat("class(NA_integer_):", class(NA_integer_), "\n") # integer NA
cat("class(NA_character_):", class(NA_character_), "\n") # character NA
cat("class(NA_complex_):", class(NA_complex_), "\n") # complex NA
# Why it matters: c() coerces a plain NA to the type of the other elements
nums <- c(1, 2, NA) # NA coerced to numeric — fine
cat("nums type:", class(nums), "\n")
You rarely need typed NAs in daily work, but they matter when pre-allocating vectors or working with databases.
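As a minimal sketch of the pre-allocation case, typed NAs give a placeholder vector the right storage type from the start. The vector names are illustrative.

```r
n <- 3

scores <- rep(NA_real_, n)       # double vector of missing values
labels <- rep(NA_character_, n)  # character vector of missing values
flags  <- rep(NA, n)             # plain NA gives a logical vector

cat("Types:", typeof(scores), typeof(labels), typeof(flags), "\n")

# Filling in values later does not change the type of the vector
scores[1] <- 88.5
labels[1] <- "first"
cat("scores:", scores, "\n")
cat("labels:", labels, "\n")
```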
NULL: Nothing Exists
NULL represents the absence of a value entirely — not a missing value, but no value at all. It's like an empty slot that doesn't exist, versus NA which is a slot that exists but is blank.
# NULL vs NA
x <- NULL
y <- NA
cat("Length of NULL:", length(x), "\n") # 0 — nothing there
cat("Length of NA:", length(y), "\n") # 1 — something there (just unknown)
# NULL disappears in vectors
v1 <- c(1, NULL, 3) # NULL vanishes
v2 <- c(1, NA, 3) # NA stays as an element
cat("With NULL:", v1, "— length:", length(v1), "\n")
cat("With NA:", v2, "— length:", length(v2), "\n")
When does NULL appear?
# 1. Functions that return nothing
result <- cat("hello\n") # cat() returns NULL invisibly
cat("cat() returned:", is.null(result), "\n")
# 2. Accessing non-existent list elements
my_list <- list(a = 1, b = 2)
cat("Existing element:", my_list$a, "\n")
cat("Non-existent:", is.null(my_list$c), "\n") # TRUE — $c doesn't exist
# 3. Removing list elements
my_list$a <- NULL
cat("After removal, names:", names(my_list), "\n")
# 4. Default argument meaning "not provided"
my_func <- function(x, label = NULL) {
  if (is.null(label)) {
    label <- deparse(substitute(x)) # Auto-generate from the expression passed as x
  }
  cat("Label:", label, "\n")
}
my_func(42) # Uses auto-generated label
my_func(42, label = "Answer") # Uses provided label
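Because assigning NULL removes a list element, storing an actual NULL in a list takes a different form. A small sketch, using a made-up config list:

```r
# Hypothetical settings list, for illustration only
config <- list(host = "localhost", port = 8080, user = "admin")

config$user <- NULL           # removes the element entirely
config["port"] <- list(NULL)  # keeps the element, sets its value to NULL

cat("Names:", names(config), "\n")
cat("Length:", length(config), "\n")
cat("port is NULL:", is.null(config$port), "\n")

# is.null() cannot tell "stored NULL" from "no such element";
# checking names() can
cat("has port:", "port" %in% names(config), "\n")
cat("has user:", "user" %in% names(config), "\n")
```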
Detecting NULL
x <- NULL
# is.null() — the correct way
cat("is.null(NULL):", is.null(x), "\n")
# Note: is.na(NULL) doesn't work as expected
cat("is.na(NULL):", is.na(x), "\n") # logical(0), not TRUE!
cat("length(is.na(NULL)):", length(is.na(NULL)), "\n")
# NULL in if statements
if (is.null(x)) {
  cat("x is NULL — no data available\n")
}
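A common pattern is to fall back to a default when something is NULL. The helper value_or() below is a hypothetical name written for this sketch, not a base R function, and settings is made-up data.

```r
# Return `default` when x is NULL, otherwise x (illustrative helper)
value_or <- function(x, default) {
  if (is.null(x)) default else x
}

settings <- list(width = 80)
width  <- value_or(settings$width, 100)  # element exists: 80
height <- value_or(settings$height, 24)  # element missing: falls back to 24
cat("width:", width, "height:", height, "\n")

# A comparison on NULL has length zero, which `if` rejects with an error,
# so test is.null() first; && skips the comparison when the test fails
x <- NULL
if (!is.null(x) && x > 0) {
  cat("x is positive\n")
} else {
  cat("x is NULL or not positive\n")
}
```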
NaN: Not a Number
NaN (Not a Number) results from undefined mathematical operations. It's R's way of saying "this calculation doesn't have a numerical answer."
# Operations that produce NaN
cat("0/0:", 0/0, "\n") # Division of zero by zero
cat("Inf - Inf:", Inf - Inf, "\n") # Infinity minus infinity
cat("0 * Inf:", 0 * Inf, "\n") # Zero times infinity
# NaN is technically also NA!
cat("\nis.nan(NaN):", is.nan(NaN), "\n")
cat("is.na(NaN):", is.na(NaN), "\n") # TRUE — NaN is a type of NA
# But NA is NOT NaN
cat("is.nan(NA):", is.nan(NA), "\n") # FALSE — NA is not NaN
The relationship: is.na() returns TRUE for both NA and NaN, while is.nan() returns TRUE only for NaN. In that sense NaN is a specific kind of "missing": missing because the math was undefined.
# Detecting NaN specifically
x <- c(1, NaN, NA, 4, 0/0)
cat("Values:", x, "\n")
cat("is.na:", is.na(x), "\n") # TRUE for both NA and NaN
cat("is.nan:", is.nan(x), "\n") # TRUE only for NaN
# Find NaN positions
cat("NaN positions:", which(is.nan(x)), "\n")
# In practice, you rarely need to distinguish NaN from NA
# Just use is.na() and na.rm = TRUE
cat("Mean (ignoring NaN and NA):", mean(x, na.rm = TRUE), "\n")
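NaN also appears without any obviously undefined arithmetic: once na.rm = TRUE has removed every value, there is nothing left to average. A short sketch with an all-missing vector (all_missing is an illustrative name):

```r
all_missing <- c(NA_real_, NA_real_)

# After removing the NAs the vector is empty, and the mean of nothing is 0/0
m <- mean(all_missing, na.rm = TRUE)
cat("mean of all-NA with na.rm:", m, "\n")
cat("is.nan:", is.nan(m), "\n")

# Other summaries of an empty vector behave differently
cat("sum:", sum(all_missing, na.rm = TRUE), "\n")        # 0
empty_max <- suppressWarnings(max(all_missing, na.rm = TRUE))
cat("max:", empty_max, "\n")                             # -Inf, with a warning

# Functions outside their domain return NaN with a warning
neg_root <- suppressWarnings(sqrt(-1))
cat("sqrt(-1):", neg_root, "\n")
```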
Inf: Infinity
Inf (and -Inf) represent positive and negative infinity. They're actual numeric values that R can compute with:
# Operations that produce Inf
cat("1/0:", 1/0, "\n") # Positive infinity
cat("-1/0:", -1/0, "\n") # Negative infinity
cat("exp(1000):", exp(1000), "\n") # Overflow to Inf
# Inf is a real numeric value — you can do math with it
cat("\nInf + 100:", Inf + 100, "\n") # Still Inf
cat("Inf * -1:", Inf * -1, "\n") # -Inf
cat("1/Inf:", 1/Inf, "\n") # 0 (approaches zero)
cat("Inf > 1000000:", Inf > 1000000, "\n") # TRUE: Inf is larger than any finite number
# Inf in comparisons
cat("max(c(1, Inf, 3)):", max(c(1, Inf, 3)), "\n") # Inf
cat("min(c(1, -Inf, 3)):", min(c(1, -Inf, 3)), "\n") # -Inf
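Overflow only becomes Inf for doubles. As a sketch of the contrast, integer arithmetic that overflows returns NA with a warning instead:

```r
# Doubles overflow to Inf
big <- .Machine$double.xmax
cat("largest double:", big, "\n")
cat("big * 2:", big * 2, "\n")

# Integers have no Inf; overflow gives NA (R also emits a warning)
int_max  <- .Machine$integer.max
int_over <- suppressWarnings(int_max + 1L)
cat("integer.max:", int_max, "\n")
cat("integer.max + 1L:", int_over, "\n")

# Adding a double instead promotes the result to double, so nothing overflows
cat("integer.max + 1:", int_max + 1, "\n")
```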
Detecting Inf
x <- c(1, Inf, -Inf, 4, NA, NaN)
cat("Values:", x, "\n")
cat("is.infinite:", is.infinite(x), "\n")
cat("is.finite:", is.finite(x), "\n") # TRUE only for regular numbers
cat("is.na:", is.na(x), "\n") # Inf is NOT NA!
# Inf is not NA — this surprises people
cat("\nis.na(Inf):", is.na(Inf), "\n") # FALSE
cat("is.na(NaN):", is.na(NaN), "\n") # TRUE
When Inf shows up in practice
# Common source: log of zero
cat("log(0):", log(0), "\n") # -Inf
# Division where denominator approaches zero
small <- 1e-308
cat("1/small:", 1/small, "\n") # Very large but finite
smaller <- 1e-309
cat("1/smaller:", 1/smaller, "\n") # Inf (overflow)
# Practical example: percentage change when baseline is zero
baseline <- c(100, 0, 50, 0, 80)
current <- c(120, 30, 50, 0, 100)
pct_change <- (current - baseline) / baseline * 100
cat("Pct change:", pct_change, "\n")
cat("Has Inf:", any(is.infinite(pct_change)), "\n")
# Fix: replace Inf with NA
pct_change[is.infinite(pct_change)] <- NA
cat("Fixed:", pct_change, "\n")
cat("Mean change:", mean(pct_change, na.rm = TRUE), "%\n")
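An alternative sketch guards the division itself rather than cleaning up afterwards. Unlike the fix above, this version also turns the 0/0 case into NA instead of leaving it as NaN.

```r
baseline <- c(100, 0, 50, 0, 80)
current  <- c(120, 30, 50, 0, 100)

# Mark the result as missing wherever the baseline is zero
pct_change <- ifelse(baseline == 0,
                     NA_real_,
                     (current - baseline) / baseline * 100)

cat("Pct change:", pct_change, "\n")
cat("Mean change:", mean(pct_change, na.rm = TRUE), "%\n")
```

Either approach gives the same mean here, because na.rm = TRUE skips NaN as well as NA.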
The Complete Comparison
Here's how all four special values compare across common operations:
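One way to build such a comparison is to run the same checks on each value. This is a sketch rather than the article's original table; specials and describe() are names made up for it.

```r
# A list is needed here because c() would silently drop NULL
specials <- list("NA" = NA, "NULL" = NULL, "NaN" = NaN, "Inf" = Inf)

describe <- function(v) {
  has_value <- !is.null(v)  # the element-wise tests are skipped for NULL
  c(
    class       = class(v),
    length      = length(v),
    is.null     = is.null(v),
    is.na       = has_value && is.na(v),
    is.nan      = has_value && is.nan(v),
    is.infinite = has_value && is.infinite(v)
  )
}

# One row per special value, one column per check
comparison <- t(sapply(specials, describe))
print(comparison, quote = FALSE)
```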
Explanation: Invalid values (negative age, zero weight/height) are first converted to NA so they don't corrupt calculations. BMI with missing inputs naturally produces NA. The na.rm = TRUE pattern handles all the missing data cleanly.
Exercise 2: Safe Division Function
# Exercise: Write a function safe_divide(a, b) that:
# - Returns a/b for normal values
# - Returns NA (not Inf) when b is 0
# - Returns NA when either a or b is NA
# - Works on vectors (not just single values)
# Test: safe_divide(c(10, 20, 30, NA), c(2, 0, 5, 3))
# Write your code below:
# Solution
safe_divide <- function(a, b) {
  result <- a / b
  result[is.infinite(result)] <- NA
  return(result)
}
# Test
a <- c(10, 20, 30, NA, 0)
b <- c(2, 0, 5, 3, 0)
cat("a:", a, "\n")
cat("b:", b, "\n")
cat("a/b:", a/b, "\n")
cat("safe_divide:", safe_divide(a, b), "\n")
Explanation: Rather than checking for zero denominators before dividing (which requires careful NA handling), we let R divide normally and then replace any Inf or -Inf results with NA. An NA in either input already propagates to NA on its own. One caveat: 0/0 yields NaN rather than Inf, so this solution leaves it as NaN. Because is.na() returns TRUE for NaN, most functions still treat it as missing.
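If a strict NA is wanted for 0/0 as well, one possible variant replaces every non-finite result. The name safe_divide_strict is made up for this sketch and is not part of the solution above.

```r
safe_divide_strict <- function(a, b) {
  result <- a / b
  # !is.finite() is TRUE for Inf, -Inf, NaN, and NA alike
  result[!is.finite(result)] <- NA
  result
}

a <- c(10, 20, 30, NA, 0)
b <- c(2, 0, 5, 3, 0)
cat("safe_divide_strict:", safe_divide_strict(a, b), "\n")
```

The trade-off is that a legitimately infinite input (for example Inf / 2) is also reported as NA.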
Exercise 3: Complete Data Report
# Exercise: Write a function data_quality() that takes a data frame and
# prints a quality report for each column:
# - Column name, type, total values
# - Count/percentage of: NA, NaN, Inf, -Inf, zero values
# Test with a messy data frame you create
# Write your code below:
Explanation: The function iterates over columns, checks each one's type, and reports different quality metrics for numeric vs non-numeric columns. This is the kind of function you'd keep in a personal utility script and use at the start of every data analysis.
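A minimal sketch of one way to write such a function follows. It is not the original solution: the column names of the report, the choice to return a data frame instead of printing line by line, and the messy test data are all assumptions made here.

```r
data_quality <- function(df) {
  rows <- lapply(names(df), function(col) {
    x <- df[[col]]
    num <- is.numeric(x)  # NaN, Inf, and zero counts only apply to numbers
    data.frame(
      column    = col,
      type      = class(x)[1],
      total     = length(x),
      n_na      = if (num) sum(is.na(x) & !is.nan(x)) else sum(is.na(x)),
      n_nan     = if (num) sum(is.nan(x)) else NA_integer_,
      n_pos_inf = if (num) sum(x == Inf, na.rm = TRUE) else NA_integer_,
      n_neg_inf = if (num) sum(x == -Inf, na.rm = TRUE) else NA_integer_,
      n_zero    = if (num) sum(x == 0, na.rm = TRUE) else NA_integer_,
      stringsAsFactors = FALSE
    )
  })
  report <- do.call(rbind, rows)
  report$pct_na <- round(100 * report$n_na / report$total, 1)
  report
}

# Made-up messy data for testing
messy <- data.frame(
  id    = 1:6,
  value = c(1.5, NA, NaN, Inf, -Inf, 0),
  label = c("a", NA, "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)
print(data_quality(messy))
```

Counting NA separately from NaN (via is.na(x) & !is.nan(x)) keeps the two columns from double-counting the same cell.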
Summary
| Value | What it means | Detect with | Handle with |
|-------|---------------|-------------|-------------|
| NA | Missing/unknown | is.na() | na.rm = TRUE, na.omit(), complete.cases() |
| NULL | Nothing exists | is.null() | Check before use, provide defaults |
| NaN | Undefined math | is.nan() | Treated as NA by most functions |
| Inf/-Inf | Infinity | is.infinite() | Replace with NA or cap at a maximum |
The golden rules:
Never use x == NA — always use is.na(x)
Use na.rm = TRUE in statistical functions
Check with is.null() before using list elements that may not exist
Replace Inf values before computing means or medians
FAQ
Why does mean() return NA when there's one missing value?
By design. If one value is unknown, the true mean is also unknown. R forces you to explicitly choose how to handle missing data with na.rm = TRUE rather than silently ignoring NAs, which could mask data quality problems.
What's the difference between NA and NaN?
NA means "value exists but is unknown" — like a blank survey answer. NaN means "the math you tried is undefined" — like 0/0. In practice, both are handled the same way (na.rm = TRUE), but NaN gives you a clue about why the value is missing.
When should I use NULL vs NA?
Use NA for missing data in vectors and data frames. Use NULL for "this thing doesn't exist" — like a function argument that wasn't provided, a list element that should be removed, or a function that doesn't return anything.
Does na.rm = TRUE remove Inf values?
No! na.rm only removes NA and NaN. Inf is a valid numeric value. To exclude Inf, either filter it out first (x[is.finite(x)]) or replace it with NA (x[is.infinite(x)] <- NA).
How do I replace all special values at once?
Use is.finite() — it returns TRUE only for regular, non-special numbers: x[!is.finite(x)] <- NA replaces NA, NaN, Inf, and -Inf all at once.
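The last two answers can be seen side by side in a short sketch (the vectors y, x, and cleaned are illustrative):

```r
# na.rm = TRUE leaves Inf in place, so the mean is still Inf
y <- c(3, NA, Inf, 9)
cat("mean with na.rm:", mean(y, na.rm = TRUE), "\n")

# Replacing everything non-finite with NA handles all four cases at once
x <- c(3, NA, NaN, Inf, -Inf, 9)
cleaned <- x
cleaned[!is.finite(cleaned)] <- NA
cat("cleaned:", cleaned, "\n")
cat("mean of cleaned:", mean(cleaned, na.rm = TRUE), "\n")
```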
What's Next?
With special values mastered, you're ready for the final fundamentals topic:
Getting Help in R — navigate R's documentation system efficiently
Further Reading: Copy-on-Modify — understand how R handles memory
R Matrices — when you need uniform numeric data structures
Understanding special values is essential for every tutorial that follows — real data always has missing values.