r-statistics.co by Selva Prabhakaran


R's Four Special Values: NA, NULL, NaN, Inf — What Each One Actually Means

R has four special values that aren't regular data: NA (missing), NULL (nothing), NaN (undefined math), and Inf (infinity). They behave differently, and confusing them is one of the most common sources of R bugs.

Every R programmer hits this wall: you run mean(x) and get NA. Or a function returns NULL when you expected a number. Or a division produces NaN and silently corrupts your analysis. Understanding these four values — and how they differ — saves hours of debugging.

Introduction

Here's the quick version:

| Value | Meaning | Example | Analogy |
|-------|---------|---------|---------|
| NA | Missing data | Empty cell in a spreadsheet | "I don't know the answer" |
| NULL | Nothing exists | List element that was never set | "There's no question" |
| NaN | Not a Number | 0/0 | "The question makes no sense" |
| Inf | Infinity | 1/0 | "The answer is infinitely large" |

Now let's explore each one in depth.

NA: Missing Data

NA is by far the most common special value. It means "this value exists but is unknown" — like an empty cell in a spreadsheet. A survey respondent who skipped a question has an NA for that field.

    # NA in action
    x <- c(10, 20, NA, 40, 50)
    cat("Vector:", x, "\n")
    cat("Length:", length(x), "\n")  # NA counts as an element!

    # The NA trap: arithmetic and summaries that involve NA produce NA
    cat("Sum:", sum(x), "\n")    # NA!
    cat("Mean:", mean(x), "\n")  # NA!

    # The fix: na.rm = TRUE
    cat("Sum (na.rm):", sum(x, na.rm = TRUE), "\n")    # 120
    cat("Mean (na.rm):", mean(x, na.rm = TRUE), "\n")  # 30

  

Why NA is contagious

NA means "unknown." If you add 5 to an unknown value, the result is still unknown — so 5 + NA is NA. This makes logical sense but catches beginners off guard:

    # NA is contagious: it "infects" the calculations it takes part in
    cat("5 + NA:", 5 + NA, "\n")
    cat("NA > 3:", NA > 3, "\n")
    cat("NA == NA:", NA == NA, "\n")  # Even this is NA!

    # Why is NA == NA not TRUE?
    # Think: "Is the unknown value equal to the unknown value?"
    # We don't know, because both are unknown. So the answer is: unknown (NA)
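The "unknown stays unknown" rule has a few exceptions. When the result would be the same whatever the missing value turned out to be, R returns that result instead of NA. A short sketch of these base R cases:

```r
# Cases where the answer does not depend on the unknown value
and_false <- NA & FALSE  # FALSE: anything AND FALSE is FALSE
or_true   <- NA | TRUE   # TRUE: anything OR TRUE is TRUE
power_0   <- NA^0        # 1: any number to the power 0 is 1

# Cases where the answer does depend on the unknown value
and_true   <- NA & TRUE  # NA
times_zero <- NA * 0     # NA: the unknown value could be Inf, and Inf * 0 is NaN

cat("NA & FALSE:", and_false, "\n")
cat("NA | TRUE:", or_true, "\n")
cat("NA^0:", power_0, "\n")
cat("NA & TRUE:", and_true, "\n")
cat("NA * 0:", times_zero, "\n")
```

These cases rarely matter in arithmetic, but they explain why a logical filter that combines conditions can return FALSE or TRUE where you expected NA.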

  

Detecting NA

    x <- c(10, NA, 30, NA, 50)

    # is.na(): the correct way to check for NA
    cat("is.na:", is.na(x), "\n")

    # WRONG: x == NA doesn't work (always returns NA)
    cat("x == NA:", x == NA, "\n")  # All NA!

    # Count and locate NAs
    cat("NA count:", sum(is.na(x)), "\n")
    cat("NA positions:", which(is.na(x)), "\n")
    cat("Non-NA count:", sum(!is.na(x)), "\n")

  

Critical rule: Never use x == NA to check for NA. It always returns NA. Always use is.na(x).
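Alongside is.na(), base R offers anyNA() for a quick yes/no check. The sketch below shows a few common follow-ups to the rule above; the variable names are only illustrative.

```r
x <- c(10, NA, 30, NA, 50)

has_missing   <- anyNA(x)            # TRUE if at least one NA is present
first_missing <- which(is.na(x))[1]  # position of the first NA
non_missing   <- x[!is.na(x)]        # keep only the known values

cat("Any NA?", has_missing, "\n")
cat("First NA at position:", first_missing, "\n")
cat("Known values:", non_missing, "\n")
```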

Handling NA in data frames

    # Real-world scenario: survey data with missing values
    survey <- data.frame(
      name = c("Alice", "Bob", "Carol", "David", "Eve"),
      age = c(25, NA, 42, 31, NA),
      score = c(88, 72, NA, 95, 81)
    )

    cat("Missing values per column:\n")
    print(colSums(is.na(survey)))

    # Complete cases: rows with no NAs
    complete <- survey[complete.cases(survey), ]
    cat("\nComplete cases:\n")
    print(complete)

    # Replace NAs with a value (imputation)
    survey$age[is.na(survey$age)] <- median(survey$age, na.rm = TRUE)
    cat("\nAfter imputing median age:\n")
    print(survey)

  

Typed NAs

NA has different types for different vector types:

    # Default NA is logical
    cat("class(NA):", class(NA), "\n")

    # Typed NAs for specific vector types
    cat("class(NA_real_):", class(NA_real_), "\n")            # numeric NA
    cat("class(NA_integer_):", class(NA_integer_), "\n")      # integer NA
    cat("class(NA_character_):", class(NA_character_), "\n")  # character NA
    cat("class(NA_complex_):", class(NA_complex_), "\n")      # complex NA

    # Why it matters: creating properly typed vectors
    nums <- c(1, 2, NA)  # NA coerced to numeric, which is fine
    cat("nums type:", class(nums), "\n")

  

You rarely need typed NAs in daily work, but they matter when pre-allocating vectors or working with databases.
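As a rough illustration of the pre-allocation case, the sketch below builds empty result vectors of a chosen type and uses a typed NA inside vapply(). The grading rule and variable names are made up for the example.

```r
# Pre-allocate result vectors with the intended type
scores <- rep(NA_real_, 3)       # double vector of missing values
labels <- rep(NA_character_, 3)  # character vector of missing values
flags  <- rep(NA, 3)             # plain NA gives a logical vector

cat("typeof(scores):", typeof(scores), "\n")
cat("typeof(labels):", typeof(labels), "\n")
cat("typeof(flags):", typeof(flags), "\n")

# vapply() checks every result against FUN.VALUE, so returning
# NA_character_ keeps each result a character value
grades <- vapply(
  c(90, NA, 40),
  function(s) {
    if (is.na(s)) NA_character_ else if (s >= 50) "pass" else "fail"
  },
  FUN.VALUE = character(1)
)
print(grades)
```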

NULL: Nothing Exists

NULL represents the absence of a value entirely — not a missing value, but no value at all. It's like an empty slot that doesn't exist, versus NA which is a slot that exists but is blank.

    # NULL vs NA
    x <- NULL
    y <- NA
    cat("Length of NULL:", length(x), "\n")  # 0: nothing there
    cat("Length of NA:", length(y), "\n")    # 1: something there (just unknown)

    # NULL disappears in vectors
    v1 <- c(1, NULL, 3)  # NULL vanishes
    v2 <- c(1, NA, 3)    # NA stays as an element
    cat("With NULL:", v1, "- length:", length(v1), "\n")
    cat("With NA:", v2, "- length:", length(v2), "\n")

  

When does NULL appear?

    # 1. Functions that return nothing
    result <- cat("hello\n")  # cat() returns NULL invisibly
    cat("cat() returned:", is.null(result), "\n")

    # 2. Accessing non-existent list elements
    my_list <- list(a = 1, b = 2)
    cat("Existing element:", my_list$a, "\n")
    cat("Non-existent:", is.null(my_list$c), "\n")  # TRUE: $c doesn't exist

    # 3. Removing list elements
    my_list$a <- NULL
    cat("After removal, names:", names(my_list), "\n")

    # 4. Default argument meaning "not provided"
    my_func <- function(x, label = NULL) {
      if (is.null(label)) {
        label <- deparse(substitute(x))  # Auto-generate from variable name
      }
      cat("Label:", label, "\n")
    }
    my_func(42)                    # Uses auto-generated label
    my_func(42, label = "Answer")  # Uses provided label
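Because assigning NULL removes a list element, storing an actual NULL inside a list needs a different form. A small sketch of the two behaviours, using a made-up settings list:

```r
settings <- list(a = 1, b = 2, c = 3)

# Assigning NULL with $ or [[ ]] removes the element
settings$b <- NULL
cat("Names after removal:", names(settings), "\n")

# Wrapping NULL in list() and assigning with [ ] keeps the slot
settings["c"] <- list(NULL)
cat("Names after storing NULL:", names(settings), "\n")
cat("Length:", length(settings), "\n")
cat("Is c NULL?", is.null(settings$c), "\n")
```

After both steps the list still has two named slots, a and c, and the second one holds NULL.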

  

Detecting NULL

    x <- NULL

    # is.null(): the correct way
    cat("is.null(NULL):", is.null(x), "\n")

    # Note: is.na(NULL) doesn't work as expected
    cat("is.na(NULL):", is.na(x), "\n")  # logical(0), not TRUE!
    cat("length(is.na(NULL)):", length(is.na(NULL)), "\n")

    # NULL in if statements
    if (is.null(x)) {
      cat("x is NULL: no data available\n")
    }
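Since is.na(NULL) has length zero, using it as an if condition fails rather than returning FALSE. The sketch below shows that failure and one possible way around it; null_default() is a hypothetical helper written for this example, not a base R function.

```r
x <- NULL

# if () needs a condition of length one; is.na(NULL) has length zero
outcome <- tryCatch(
  if (is.na(x)) "missing" else "present",
  error = function(e) "error: condition has length zero"
)
cat("Outcome:", outcome, "\n")

# Hypothetical helper: fall back to a default when a value is NULL
null_default <- function(value, default) {
  if (is.null(value)) default else value
}

config <- list(title = "Sales report")
title    <- null_default(config$title, "Untitled")
subtitle <- null_default(config$subtitle, "(no subtitle)")
cat("Title:", title, "\n")
cat("Subtitle:", subtitle, "\n")
```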

  

NaN: Not a Number

NaN (Not a Number) results from undefined mathematical operations. It's R's way of saying "this calculation doesn't have a numerical answer."

    # Operations that produce NaN
    cat("0/0:", 0/0, "\n")              # Division of zero by zero
    cat("Inf - Inf:", Inf - Inf, "\n")  # Infinity minus infinity
    cat("0 * Inf:", 0 * Inf, "\n")      # Zero times infinity

    # NaN is technically also NA!
    cat("\nis.nan(NaN):", is.nan(NaN), "\n")
    cat("is.na(NaN):", is.na(NaN), "\n")  # TRUE: NaN is a type of NA

    # But NA is NOT NaN
    cat("is.nan(NA):", is.nan(NA), "\n")  # FALSE: NA is not NaN

  

The relationship: all NaN values are NA, but not all NA values are NaN. NaN is a specific kind of "missing" — missing because the math was undefined.

    # Detecting NaN specifically
    x <- c(1, NaN, NA, 4, 0/0)
    cat("Values:", x, "\n")
    cat("is.na:", is.na(x), "\n")    # TRUE for both NA and NaN
    cat("is.nan:", is.nan(x), "\n")  # TRUE only for NaN

    # Find NaN positions
    cat("NaN positions:", which(is.nan(x)), "\n")

    # In practice, you rarely need to distinguish NaN from NA
    # Just use is.na() and na.rm = TRUE
    cat("Mean (ignoring NaN and NA):", mean(x, na.rm = TRUE), "\n")
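NaN also comes from math functions called outside their domain, such as sqrt() or log() of a negative number. In those cases R emits a "NaNs produced" warning as well. A minimal sketch of silencing or catching that warning:

```r
# Both calls return NaN and emit a warning, silenced here
root <- suppressWarnings(sqrt(-1))
logv <- suppressWarnings(log(-1))
cat("sqrt(-1):", root, "\n")
cat("log(-1):", logv, "\n")

# Alternatively, catch the warning and return a plain NA instead
checked <- tryCatch(
  sqrt(-1),
  warning = function(w) NA_real_
)
cat("checked:", checked, "\n")
```

Whether to silence, catch, or let the warning through is a design choice; in an analysis script the warning is often the first sign that an input was out of range.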

  

Inf: Infinity

Inf (and -Inf) represent positive and negative infinity. They're actual numeric values that R can compute with:

    # Operations that produce Inf
    cat("1/0:", 1/0, "\n")              # Positive infinity
    cat("-1/0:", -1/0, "\n")            # Negative infinity
    cat("exp(1000):", exp(1000), "\n")  # Overflow to Inf

    # Inf is a real numeric value, so you can do math with it
    cat("\nInf + 100:", Inf + 100, "\n")        # Still Inf
    cat("Inf * -1:", Inf * -1, "\n")            # -Inf
    cat("1/Inf:", 1/Inf, "\n")                  # 0 (approaches zero)
    cat("Inf > 1000000:", Inf > 1000000, "\n")  # TRUE: Inf is larger than any finite number

    # Inf in comparisons
    cat("max(c(1, Inf, 3)):", max(c(1, Inf, 3)), "\n")    # Inf
    cat("min(c(1, -Inf, 3)):", min(c(1, -Inf, 3)), "\n")  # -Inf

  

Detecting Inf

    x <- c(1, Inf, -Inf, 4, NA, NaN)
    cat("Values:", x, "\n")
    cat("is.infinite:", is.infinite(x), "\n")
    cat("is.finite:", is.finite(x), "\n")  # TRUE only for regular numbers
    cat("is.na:", is.na(x), "\n")          # Inf is NOT NA!

    # Inf is not NA, which surprises people
    cat("\nis.na(Inf):", is.na(Inf), "\n")  # FALSE
    cat("is.na(NaN):", is.na(NaN), "\n")    # TRUE

  

When Inf shows up in practice

    # Common source: log of zero
    cat("log(0):", log(0), "\n")  # -Inf

    # Division where denominator approaches zero
    small <- 1e-308
    cat("1/small:", 1/small, "\n")      # Very large but finite
    smaller <- 1e-309
    cat("1/smaller:", 1/smaller, "\n")  # Inf (overflow)

    # Practical example: percentage change when baseline is zero
    baseline <- c(100, 0, 50, 0, 80)
    current <- c(120, 30, 50, 0, 100)
    pct_change <- (current - baseline) / baseline * 100
    cat("Pct change:", pct_change, "\n")
    cat("Has Inf:", any(is.infinite(pct_change)), "\n")

    # Fix: replace Inf with NA
    pct_change[is.infinite(pct_change)] <- NA
    cat("Fixed:", pct_change, "\n")
    cat("Mean change:", mean(pct_change, na.rm = TRUE), "%\n")
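An alternative to fixing the result afterwards is to guard the zero baseline inside a small function. The sketch below is one way to do that; pct_change_safe() is a hypothetical name, and it treats any zero baseline as "no valid percentage", covering both the Inf and the NaN case.

```r
# Hypothetical helper: percentage change that is NA when the baseline is zero
pct_change_safe <- function(current, baseline) {
  change <- (current - baseline) / baseline * 100
  change[!is.na(baseline) & baseline == 0] <- NA_real_
  change
}

baseline <- c(100, 0, 50, 0, 80)
current  <- c(120, 30, 50, 0, 100)

result <- pct_change_safe(current, baseline)
cat("Pct change:", result, "\n")
cat("Mean change:", mean(result, na.rm = TRUE), "%\n")
```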

  

The Complete Comparison

Here's how all four special values compare across common operations:

    # Comparison table
    # Reserved words such as NA cannot be used as bare argument names, so quote them
    vals <- list(`NA` = NA, `NULL` = NULL, `NaN` = NaN, `Inf` = Inf)

    cat("=== Type checks ===\n")
    cat("is.na: NA=", is.na(NA), " NaN=", is.na(NaN), " Inf=", is.na(Inf), "\n")
    cat("is.null: NA=", is.null(NA), " NULL=", is.null(NULL), "\n")
    cat("is.nan: NA=", is.nan(NA), " NaN=", is.nan(NaN), "\n")
    cat("is.infinite: Inf=", is.infinite(Inf), " NA=", is.infinite(NA), "\n")
    cat("is.finite: 42=", is.finite(42), " NA=", is.finite(NA),
        " NaN=", is.finite(NaN), " Inf=", is.finite(Inf), "\n")

    cat("\n=== In vectors ===\n")
    cat("c(1,NA,3):", c(1, NA, 3), "length:", length(c(1, NA, 3)), "\n")
    cat("c(1,NULL,3):", c(1, NULL, 3), "length:", length(c(1, NULL, 3)), "\n")
    cat("c(1,NaN,3):", c(1, NaN, 3), "length:", length(c(1, NaN, 3)), "\n")
    cat("c(1,Inf,3):", c(1, Inf, 3), "length:", length(c(1, Inf, 3)), "\n")

  
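| Property | NA | NULL | NaN | Inf |
|----------|----|------|-----|-----|
| Has a type? | Yes (logical) | No | Yes (numeric) | Yes (numeric) |
| Has a length? | 1 | 0 | 1 | 1 |
| Survives in c()? | Yes | No (disappears) | Yes | Yes |
| is.na() returns TRUE? | Yes | No | Yes | No |
| Can do math? | No (returns NA) | No (returns an empty vector) | Returns NaN | Yes (usually returns Inf) |
| na.rm removes it? | Yes | N/A | Yes | No |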
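The rows of this table can be checked directly at the console. The sketch below does so for type, length, arithmetic, and na.rm. Note that arithmetic on NULL gives an empty numeric vector, numeric(0), rather than stopping with an error.

```r
cat("typeof:", typeof(NA), typeof(NULL), typeof(NaN), typeof(Inf), "\n")
cat("length:", length(NA), length(NULL), length(NaN), length(Inf), "\n")

# Arithmetic with each value
na_plus   <- NA + 1    # NA
null_plus <- NULL + 1  # numeric(0): an empty result
nan_plus  <- NaN + 1   # NaN
inf_plus  <- Inf + 1   # Inf
print(null_plus)

# na.rm = TRUE drops NA and NaN but keeps Inf
with_inf    <- sum(c(1, NA, NaN, Inf), na.rm = TRUE)
without_inf <- sum(c(1, NA, NaN), na.rm = TRUE)
cat("Sum with Inf:", with_inf, "\n")
cat("Sum without Inf:", without_inf, "\n")
```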

Cleaning Data: A Real Workflow

Let's put it all together with a realistic data cleaning scenario:

    # Messy data with missing values and zero baselines
    sales <- data.frame(
      product = c("Widget", "Gadget", "Doohickey", "Thingamajig", "Whatsit"),
      price = c(25, 0, NA, 15, 50),
      quantity = c(100, 50, 75, 0, NA),
      baseline = c(80, 0, 60, 0, 90)
    )

    cat("Original data:\n")
    print(sales)

    # Calculate revenue and growth, and watch the special values appear
    sales$revenue <- sales$price * sales$quantity
    sales$growth <- (sales$revenue - sales$baseline) / sales$baseline * 100
    cat("\nWith calculations (notice the NA and NaN values):\n")
    print(sales)

    # Clean up: replace any Inf and NaN with NA, then compute
    sales$growth[is.infinite(sales$growth) | is.nan(sales$growth)] <- NA
    cat("\nCleaned data:\n")
    print(sales)

    cat("\nAverage growth (valid products):",
        round(mean(sales$growth, na.rm = TRUE), 1), "%\n")
    cat("Products with valid data:", sum(!is.na(sales$growth)), "of", nrow(sales), "\n")

  

Practice Exercises

Exercise 1: NA Detective

    # Exercise: Given this data, find and fix all the problems:
    patient_data <- data.frame(
      id = 1:8,
      age = c(25, NA, 42, 31, NA, 55, -1, 28),
      weight_kg = c(70, 85, NA, 68, 90, 0, 75, 82),
      height_m = c(1.75, 1.80, 1.65, 0, 1.72, 1.68, 1.85, NA)
    )

    # 1. Report how many NAs per column
    # 2. Flag invalid values: age < 0, weight = 0, height = 0
    # 3. Replace all invalid values with NA
    # 4. Calculate BMI (weight/height^2). What special values appear?
    # 5. Report clean summary statistics

    # Write your code below:

  
Solution:
    # Solution
    patient_data <- data.frame(
      id = 1:8,
      age = c(25, NA, 42, 31, NA, 55, -1, 28),
      weight_kg = c(70, 85, NA, 68, 90, 0, 75, 82),
      height_m = c(1.75, 1.80, 1.65, 0, 1.72, 1.68, 1.85, NA)
    )

    # 1. NAs per column
    cat("Missing values:\n")
    print(colSums(is.na(patient_data)))

    # 2 & 3. Flag and replace invalid values
    patient_data$age[patient_data$age < 0] <- NA
    patient_data$weight_kg[!is.na(patient_data$weight_kg) & patient_data$weight_kg == 0] <- NA
    patient_data$height_m[!is.na(patient_data$height_m) & patient_data$height_m == 0] <- NA

    # 4. Calculate BMI
    patient_data$bmi <- round(patient_data$weight_kg / patient_data$height_m^2, 1)
    cat("\nCleaned data:\n")
    print(patient_data)

    # 5. Summary
    cat("\nSummary (valid patients only):\n")
    cat("Valid BMIs:", sum(!is.na(patient_data$bmi)), "of", nrow(patient_data), "\n")
    cat("Mean BMI:", round(mean(patient_data$bmi, na.rm = TRUE), 1), "\n")
    cat("Mean age:", round(mean(patient_data$age, na.rm = TRUE), 1), "\n")

  

Explanation: Invalid values (negative age, zero weight/height) are first converted to NA so they don't corrupt calculations. BMI with missing inputs naturally produces NA. The na.rm = TRUE pattern handles all the missing data cleanly.

Exercise 2: Safe Division Function

    # Exercise: Write a function safe_divide(a, b) that:
    # - Returns a/b for normal values
    # - Returns NA (not Inf) when b is 0
    # - Returns NA when either a or b is NA
    # - Works on vectors (not just single values)

    # Test: safe_divide(c(10, 20, 30, NA), c(2, 0, 5, 3))

    # Write your code below:

  
Solution:
    # Solution
    safe_divide <- function(a, b) {
      result <- a / b
      # Inf and -Inf come from x/0; NaN comes from 0/0
      result[is.infinite(result) | is.nan(result)] <- NA
      return(result)
    }

    # Test
    a <- c(10, 20, 30, NA, 0)
    b <- c(2, 0, 5, 3, 0)
    cat("a:", a, "\n")
    cat("b:", b, "\n")
    cat("a/b:", a/b, "\n")
    cat("safe_divide:", safe_divide(a, b), "\n")

Explanation: Rather than checking for zero denominators before dividing (which requires careful NA handling), we let R divide normally and then replace any Inf, -Inf, or NaN results with NA. Inf and -Inf come from dividing a nonzero number by zero, and NaN comes from 0/0, so together they cover every case where b is 0. This approach is simple and handles the edge cases automatically.

Exercise 3: Complete Data Report

    # Exercise: Write a function data_quality() that takes a data frame and
    # prints a quality report for each column:
    # - Column name, type, total values
    # - Count/percentage of: NA, NaN, Inf, -Inf, zero values

    # Test with a messy data frame you create

    # Write your code below:

  
Solution:
    # Solution
    data_quality <- function(df) {
      cat(sprintf("Data Quality Report: %d rows x %d columns\n\n", nrow(df), ncol(df)))
      for (col in names(df)) {
        x <- df[[col]]
        n <- length(x)
        na_count <- sum(is.na(x))  # is.na() counts NaN as well as NA
        cat(sprintf("%-12s [%s] n=%d", col, class(x), n))
        if (is.numeric(x)) {
          nan_count <- sum(is.nan(x))
          inf_count <- sum(is.infinite(x))  # counts both Inf and -Inf
          zero_count <- sum(x == 0, na.rm = TRUE)
          cat(sprintf(" NA=%d(%.0f%%) NaN=%d Inf=%d Zero=%d",
                      na_count, na_count/n*100, nan_count, inf_count, zero_count))
        } else {
          cat(sprintf(" NA=%d(%.0f%%)", na_count, na_count/n*100))
        }
        cat("\n")
      }
    }

    # Test with messy data
    messy <- data.frame(
      revenue = c(100, 0, NA, Inf, 50, NaN, -20, 0),
      category = c("A", "B", NA, "A", "B", "C", NA, "A"),
      ratio = c(0.5, 0/0, 1.2, 1/0, NA, 0.8, -1/0, 0)
    )
    data_quality(messy)

  

Explanation: The function iterates over columns, checks each one's type, and reports different quality metrics for numeric vs non-numeric columns. This is the kind of function you'd keep in a personal utility script and use at the start of every data analysis.

Summary

| Value | What it means | Detect with | Handle with |
|-------|---------------|-------------|-------------|
| NA | Missing/unknown | is.na() | na.rm = TRUE, na.omit(), complete.cases() |
| NULL | Nothing exists | is.null() | Check before use, provide defaults |
| NaN | Undefined math | is.nan() | Treated as NA by most functions |
| Inf/-Inf | Infinity | is.infinite() | Replace with NA or cap at a maximum |

The golden rules:

  1. Never use x == NA — always use is.na(x)
  2. Use na.rm = TRUE in statistical functions
  3. Check for NULL with is.null() before using a list element that may not exist
  4. Replace Inf values before computing means or medians

FAQ

Why does mean() return NA when there's one missing value?

By design. If one value is unknown, the true mean is also unknown. R forces you to explicitly choose how to handle missing data with na.rm = TRUE rather than silently ignoring NAs, which could mask data quality problems.

What's the difference between NA and NaN?

NA means "value exists but is unknown" — like a blank survey answer. NaN means "the math you tried is undefined" — like 0/0. In practice, both are handled the same way (na.rm = TRUE), but NaN gives you a clue about why the value is missing.

When should I use NULL vs NA?

Use NA for missing data in vectors and data frames. Use NULL for "this thing doesn't exist" — like a function argument that wasn't provided, a list element that should be removed, or a function that doesn't return anything.

Does na.rm = TRUE remove Inf values?

No! na.rm only removes NA and NaN. Inf is a valid numeric value. To exclude Inf, either filter it out first (x[is.finite(x)]) or replace it with NA (x[is.infinite(x)] <- NA).
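A short sketch of the difference, using a made-up vector:

```r
x <- c(10, 20, Inf, NA, 30)

mean_na_rm  <- mean(x, na.rm = TRUE)  # Inf: the infinite value is still included
mean_finite <- mean(x[is.finite(x)])  # 20: only the regular numbers remain

cat("mean with na.rm = TRUE:", mean_na_rm, "\n")
cat("mean of finite values:", mean_finite, "\n")
```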

How do I replace all special values at once?

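Use is.finite(), which returns TRUE only for regular, non-special numbers: x[!is.finite(x)] <- NA turns NaN, Inf, and -Inf into NA in one step (existing NAs simply stay NA). This works on numeric vectors; is.finite() returns FALSE for every element of a character vector.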
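For a whole data frame, the same idea can be applied column by column while leaving non-numeric columns alone. The following is a minimal sketch; clean_non_finite() is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: set every non-finite entry in numeric columns to NA
clean_non_finite <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) col[!is.finite(col)] <- NA
    col
  })
  df
}

messy <- data.frame(
  id = c("a", "b", "c", "d"),
  ratio = c(0.5, NaN, Inf, -Inf),
  count = c(1, NA, 3, 4)
)

cleaned <- clean_non_finite(messy)
print(cleaned)
```

Assigning to df[] keeps the result a data frame with the original column names, and the is.numeric() check keeps character columns such as id untouched.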

What's Next?

With special values mastered, you're ready for the next topics:

  1. Getting Help in R — navigate R's documentation system efficiently
  2. Further Reading: Copy-on-Modify — understand how R handles memory
  3. R Matrices — when you need uniform numeric data structures

Understanding special values is essential for every tutorial that follows — real data always has missing values.