R Error: argument is not a matrix in apply() — Solutions & Alternatives

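Error in apply(X, MARGIN, FUN) : argument 'X' is not a matrix means you passed a vector, list, or other object without dimensions to apply(). Current versions of base R word this failure as "dim(X) must have a positive length", but the cause is the same: X has no dim attribute. The fix depends on whether you need apply() at all or should use lapply() or sapply() instead.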

The Error

```r
# Reproduce the error:
my_vec <- c(1, 2, 3, 4, 5)

# apply() expects a matrix or data frame, not a vector
# apply(my_vec, 1, sum)
# Error in apply(my_vec, 1, sum) : dim(X) must have a positive length

cat("class of my_vec:", class(my_vec), "\n")
cat("apply() needs a 2D structure (matrix or data frame).\n")
```

Cause 1: Passing a Vector Instead of a Matrix

apply() is designed for 2D structures. For vectors, use sapply() or vapply():

```r
scores <- c(88, 92, 75, 95, 81)

# Wrong: apply on a vector
# apply(scores, 1, function(x) x * 1.1)

# Fix: for a vector, just use vectorized operations
boosted <- scores * 1.1
cat("Boosted scores:", boosted, "\n")

# Or if you need to apply a function to each element:
result <- sapply(scores, function(x) if (x >= 90) "A" else "B")
cat("Grades:", result, "\n")
```

Fix: Use vectorized operations, sapply(), or vapply() for vectors. Save apply() for matrices and data frames.
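
The fix above names vapply() without showing it. Here is a minimal sketch of how it differs from sapply(), reusing the scores vector from the example; the grading rule is the same illustrative one, and the variable names empty_ok and empty_odd are made up for this sketch.

```r
scores <- c(88, 92, 75, 95, 81)

# vapply() works like sapply(), but its third argument (FUN.VALUE)
# declares the type and length every call must return.
grades <- vapply(scores, function(x) if (x >= 90) "A" else "B", character(1))
print(grades)

# On empty input, vapply() still returns the declared type,
# while sapply() falls back to an empty list.
empty_ok  <- vapply(numeric(0), function(x) x * 1.1, numeric(1))
empty_odd <- sapply(numeric(0), function(x) x * 1.1)
cat("vapply on empty input:", class(empty_ok), "\n")
cat("sapply on empty input:", class(empty_odd), "\n")
```

The predictable return type is the main reason to reach for vapply() inside functions and scripts that run unattended.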

Cause 2: Data Frame with Mixed Types

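apply() accepts a data frame but first converts it to a matrix with as.matrix(), which forces every column to a single type. If your data frame mixes numeric and character columns, every value becomes a character string, so numeric functions fail or return results you did not expect: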

```r
df <- data.frame(
  name  = c("Alice", "Bob"),
  score = c(88, 92),
  grade = c("A", "A")
)

# apply() coerces to character because 'name' is character
result <- apply(df, 1, paste, collapse = " - ")
cat("Row-wise paste:", result, "\n")

# For numeric operations on mixed data frames, select numeric columns first.
# drop = FALSE keeps a data frame even when only one column is numeric.
nums <- df[, sapply(df, is.numeric), drop = FALSE]
cat("\nNumeric columns only:\n")
print(nums)

row_means <- apply(as.matrix(nums), 1, mean)
cat("Row means:", row_means, "\n")
```

Fix: Select only numeric columns with df[, sapply(df, is.numeric), drop = FALSE] before using apply() for calculations. The drop = FALSE matters when only one column is numeric.
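
To see the coercion directly, the sketch below inspects what apply() works with and captures the error a numeric function raises on it. The math and art columns are invented for illustration, and rowSums() stands in for apply() on the numeric subset.

```r
df <- data.frame(
  name = c("Alice", "Bob"),
  math = c(88, 92),
  art  = c(70, 95)
)

# What apply() sees: a character matrix, because 'name' forces coercion
m <- as.matrix(df)
cat("Type of as.matrix(df):", typeof(m), "\n")

# A numeric function on the character rows fails; capture the message
msg <- tryCatch(
  apply(df, 1, function(r) sum(r)),
  error = function(e) conditionMessage(e)
)
cat("apply(df, 1, sum) reports:", msg, "\n")

# Selecting the numeric columns first keeps the values numeric
num_cols <- sapply(df, is.numeric)
totals <- rowSums(df[, num_cols, drop = FALSE])
print(totals)
```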

Cause 3: Single-Column Data Frame

Subsetting a data frame with df[, j] returns a plain vector, not a data frame, when j selects a single column:

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# Single column extraction drops to vector
one_col <- df[, "a"]
cat("Class of df[,'a']:", class(one_col), "\n")

# Fix: use drop = FALSE to keep it as a data frame
one_col_df <- df[, "a", drop = FALSE]
cat("Class with drop=FALSE:", class(one_col_df), "\n")

# Now apply works
result <- apply(one_col_df, 1, sum)
cat("apply result:", result, "\n")
```

Fix: Use df[, col, drop = FALSE] to prevent single-column data frames from becoming vectors.
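
Two related behaviours are worth knowing alongside drop = FALSE. The sketch below, which reuses the same df, shows that one-index subsetting such as df["a"] keeps a data frame, and that matrices drop dimensions in the same way data frames do. The names keep_df, col_vec, and col_mat are illustrative only.

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# One-index subsetting treats the data frame as a list of columns
# and always returns a data frame
keep_df <- df["a"]
cat("Class of df['a']:", class(keep_df), "\n")

# Matrices lose their dim attribute the same way
mat <- as.matrix(df)
col_vec <- mat[, "a"]
col_mat <- mat[, "a", drop = FALSE]
cat("mat[, 'a'] has no dim:", is.null(dim(col_vec)), "\n")
cat("dim with drop = FALSE:", dim(col_mat), "\n")

# The one-column matrix is still valid input for apply()
print(apply(col_mat, 2, sum))
```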

Cause 4: List Instead of Matrix

Lists need lapply() or sapply(), not apply():

```r
my_list <- list(a = 1:3, b = 4:6, c = 7:9)

# Wrong: apply on a list
# apply(my_list, 1, sum)  # Error

# Fix: use lapply for lists
result <- lapply(my_list, sum)
cat("lapply result:\n")
print(result)

# Or sapply for simplified output
result2 <- sapply(my_list, sum)
cat("\nsapply result:", result2, "\n")
```

Fix: Use lapply() for lists (returns list), or sapply() (returns simplified vector/matrix).
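
The "simplified vector/matrix" part deserves a short illustration. In this sketch, using the same my_list, sapply() returns a matrix when the function returns more than one value per element, and that matrix can then go through apply(). The choice of range() and diff() is arbitrary.

```r
my_list <- list(a = 1:3, b = 4:6, c = 7:9)

# One value per element: sapply() returns a named vector
sums <- sapply(my_list, sum)
print(sums)

# Several values per element: sapply() returns a matrix,
# with one column per list element
ranges <- sapply(my_list, range)
print(ranges)

# That matrix is valid input for apply()
spread <- apply(ranges, 2, diff)
print(spread)
```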

Cause 5: Object Became NULL or Empty

After filtering or a failed operation, your object might be NULL or have zero dimensions:

```r
df <- data.frame(x = 1:5, y = 6:10)

# Subsetting that returns nothing
empty <- df[df$x > 100, ]
cat("Rows:", nrow(empty), "\n")
cat("Class:", class(empty), "\n")

# apply on empty data frame won't error, but check your logic
result <- apply(as.matrix(empty), 2, mean)
cat("Result on empty:", result, "\n")

# NULL will error
cat("\nAlways check for NULL and empty objects before apply().\n")
cat("Use is.null(x) and nrow(x) > 0 as guards.\n")
```

Fix: Check is.null(x), nrow(x) > 0, and class(x) before calling apply().
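
One way to bundle these guards is a small wrapper. The sketch below is only an illustration: safe_apply() is a hypothetical helper, not a base R function, and the choice to warn and return NULL on empty input is an assumption you may want to change.

```r
# safe_apply() is a hypothetical helper, not part of base R
safe_apply <- function(x, margin, fun, ...) {
  if (is.null(x)) {
    stop("x is NULL; check the step that created it")
  }
  if (is.null(dim(x))) {
    stop("x has no dimensions (class: ", class(x)[1],
         "); use sapply() or lapply() instead")
  }
  if (nrow(x) == 0 || ncol(x) == 0) {
    warning("x is empty; returning NULL")
    return(NULL)
  }
  apply(x, margin, fun, ...)
}

df <- data.frame(x = 1:5, y = 6:10)
print(safe_apply(df, 2, mean))

# The empty subset now produces a warning and NULL instead of NaN values
empty <- df[df$x > 100, ]
res <- suppressWarnings(safe_apply(empty, 2, mean))
print(is.null(res))
```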

When to Use Which Apply Function

```r
# Quick reference:
cat("apply(X, MARGIN, FUN) - matrices/data frames, row/column-wise\n")
cat("lapply(X, FUN) - lists/vectors, returns a list\n")
cat("sapply(X, FUN) - lists/vectors, returns simplified result\n")
cat("vapply(X, FUN, FUN.VALUE) - like sapply but with type safety\n")
cat("tapply(X, INDEX, FUN) - grouped operations on a vector\n")
cat("mapply(FUN, ...) - multiple inputs in parallel\n")

# Examples
mat <- matrix(1:12, nrow = 3)
cat("\nMatrix:\n")
print(mat)
cat("Column sums (apply):", apply(mat, 2, sum), "\n")
cat("Built-in is faster: ", colSums(mat), "\n")
```
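
The quick reference lists tapply() and mapply() without an example. A minimal sketch of each follows; the scores, team labels, and base/power vectors are invented for illustration.

```r
# tapply(): apply a function to a vector, split by a grouping variable
scores <- c(88, 92, 75, 95, 81, 70)
team   <- c("red", "blue", "red", "blue", "red", "blue")
team_means <- tapply(scores, team, mean)
print(team_means)

# mapply(): walk several vectors in parallel, one element from each per call
base  <- c(2, 3, 4)
power <- c(3, 2, 1)
powers <- mapply(function(b, p) b^p, base, power)
print(powers)
```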


  

Practice Exercise

```r
# Exercise: You have this data and need the row-wise maximum.
# apply() won't work directly. Fix it.
data <- list(
  row1 = c(5, 8, 3),
  row2 = c(9, 2, 7),
  row3 = c(1, 6, 4)
)

# Write code to get the maximum value in each row:
```

Solution

```r
data <- list(
  row1 = c(5, 8, 3),
  row2 = c(9, 2, 7),
  row3 = c(1, 6, 4)
)

# Solution 1: sapply on the list
row_max <- sapply(data, max)
cat("Row maxima (sapply):", row_max, "\n")

# Solution 2: convert to matrix first, then use apply
mat <- do.call(rbind, data)
cat("\nAs matrix:\n")
print(mat)
row_max2 <- apply(mat, 1, max)
cat("Row maxima (apply):", row_max2, "\n")

# Solution 3: use pmax for parallel maximum (most efficient)
cat("Built-in pmax:", pmax(mat[, 1], mat[, 2], mat[, 3]), "\n")
```

**Explanation:** The data is a list, so `apply()` fails directly. Three solutions: (1) use `sapply()` which works on lists, (2) convert to matrix with `do.call(rbind, ...)` then use `apply()`, (3) use the vectorized `pmax()` function.

Summary

| Cause | Fix | Prevention |
|---|---|---|
| Vector instead of matrix | Use sapply() or vectorized ops | Check class(x) |
| Mixed-type data frame | Select numeric columns first | Separate numeric and character cols |
| Single column drops to vector | Use drop = FALSE | Always use drop = FALSE when subsetting |
| List object | Use lapply() or sapply() | Match apply function to data structure |
| NULL or empty object | Check before calling apply | Add guard conditions |

FAQ

Should I use apply() on data frames?

It works but is often not ideal. apply() coerces data frames to matrices, which makes all columns the same type. For data frames, prefer lapply() (column-wise), sapply(), or dplyr::across(). Use colMeans(), rowSums() etc. for common operations — they are faster.
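
As a rough illustration of the column-wise alternatives, the sketch below uses base R only (dplyr::across() is left out to avoid an extra dependency). The data frame and its bonus column are invented for the example.

```r
df <- data.frame(
  name  = c("Alice", "Bob", "Cara"),
  score = c(88, 92, 75),
  bonus = c(2, 0, 5),
  stringsAsFactors = FALSE
)

# sapply()/lapply() treat a data frame as a list of columns,
# so each column keeps its own type
col_types <- sapply(df, class)
print(col_types)

# Built-in helpers on the numeric columns only
num <- df[sapply(df, is.numeric)]
print(colMeans(num))
print(rowSums(num))
```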

Is apply() slower than vectorized operations?

Yes. Built-in vectorized functions like colSums(), rowMeans(), and pmax() are implemented in C and are much faster than apply(). Use apply() for custom functions, but prefer built-in alternatives when they exist.
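
You can check this on your own machine with a sketch like the one below. The matrix size and repeat count are arbitrary choices, and the timings will vary with hardware and R version, so treat the printed numbers as a rough comparison only.

```r
set.seed(42)
mat <- matrix(runif(200000), nrow = 1000)  # 1000 rows x 200 columns

# apply() and the built-in helpers give the same answers
same_cols <- isTRUE(all.equal(apply(mat, 2, sum), colSums(mat)))
same_rows <- isTRUE(all.equal(apply(mat, 1, mean), rowMeans(mat)))
cat("Column sums match:", same_cols, "\n")
cat("Row means match:  ", same_rows, "\n")

# Rough timings over 20 repeats
t_apply   <- system.time(for (i in 1:20) apply(mat, 1, mean))["elapsed"]
t_builtin <- system.time(for (i in 1:20) rowMeans(mat))["elapsed"]
cat("apply():   ", t_apply, "seconds\n")
cat("rowMeans():", t_builtin, "seconds\n")
```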

What's Next?

  1. R Warning: longer object length is not a multiple of shorter — vector recycling
  2. R Error: object 'x' not found — variable not found troubleshooting
  3. R Common Errors — the full reference of 50 common errors