data.table frank() in R: Fast Ranking of Vectors
data.table frank() in R returns the rank of every value in a vector or column, with seven tie-breaking methods and built-in descending order. It is a faster, more flexible drop-in for base rank().
frank(x)                          # rank a vector, average ties
frank(x, ties.method = "dense")   # no gaps after tied values
frank(x, ties.method = "min")     # all ties get the lowest rank
frank(dt, col)                    # rank a data.table by a column
frank(dt, -col)                   # descending rank
frank(.SD, -col)                  # rank inside a by-group
frankv(x, order = -1L)            # vector API, descending
Need explanation? Read on for examples and pitfalls.
What frank() does in R
data.table frank() assigns a numeric rank to each value. It takes a vector, list, data.frame, or data.table and returns one rank per element (one per row for a table), ordering from smallest to largest by default. The function is part of the data.table package and runs in C, so it stays fast even on millions of rows.
Unlike base rank(), frank() supports every common tie-breaking rule, accepts a data.table directly, and lets you rank by several columns at once. Its programmatic counterpart, frankv(), takes the ranking columns as a character vector.
Coming from pandas? The closest equivalent of frank() is the Series method .rank(), and ties.method = "dense" matches pandas method="dense".
frank() syntax and arguments
frank() has one data argument and three controls. The signature is frank(x, ..., na.last = TRUE, ties.method = "average"). Each argument changes a specific part of the ranking behaviour.
- x: the data to rank. A vector ranks directly. A data.table, data.frame, or list ranks by the columns named in ....
- ...: column names or expressions to rank by when x is a table. Prefix a column with - to rank it in descending order, as in frank(dt, -mpg).
- na.last: where missing values go. TRUE ranks NA last, FALSE ranks it first, and NA drops it.
- ties.method: how equal values share ranks. Options are "average" (default), "first", "last", "min", "max", "dense", and "random".
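The na.last settings are easiest to compare side by side. The sketch below uses a small made-up vector (x is illustrative, not taken from the article) and also tries na.last = "keep", which the data.table documentation lists as a further option that leaves NA in place.

```r
library(data.table)

# Illustrative vector with two missing values (made up for this sketch)
x <- c(4, 1, NA, 3, NA)

r_last  <- frank(x)                    # default na.last = TRUE: NA ranked after everything else
r_first <- frank(x, na.last = FALSE)   # NA ranked before everything else
r_keep  <- frank(x, na.last = "keep")  # NA positions stay NA in the result
r_drop  <- frank(x, na.last = NA)      # NA removed, so the result is shorter

print(r_last)
print(r_first)
print(r_keep)
print(r_drop)
```

Note that the na.last = NA result no longer lines up with the input positions, so it is safer to use on a standalone vector than when assigning back into a table column.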
The programmatic form, frankv(x, cols, order = 1L, ...), works the same way except that it takes column names as a character vector and an order argument (1L ascending, -1L descending). Use frankv() when you build the column list programmatically.
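As a rough illustration of the programmatic form, the sketch below keeps the column names and directions in variables (rank_cols and rank_order are hypothetical names chosen for this example) and checks that frankv() agrees with the equivalent frank() call on mtcars.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Column names and directions held in variables, e.g. assembled from user input
rank_cols  <- c("cyl", "hp")
rank_order <- c(1L, -1L)   # cyl ascending, hp descending

r_prog <- frankv(dt, cols = rank_cols, order = rank_order, ties.method = "min")

# The same ranking written interactively with frank()
r_inter <- frank(dt, cyl, -hp, ties.method = "min")

print(identical(r_prog, r_inter))
```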
frank() examples by use case
Start with a plain vector. Pass any numeric vector and frank() returns its ranks. With the default "average" method, tied values share the mean of the ranks they would occupy.
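A minimal sketch of the default behaviour, using an illustrative vector (the values are an assumption made for this example) in which 10 and 20 each appear twice:

```r
library(data.table)

# Illustrative data: 10 and 20 are each tied
x <- c(10, 20, 10, 30, 20)

r_avg <- frank(x)   # default ties.method = "average"
print(r_avg)        # 1.5 3.5 1.5 5.0 3.5
```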
The two 10s would take ranks 1 and 2, so each gets 1.5. The ties.method argument changes that. "dense" never leaves gaps, "min" gives every tie the lowest available rank, and "first" breaks ties by position.
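The three alternatives can be compared on one small vector. This is a sketch on made-up values, not output from the original article:

```r
library(data.table)

# Illustrative data: 10 and 20 are each tied
x <- c(10, 20, 10, 30, 20)

r_dense <- frank(x, ties.method = "dense")  # 1 2 1 3 2  (no gaps)
r_min   <- frank(x, ties.method = "min")    # 1 3 1 5 3  (ties share the lowest rank)
r_first <- frank(x, ties.method = "first")  # 1 3 2 5 4  (earlier position wins)

print(rbind(dense = r_dense, min = r_min, first = r_first))
```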
Rank a column inside a data.table. Pass the table as x and the column as .... Prefix the column with - to rank from highest to lowest, which is what "rank 1 = best" usually means.
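One way this might look on the built-in mtcars data. The names dt, ranks, and mpg_rank are chosen for this sketch, and "min" is used only to keep the ranks whole numbers:

```r
library(data.table)

# Keep the car names, which mtcars stores as row names
dt <- as.data.table(mtcars, keep.rownames = "car")

# Table as x, column as ...; the leading - requests descending order
ranks <- frank(dt, -mpg, ties.method = "min")
dt[, mpg_rank := ranks]

# Most fuel-efficient cars first
print(head(dt[order(mpg_rank), .(car, mpg, mpg_rank)], 5))
```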
Rank within groups. Combine frank() with .SD and by to rank rows inside each group. Here every car is ranked by mpg against other cars with the same cylinder count.
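A sketch of the grouped version on mtcars, assuming the same table setup as before (rank_in_cyl is a column name invented for this example):

```r
library(data.table)

dt <- as.data.table(mtcars, keep.rownames = "car")

# .SD is the subset of rows for the current cyl group,
# so each car is ranked only against cars with the same cylinder count
dt[, rank_in_cyl := frank(.SD, -mpg, ties.method = "min"), by = cyl]

# Best mpg within each cylinder group
print(dt[rank_in_cyl == 1, .(cyl, car, mpg)][order(cyl)])
```

Inside j you could also write frank(-mpg, ties.method = "min"), since mpg is already a plain vector there.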
frank() vs base rank()
frank() covers what rank() does and adds three things it cannot do: dense ties, direct table input, and ranking by several columns. Base rank() works on vectors only and has no "dense" tie method. The table below shows where frank() pulls ahead.
| Feature | frank() | base rank() |
|---|---|---|
| Speed on large vectors | Fast, multi-threaded | Slower |
| ties.method = "dense" | Supported | Not supported |
| Rank a data.table column | Direct | Needs rank(dt$col) |
| Rank by multiple columns | Yes | No |
| Descending order | -col or order = -1L | rank(-x) only |
For a one-off rank on a short vector, base rank() is fine. Reach for frank() when the data is large, when you need dense ranks, or when ranking happens inside a data.table workflow.
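To make the comparison concrete, the sketch below checks that the two functions agree on a small vector without missing values, then shows one common base R workaround for dense ranks. The match()/unique() idiom is an illustration chosen here, not something the article prescribes.

```r
library(data.table)

# Illustrative data with ties and no missing values
x <- c(10, 20, 10, 30, 20)

# Default settings: both functions average tied ranks
same_default <- isTRUE(all.equal(frank(x), rank(x)))
print(same_default)

# Dense ranks: built in for frank(), hand-rolled in base R
dense_frank <- frank(x, ties.method = "dense")
dense_base  <- match(x, sort(unique(x)))
print(rbind(frank = dense_frank, base = dense_base))
```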
frank() belongs to the same data.table toolkit as setorder() and frollmean(), so mixing it with base rank() only adds friction.
Common pitfalls
Watch the default tie method. With ties.method = "average", tied values return decimals like 3.5. If a later step expects whole numbers, switch to "min" or "dense".
A misspelled tie method errors immediately. Only the seven documented strings are valid.
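A short sketch of that failure, using "smallest" as the invalid value and tryCatch() so the script keeps running (the variable names are illustrative):

```r
library(data.table)

x <- c(10, 20, 10, 30, 20)

# "smallest" is not one of the documented methods, so this call fails
bad <- tryCatch(
  frank(x, ties.method = "smallest"),
  error = function(e) e
)
print(inherits(bad, "error"))   # TRUE
print(conditionMessage(bad))    # lists the accepted values

# A valid method name works as expected
good <- frank(x, ties.method = "min")
print(good)                     # 1 3 1 5 3
```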
For example, ties.method = "smallest" is rejected. The fix is to use "min", the method most people mean by "smallest".
With the default na.last = TRUE, a missing value receives a real numeric rank at the end, not NA. Set na.last = NA to drop missing values from the result entirely.
Try it yourself
Try it: Rank the cars in mtcars by horsepower (hp) so that the most powerful car is rank 1, breaking ties densely. Save the ranks to ex_ranks.
Solution
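One possible solution, reconstructed here as a sketch (cars_dt is a name chosen for this example; only ex_ranks comes from the exercise):

```r
library(data.table)

cars_dt <- as.data.table(mtcars, keep.rownames = "car")

# -hp ranks the most powerful car first; "dense" leaves no gaps after ties
ex_ranks <- frank(cars_dt, -hp, ties.method = "dense")

# The car(s) holding rank 1
print(cars_dt[ex_ranks == 1, .(car, hp)])
```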
Explanation: Prefixing hp with - ranks from highest to lowest, so the strongest engine gets rank 1. The "dense" method keeps the ranks as consecutive integers with no gaps after ties.
Related data.table functions
frank() sits next to data.table's ordering and grouping helpers. Use these when ranking is not quite the operation you need.
- setorder(): sort the rows of a table in place instead of ranking them.
- rowid(): number rows 1, 2, 3 within each group.
- rleid(): assign run-length IDs to consecutive equal values.
- uniqueN(): count distinct values, often paired with dense ranks.
- shift(): lag or lead a column, useful for rank-change calculations.
See the official data.table frank reference for the full argument list.
FAQ
What is the difference between frank and rank in R?
frank() is data.table's ranking function and rank() is base R's. They return the same result for a simple vector without missing values under the default settings. frank() adds multi-threaded speed, the "dense" tie method, direct support for ranking data.table columns, and ranking by several columns at once. Base rank() handles only vectors.
How do I rank in descending order with frank?
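Prefix the column with a minus sign: frank(dt, -mpg) ranks the highest mpg as rank 1. For the frankv() API, pass order = -1L instead. On a plain numeric vector, frank(-x) works because frank() ranks ascending by default, so flipping the sign flips the order. On a table column, the leading - is read as a descending marker rather than as arithmetic, so it also works for non-numeric columns.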
What does ties.method = "dense" do?
Dense ranking assigns consecutive integers with no gaps after tied values. If two values tie for rank 1, the next value gets rank 2, not rank 3. This contrasts with "min", which would jump to rank 3. Dense ranks are ideal when you want compact group labels rather than positions.
Can frank rank multiple columns at once?
Yes. Pass several columns to the ... argument: frank(dt, cyl, -mpg) ranks first by cyl ascending, then breaks ties by mpg descending. This mirrors how setorder() sorts by multiple keys, and base rank() cannot do it.
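A sketch of that call on mtcars (cyl_mpg_rank and r are names invented for this example). The final check compares the ranks against base order() on the same keys:

```r
library(data.table)

dt <- as.data.table(mtcars, keep.rownames = "car")

# Primary key cyl ascending, ties broken by mpg descending
r <- frank(dt, cyl, -mpg, ties.method = "min")
dt[, cyl_mpg_rank := r]

print(head(dt[order(cyl_mpg_rank), .(car, cyl, mpg, cyl_mpg_rank)], 5))

# Walking the rows in (cyl, -mpg) order should never see the rank decrease
ranks_in_key_order <- r[order(dt$cyl, -dt$mpg)]
print(!is.unsorted(ranks_in_key_order))
```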
Does frank work on character or date columns?
Yes. frank() ranks any orderable type, including characters (alphabetical), factors (by level), and Date or POSIXct values (chronological). The same ties.method and descending rules apply, so frank(dt, -order_date) ranks the most recent date as rank 1.
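As a small illustration, the sketch below ranks a character column and a Date column. The orders table and its values are made up for this example.

```r
library(data.table)

# Hypothetical data for illustration only
orders <- data.table(
  customer   = c("carol", "alice", "bob"),
  order_date = as.Date(c("2024-03-01", "2024-01-15", "2024-02-10"))
)

name_rank    <- frank(orders, customer)      # alphabetical: 3 1 2
recency_rank <- frank(orders, -order_date)   # most recent first: 1 3 2

print(cbind(orders, name_rank, recency_rank))
```

The table form matters for dates: base R does not define unary minus for Date objects, so the - prefix on a column name is the convenient way to rank them in descending order.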