Order Statistics in R: Min, Max, Sample Quantiles — Theory & Simulation
Order statistics are the values of your sample once you sort it from smallest to largest. The minimum, the maximum, the median, and every sample quantile are all order statistics — and each one has a known probability distribution that you can derive on paper and verify by simulation in R.
What exactly is an order statistic?
Take any sample of n numbers and sort them. The smallest value is called the first order statistic, written $X_{(1)}$. The largest is the n-th order statistic, $X_{(n)}$. In between sit $X_{(2)}, X_{(3)}, \ldots, X_{(n-1)}$, in sorted order. The subscript is always in parentheses to distinguish $X_{(k)}$ (the k-th smallest) from $X_k$ (the k-th sample point in the order you collected it). Let's see all of them in one sample.
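A minimal sketch (the sample size, distribution, and seed here are arbitrary):

```r
set.seed(42)          # arbitrary seed, for reproducibility
x <- rnorm(7)         # a sample of n = 7 standard normal draws
sort(x)               # the vector of order statistics X_(1), ..., X_(7)
min(x) == sort(x)[1]  # TRUE: the minimum is the first order statistic
max(x) == sort(x)[7]  # TRUE: the maximum is the last
```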
The sorted vector is the vector of order statistics. The first entry is the sample minimum, the last is the sample maximum, and the middle entries are the sample quartiles, median, and every other quantile you might want. This single reframe — "sorted sample = order statistics" — is the whole foundation. Everything that follows is about the distribution each of these sorted values follows across many repeated samples.
Try it: Draw a sample of 12 values from rexp(12, rate = 1) and extract its 7th order statistic into ex_x7.
Click to reveal solution
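One way to do it (the seed is arbitrary):

```r
set.seed(1)                 # arbitrary seed
samp <- rexp(12, rate = 1)  # 12 draws from Exp(1)
ex_x7 <- sort(samp)[7]      # the 7th-smallest value, X_(7)
ex_x7
```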
Explanation: sort() returns the values in ascending order; indexing with [7] picks out the 7th-smallest value, which is $X_{(7)}$ by definition.
How is the maximum distributed?
The maximum is the most intuitive order statistic to tackle first. Ask: what is the probability that $X_{(n)}$ is at most $x$? That event happens exactly when every value in the sample is at most $x$ — if even one exceeded $x$, the max would too. Because sample points are independent, you can multiply their probabilities together.
$$F_{X_{(n)}}(x) = P(X_{(n)} \le x) = [F(x)]^n$$
Where:
- $F_{X_{(n)}}(x)$ = CDF of the sample maximum
- $F(x)$ = CDF of the underlying distribution each $X_i$ was drawn from
- $n$ = sample size
The CDF gets pushed to the right as $n$ grows, because the max of more draws tends to be larger. Let's verify this by simulating 10,000 samples of size 5 from a uniform distribution on $[0, 1]$ — where $F(x) = x$ — and compare the empirical CDF to the theoretical $x^5$.
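A sketch of that simulation (seed and replication count are arbitrary choices):

```r
set.seed(101)                                # arbitrary seed
maxima <- replicate(10000, max(runif(5)))    # 10,000 maxima of size-5 uniform samples
plot(ecdf(maxima), col = "blue",
     main = "CDF of the max of 5 uniforms")  # empirical step function
curve(x^5, from = 0, to = 1,
      col = "red", add = TRUE)               # theoretical [F(x)]^n = x^5
```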
The blue step function (empirical) hugs the red curve (theoretical $x^5$) across the whole $[0,1]$ range. Because $x^5$ stays near zero for most of the interval before shooting up near 1, the maximum of 5 uniform draws is heavily skewed toward 1 — exactly what intuition says. If you doubled $n$, the red curve would press even further right, and most maxima would cluster near the upper bound.
Try it: Simulate the distribution of the maximum of 20 rexp(20, rate = 1) samples, then find the 95th percentile of those maxima.
Click to reveal solution
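A possible solution (seed arbitrary):

```r
set.seed(2)                                             # arbitrary seed
exp_maxima <- replicate(10000, max(rexp(20, rate = 1))) # max of 20 Exp(1) draws, 10,000 times
quantile(exp_maxima, 0.95)  # theory gives -log(1 - 0.95^(1/20)), about 5.97
```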
Explanation: replicate() repeats the "draw 20 from Exp(1), take max" experiment 10,000 times. The 95th percentile of those maxima tells you the level the sample max exceeds only 5% of the time — a practical threshold for extreme-value thinking.
How is the minimum distributed?
The minimum is symmetric to the maximum in its logic. The event "$X_{(1)} > x$" happens exactly when every sample point exceeds $x$, so $P(X_{(1)} > x) = [1 - F(x)]^n$. Flip the probability to get the CDF.
$$F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n$$
Where:
- $F_{X_{(1)}}(x)$ = CDF of the sample minimum
- $F(x)$ = CDF of the underlying distribution
- $n$ = sample size
For uniform $[0, 1]$ samples, this becomes $1 - (1-x)^n$, which you may recognise as the CDF of a $\text{Beta}(1, n)$ random variable. That connection — uniform order statistics follow beta distributions — is the key that unlocks the general case. Let's verify the minimum of a size-5 uniform sample against a $\text{Beta}(1, 5)$ density.
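A sketch of that check (seed arbitrary):

```r
set.seed(3)                                 # arbitrary seed
minima <- replicate(10000, min(runif(5)))   # 10,000 minima of size-5 uniform samples
hist(minima, breaks = 40, freq = FALSE,
     main = "Min of 5 uniforms vs Beta(1, 5)")
curve(dbeta(x, 1, 5), col = "red", add = TRUE)  # theoretical Beta(1, 5) density
mean(minima)                                # theory says 1/(n+1) = 1/6
```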
The histogram matches the red $\text{Beta}(1, 5)$ density almost perfectly. The distribution of the minimum concentrates near zero, which makes sense: the smaller of five uniform values is pulled toward the low end. Its theoretical mean is $1/(n+1) = 1/6 \approx 0.167$, and the simulated mean will land right there.
R also ships the beta CDF and quantile function directly as pbeta() and qbeta(), so these theoretical claims are one function call away from verification.
Try it: Simulate 10,000 minima of size-10 Uniform(0,1) samples. Compare the sample mean to the theoretical mean $1/(n+1)$.
Click to reveal solution
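A possible solution (seed arbitrary):

```r
set.seed(4)                                # arbitrary seed
min10 <- replicate(10000, min(runif(10)))  # minima of size-10 uniform samples
mean(min10)                                # simulated mean
1 / (10 + 1)                               # theoretical mean: 0.0909...
```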
Explanation: The min of 10 uniforms has mean $1/11 \approx 0.0909$, and the simulated average confirms it. The tight agreement shows the theory is not just asymptotic — it's exact.
How is the k-th order statistic distributed?
Min and max are the two easy cases. The general k-th order statistic density is uglier to write down but follows the same logic: for $X_{(k)}$ to equal $x$, you need exactly $k-1$ sample points below $x$, one point at $x$, and $n-k$ points above it. Count the ways this can happen ($n! / ((k-1)!(n-k)!)$) and multiply by the probabilities.
$$f_{X_{(k)}}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, [F(x)]^{k-1}\, [1 - F(x)]^{n-k}\, f(x)$$
Where:
- $f_{X_{(k)}}(x)$ = density of the k-th order statistic at $x$
- $F(x)$, $f(x)$ = CDF and density of the underlying distribution
- $n$ = sample size
- $k$ = rank (1 for min, $n$ for max)
For a Uniform(0,1) sample, $F(x) = x$ and $f(x) = 1$, so the density reduces to the Beta(k, n-k+1) density. This is the master result: every uniform order statistic is a beta random variable with shape parameters $k$ and $n-k+1$. Let's confirm it by simulating $X_{(2)}$, $X_{(5)}$, and $X_{(8)}$ for $n = 10$ and overlaying the corresponding Beta densities.
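A sketch of that confirmation (seed arbitrary):

```r
set.seed(5)                                # arbitrary seed
n <- 10; reps <- 10000
sorted <- replicate(reps, sort(runif(n)))  # n x reps matrix: row k holds X_(k)
par(mfrow = c(1, 3))
for (k in c(2, 5, 8)) {
  hist(sorted[k, ], breaks = 40, freq = FALSE,
       main = paste0("X_(", k, ") vs Beta(", k, ", ", n - k + 1, ")"))
  curve(dbeta(x, k, n - k + 1), col = "red", add = TRUE)
}
```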
Three histograms, three tight fits. $X_{(2)}$ sits low (it's close to the min), $X_{(5)}$ centres near 0.5 (it's the near-median for $n = 10$), and $X_{(8)}$ sits high (close to the max). Each follows a $\text{Beta}(k, n-k+1)$ distribution exactly. The mean of $X_{(k)}$ is $k / (n+1)$; for $k = 5$, $n = 10$, that's $5/11 \approx 0.455$, slightly below 0.5 because with even $n$ the sample median falls between $X_{(5)}$ and $X_{(6)}$, and $X_{(5)}$ is the lower of that pair.
Try it: For a sample of size $n = 15$ from Uniform(0,1), what Beta parameters describe $X_{(5)}$? Verify by simulation: the theoretical mean is $5/16 = 0.3125$.
Click to reveal solution
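A possible solution (seed arbitrary):

```r
set.seed(6)                                 # arbitrary seed
x5 <- replicate(10000, sort(runif(15))[5])  # X_(5) from n = 15 uniforms
c(shape1 = 5, shape2 = 15 - 5 + 1)          # Beta(5, 11)
mean(x5)                                    # theory: 5/16 = 0.3125
```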
Explanation: $X_{(5)}$ from n=15 uniforms follows $\text{Beta}(5, 11)$. Its mean is $5/(15+1) = 0.3125$. Simulation confirms to three decimals.
How do sample quantiles relate to order statistics?
Here's where the abstract theory gets practical. Every time you call quantile(x, 0.75) in R, you are computing a weighted average of order statistics. The 0% quantile is literally $X_{(1)}$. The 100% quantile is $X_{(n)}$. The median is the middle order statistic (or the average of the two middle ones). All nine type options in R are just different rules for interpolating between consecutive order statistics.
The 0% and 100% quantiles are always the min and max, no matter which type you pass — every rule agrees at the endpoints. The median of an odd-sized sample is always $X_{((n+1)/2)}$, again regardless of type. The nine quantile() types only differ in between, and mainly in how they handle non-integer rank positions.
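The sample used in the comparison below isn't shown, so here is a hypothetical 10-value sample chosen so that its 6th and 7th order statistics are 8 and 9, which reproduces the quoted results:

```r
x <- c(2, 3, 5, 6, 7, 8, 9, 12, 14, 20)  # hypothetical sample; X_(6) = 8, X_(7) = 9
quantile(x, 0.60, type = 1)  # 8:   step function picks X_(6)
quantile(x, 0.60, type = 4)  # 8:   interpolation lands exactly on X_(6)
quantile(x, 0.60, type = 7)  # 8.4: blend of X_(6) and X_(7)
```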
R's default is type = 7, which interpolates between order statistics at plotting positions $(i - 1)/(n - 1)$. type = 4 is pure linear interpolation of the empirical CDF, and type = 1 is a step function picking individual order statistics. All use order statistics under the hood. For the 60th percentile of this 10-value sample, types 1 and 4 both return 8 (the 6th order statistic, since 60% × 10 = 6). Type 7 returns 8.4, a linear blend of the 6th and 7th order statistics. None of these is wrong; they're different conventions for the same underlying question: which order statistic, or combination of them, estimates the population's 60th percentile?
Tool defaults to remember: R uses type = 7; SAS's historical default corresponds to type 3; Python's NumPy defaults to type 7 (matches R); Excel's PERCENTILE.INC also matches R's type 7. If your numbers don't match across tools, check the quantile type first.
Try it: For type = 1 (step function, no interpolation), the 25th percentile of 1:20 should equal the 5th order statistic. Compute it by hand from the order statistic formula, then check with quantile(..., type = 1).
Click to reveal solution
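A possible solution:

```r
p <- 0.25; n <- 20
ceiling(p * n)               # rank 5, so the answer is X_(5)
sort(1:20)[ceiling(p * n)]   # 5, by the order statistic formula
quantile(1:20, p, type = 1)  # 5, matching
```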
Explanation: Type 1 is the discrete step function: it picks $X_{(\lceil p n \rceil)}$. For $p = 0.25$ and $n = 20$, that's $X_{(5)} = 5$. No interpolation, just plain order statistic lookup.
Where do order statistics show up in practice?
Three places show up over and over. The first is non-parametric confidence intervals for the median — you don't need the population to be normal, because you can bracket the median with two order statistics and know the coverage probability from the binomial distribution. The second is extreme value analysis, where the maximum of a sample (annual flood height, peak wind speed) is the object of direct interest. The third is robust statistics, where trimmed means and ranks replace sensitive averages.
Let's build the non-parametric CI for the median from scratch. If you pick order statistics $X_{(l)}$ and $X_{(u)}$, the probability that they bracket the true population median equals the probability that at least $l$ and at most $u - 1$ of your sample points fall below the median — a binomial tail probability with success probability 0.5 (since the median is defined as the 50% mark).
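For a sample of size $n = 20$, matching the interval reported next, the computation might look like this (the data themselves are hypothetical):

```r
set.seed(7)                        # hypothetical data of size n = 20
x <- rnorm(20, mean = 10, sd = 2)
n <- 20; l <- 6; u <- 15
coverage <- pbinom(u - 1, n, 0.5) - pbinom(l - 1, n, 0.5)
coverage                           # 0.9586...
sort(x)[c(l, u)]                   # the distribution-free CI for the median
```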
The distribution-free 95% CI for the population median is $[X_{(6)}, X_{(15)}]$ — entirely built from two order statistics and a binomial probability. No normality assumption required. The coverage is 95.86% (slightly above 95%, because discrete ranks only allow discrete coverage levels); this is the classical sign-test interval for the median. Contrast this with the "mean ± 1.96 × SE" interval, which assumes the sampling distribution is approximately normal — if you can't justify that assumption, this order-statistic approach still works.
The same distribution-free logic is how wilcox.test(..., conf.int = TRUE) gets its CI for the median: peek under the hood and you'll find rank and order statistics (applied to pairwise Walsh averages) all the way down.
Try it: For a sample of size $n = 30$, find the pair $(l, u)$ that gives at least 95% coverage for the median CI using order statistics.
Click to reveal solution
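A possible solution, scanning symmetric pairs from wide to narrow:

```r
n <- 30
cov_fun <- function(l, u) pbinom(u - 1, n, 0.5) - pbinom(l - 1, n, 0.5)
# symmetric candidates pair l with u = n - l + 1
data.frame(l = 8:11, u = n - (8:11) + 1,
           coverage = sapply(8:11, function(l) cov_fun(l, n - l + 1)))
# l = 10, u = 21 is the narrowest pair with coverage >= 0.95 (0.9572)
```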
Explanation: The symmetric bounds give 95.72% coverage, the smallest pair that clears 95%. The non-parametric CI for the median of a size-30 sample is $[X_{(10)}, X_{(21)}]$.
Practice Exercises
Exercise 1: A reusable order statistic simulator
Write a function simulate_order_stat(k, n, dist, reps) where dist is a string like "norm", "exp", or "unif". The function should draw reps samples of size n from that distribution, return the k-th order statistic of each. Use it to simulate $X_{(3)}$ from $n = 8$ Normal(0,1) samples and plot the density. Save the simulated values to my_x3.
Click to reveal solution
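One possible implementation; extra distribution arguments (e.g. rate) could be passed through via ...:

```r
simulate_order_stat <- function(k, n, dist, reps = 10000, ...) {
  rfun <- get(paste0("r", dist))   # e.g. "norm" -> rnorm, "exp" -> rexp
  replicate(reps, sort(rfun(n, ...))[k])
}
set.seed(8)                        # arbitrary seed
my_x3 <- simulate_order_stat(3, 8, "norm")
plot(density(my_x3), main = "X_(3) of 8 standard normals")
mean(my_x3)                        # tabulated expectation is about -0.47
```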
Explanation: get(paste0("r", dist)) looks up the random generator by name. The density peaks below zero because $X_{(3)}$ of 8 normals is the third-smallest of eight draws — in the left tail region. The theoretical mean is tabulated at roughly −0.47 (Arnold et al. 2008).
Exercise 2: Correlation between two order statistics
For Uniform(0,1) samples, the correlation between $X_{(i)}$ and $X_{(j)}$ (with $i < j$) has a known closed form:
$$\text{Cor}(X_{(i)}, X_{(j)}) = \sqrt{\frac{i\,(n-j+1)}{j\,(n-i+1)}}$$
Simulate 10,000 samples of $n = 10$ uniforms, extract $X_{(3)}$ and $X_{(7)}$ from each, and check that the empirical correlation matches the theoretical value. Save the result to my_corr and the theory to my_theo_corr.
Click to reveal solution
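A possible solution (seed arbitrary):

```r
set.seed(9)                                # arbitrary seed
n <- 10; reps <- 10000
sorted <- replicate(reps, sort(runif(n)))  # row k holds X_(k) across replications
my_corr <- cor(sorted[3, ], sorted[7, ])
my_theo_corr <- sqrt(3 * (n - 7 + 1) / (7 * (n - 3 + 1)))
c(simulated = my_corr, theory = my_theo_corr)  # both near 0.463
```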
Explanation: Order statistics from the same sample are positively correlated — if $X_{(3)}$ is high, the whole sample is shifted up and $X_{(7)}$ tends to be high too. With 10,000 replications the Monte Carlo error on a correlation is roughly 0.01, so the simulation matches theory to about two decimal places — the kind of agreement that builds trust in the formula.
Exercise 3: Expected maximum of a normal sample
For $n = 20$ Normal(0,1) samples, the expected value of the maximum is not a textbook closed form, but simulation gives it instantly. Estimate $E[X_{(20)}]$ by averaging 10,000 simulated maxima. The published value is about 1.8675 (from normal order statistic tables, Harter 1961). Save the simulated maxima to my_maxes and the estimated expectation to my_emean.
Click to reveal solution
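A possible solution (seed arbitrary):

```r
set.seed(10)                                  # arbitrary seed
my_maxes <- replicate(10000, max(rnorm(20)))  # 10,000 maxima of size-20 normal samples
my_emean <- mean(my_maxes)
my_emean                                      # published value is about 1.8675
```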
Explanation: 10,000 replications give a Monte Carlo standard error of about $\sigma / \sqrt{10000} \approx 0.005$, so the estimate is within noise of the published value. When no closed form exists, simulation plus enough replications delivers a tight numerical answer.
Complete Example: Analysing Annual River Flood Maxima
Put everything together on a realistic scenario. Suppose you have daily river flow data for 50 years. Each year, only the annual maximum flow matters for flood-risk planning. The sample maximum of a year's data is an order statistic, and across years you build up a sampling distribution of maxima that can be summarised — mean, quantiles, return periods — using the tools above.
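A sketch of that pipeline. The flow model (lognormal daily flows) and seed are illustrative assumptions, so the exact numbers will differ from those quoted next:

```r
set.seed(11)                                     # illustrative seed and flow model
years <- 50
annual_max <- replicate(years,
  max(rlnorm(365, meanlog = 5.5, sdlog = 0.4)))  # each year's peak daily flow
mean(annual_max)                                 # average annual flood
quantile(annual_max, 0.99)                       # empirical "100-year flood"
l <- 18; u <- 33                                 # order statistic CI bounds for n = 50
sort(annual_max)[c(l, u)]                        # distribution-free CI for the median
pbinom(u - 1, years, 0.5) - pbinom(l - 1, years, 0.5)  # coverage, about 0.967
```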
The annual maxima average around 600 units, with the empirical "100-year flood" (the 99th percentile — a quantile built from order statistics) at about 845. The non-parametric 95% CI for the median annual flood is $[X_{(18)}, X_{(33)}] = [570.5, 623.4]$ — no normality assumption needed. Every statistic here — median, 99th percentile, CI bounds — is an order statistic or derived from one. The whole risk-analysis pipeline runs on order statistics.
Summary
| Concept | Formula | R idiom | Typical use |
|---|---|---|---|
| $X_{(1)}$ (min) | $F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n$ | min(x) or sort(x)[1] | Lower bound, extremes |
| $X_{(n)}$ (max) | $F_{X_{(n)}}(x) = [F(x)]^n$ | max(x) or sort(x)[n] | Flood/wind peaks |
| $X_{(k)}$ (k-th) | density $\propto F(x)^{k-1}(1-F(x))^{n-k} f(x)$ | sort(x)[k] | General OS problems |
| $X_{(k)}$ uniform | $\text{Beta}(k, n-k+1)$ | rbeta(reps, k, n-k+1) | Theoretical benchmark |
| Sample quantile | weighted avg of two consecutive $X_{(k)}$ | quantile(x, p, type = t) | Percentiles, CIs |
| NP median CI | $[X_{(l)}, X_{(u)}]$ with binomial coverage | wilcox.test(..., conf.int = TRUE) | Distribution-free inference |
The one mental model that covers all of it: sort your sample, and every quantile, every extreme, every robust statistic is a function of the sorted values. Simulate the sorting across many samples and you recover the entire distributional theory empirically.
References
- Wikipedia — Order statistic. Link
- Siegrist, K. — Probability, Mathematical Statistics, and Stochastic Processes, Order Statistics chapter. Link
- Hyndman, R. J., & Fan, Y. (1996). "Sample quantiles in statistical packages." The American Statistician, 50(4), 361–365.
- stat.ethz.ch — quantile() reference. Link
- Arnold, B. C., Balakrishnan, N., & Nagaraja, H. N. (2008). A First Course in Order Statistics. SIAM Classics in Applied Mathematics.
- David, H. A., & Nagaraja, H. N. (2003). Order Statistics, 3rd ed. Wiley.
- Casella, G., & Berger, R. L. (2002). Statistical Inference, 2nd ed., Chapter 5.4. Duxbury.
- Harter, H. L. (1961). "Expected values of normal order statistics." Biometrika, 48, 151–165.
Continue Learning
- Sampling Distributions in R — parent post; order statistics are the sampling distribution of sorted values.
- Central Limit Theorem in R — complementary result about the sampling distribution of the mean (not an order statistic, but closely compared).
- Fitting Distributions to Data in R — a parametric counterpart to the non-parametric, order-statistic approach used here for quantile inference.