The Central Limit Theorem: Why the Normal Distribution Is Everywhere
We state and prove the Central Limit Theorem — the reason the bell curve appears throughout nature, science, and statistics — and explore its assumptions, generalizations, and applications.
The Theorem
The Central Limit Theorem (Lindeberg-Lévy)
Let $X_1, X_2, \dots$ be independent and identically distributed random variables with mean $\mu$ and finite variance $\sigma^2 > 0$. Let $S_n = X_1 + \cdots + X_n$. Then:

$$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1) \quad \text{as } n \to \infty.$$
In words: regardless of the distribution of the individual $X_i$, their normalized sum converges in distribution to a standard normal. This is why the bell curve appears everywhere — it is the universal attractor for sums of independent random variables.
What Does "Convergence in Distribution" Mean?
The notation $Z_n \xrightarrow{d} Z$ means that for every $x$ where the CDF of $Z$ is continuous:

$$\lim_{n \to \infty} P(Z_n \le x) = P(Z \le x) = \Phi(x),$$

where $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$ is the standard normal CDF.
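This convergence can be checked numerically. The sketch below uses only Python's standard library; the exponential distribution, sample size, trial count, and seed are arbitrary illustrative choices. It compares the empirical CDF of the normalized sum with $\Phi$ at a few points:

```python
import math
import random

random.seed(0)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def normalized_sum(n, draw, mu, sigma):
    """One sample of (S_n - n*mu) / (sigma * sqrt(n))."""
    s = sum(draw() for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

# Exponential(1) has mean 1 and variance 1 -- heavily skewed, far from normal.
n, trials = 200, 10000
samples = [normalized_sum(n, lambda: random.expovariate(1.0), 1.0, 1.0)
           for _ in range(trials)]

# Empirical CDF of the normalized sum vs. the standard normal CDF.
for x in (-1.0, 0.0, 1.0):
    empirical = sum(s <= x for s in samples) / trials
    print(f"x = {x:+.1f}   empirical = {empirical:.3f}   Phi(x) = {phi(x):.3f}")
```

Even for a starting distribution as asymmetric as the exponential, the two CDFs agree to a couple of decimal places at $n = 200$.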
Intuition
Why Sums Become Normal
Consider rolling a single die — the distribution is uniform on $\{1, 2, \dots, 6\}$. Now roll $n$ dice and sum them:
- $n = 1$: flat distribution (uniform)
- $n = 2$: triangular distribution
- $n = 3$: already visibly bell-shaped
- $n = 10$: nearly indistinguishable from a Gaussian
The CLT explains this universality: the specific shape of the original distribution is "washed out" by summation. Only the mean and variance survive in the limit.
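A quick simulation makes the dice progression concrete. In the sketch below (trial counts and the choice to probe the central value are illustrative), the empirical probability at the center is compared with the normal density that has the matching mean $3.5n$ and variance $\frac{35}{12}n$:

```python
import math
import random
from collections import Counter

random.seed(1)

def dice_sum_distribution(n_dice, trials=50000):
    """Empirical pmf of the sum of n_dice fair six-sided dice."""
    counts = Counter(sum(random.randint(1, 6) for _ in range(n_dice))
                     for _ in range(trials))
    return {total: c / trials for total, c in counts.items()}

# Compare the empirical probability at the central value with the normal
# density that has matching mean 3.5*n and variance (35/12)*n.
for n in (1, 2, 10):
    dist = dice_sum_distribution(n)
    mu, var = 3.5 * n, (35 / 12) * n
    center = round(mu)
    normal_pdf = math.exp(-(center - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    print(f"n = {n:2d}   P(sum = {center}) = {dist.get(center, 0):.4f}   normal pdf = {normal_pdf:.4f}")
```

At $n = 1$ the uniform pmf and the normal density disagree badly; by $n = 2$ they already match near the center, and at $n = 10$ the agreement is excellent.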
A Precise Statement
If $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ is the sample mean, the CLT equivalently says:

$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2),$$

or in the approximate form used in practice:

$$\bar{X}_n \approx \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{for large } n.$$
Proof via Characteristic Functions
The most elegant proof uses characteristic functions (Fourier transforms of probability distributions).
Proof.
Step 1 — Setup. Without loss of generality, assume $\mu = 0$ and $\sigma = 1$ (replace $X_i$ by $(X_i - \mu)/\sigma$). We must show:

$$\frac{S_n}{\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1).$$
Step 2 — Characteristic function of $S_n/\sqrt{n}$. The characteristic function of $X_1$ is $\varphi(t) = E[e^{itX_1}]$. By independence:

$$\varphi_{S_n/\sqrt{n}}(t) = E\!\left[e^{it S_n/\sqrt{n}}\right] = \left[\varphi\!\left(\frac{t}{\sqrt{n}}\right)\right]^n.$$
Step 3 — Taylor expansion. Since $E[X_1] = 0$ and $E[X_1^2] = 1$:

$$\varphi(t) = 1 - \frac{t^2}{2} + o(t^2) \quad \text{as } t \to 0.$$

Substituting $t/\sqrt{n}$:

$$\left[\varphi\!\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left(1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right)^n.$$
Step 4 — Take the limit.

$$\lim_{n \to \infty} \left(1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right)^n = e^{-t^2/2}.$$

The function $e^{-t^2/2}$ is the characteristic function of $\mathcal{N}(0, 1)$.
Step 5 — Apply Lévy's continuity theorem. Since $\varphi_{S_n/\sqrt{n}}(t) \to e^{-t^2/2}$ pointwise and the limit function is continuous at $t = 0$, we conclude $S_n/\sqrt{n} \xrightarrow{d} \mathcal{N}(0, 1)$. $\blacksquare$
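The limit in Step 4 can be sanity-checked numerically. In the sketch below, the test point $t = 1.5$ is an arbitrary choice and the $o(1/n)$ term is dropped:

```python
import math

# Numerical check of Step 4: (1 - t^2/(2n))^n -> exp(-t^2/2) as n grows.
# (The o(1/n) term is dropped; t = 1.5 is an arbitrary test point.)
t = 1.5
target = math.exp(-t * t / 2)
for n in (10, 100, 10000):
    approx = (1 - t * t / (2 * n)) ** n
    print(f"n = {n:6d}   (1 - t^2/2n)^n = {approx:.6f}   e^(-t^2/2) = {target:.6f}")
```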
The Berry-Esseen Theorem
The CLT says the distribution converges — but how fast?
Berry-Esseen Theorem
If $\rho = E\!\left[|X_1 - \mu|^3\right] < \infty$, then:

$$\sup_{x \in \mathbb{R}} \left| P\!\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) - \Phi(x) \right| \le \frac{C\rho}{\sigma^3 \sqrt{n}},$$

where $C$ is a universal constant (the best known bounds place $C$ below $0.5$).
The error is $O(1/\sqrt{n})$ — so for practical purposes, $n \ge 30$ often gives a good normal approximation.
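For a lattice distribution like the binomial, the sup-distance can be computed exactly at the jump points of the CDF. The sketch below (the choice of $p = 0.3$ and the sample sizes are illustrative) shows the gap shrinking at the $O(1/\sqrt{n})$ rate, i.e. $\text{gap} \cdot \sqrt{n}$ staying roughly constant:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def berry_esseen_gap(n, p=0.3):
    """Exact sup-distance between the CDF of the standardized Binomial(n, p)
    sum and Phi, evaluated at the jump points where the sup is attained."""
    sigma = math.sqrt(p * (1 - p))
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        x = (k - n * p) / (sigma * math.sqrt(n))
        pk = math.comb(n, k) * p**k * (1 - p)**(n - k)
        # Compare Phi against the CDF just before and just after the jump at x.
        worst = max(worst, abs(cdf - phi(x)), abs(cdf + pk - phi(x)))
        cdf += pk
    return worst

for n in (10, 100, 1000):
    gap = berry_esseen_gap(n)
    print(f"n = {n:5d}   sup gap = {gap:.4f}   gap * sqrt(n) = {gap * math.sqrt(n):.3f}")
```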
Generalizations
Lindeberg CLT (Non-Identical Distributions)
If $X_1, X_2, \dots$ are independent (but not necessarily identically distributed) with $E[X_i] = \mu_i$, $\mathrm{Var}(X_i) = \sigma_i^2 < \infty$, $s_n^2 = \sum_{i=1}^n \sigma_i^2$, and the Lindeberg condition holds:

$$\lim_{n \to \infty} \frac{1}{s_n^2} \sum_{i=1}^n E\!\left[(X_i - \mu_i)^2 \, \mathbf{1}\{|X_i - \mu_i| > \varepsilon s_n\}\right] = 0 \quad \text{for every } \varepsilon > 0,$$

then $\frac{1}{s_n} \sum_{i=1}^n (X_i - \mu_i) \xrightarrow{d} \mathcal{N}(0, 1)$.
Multivariate CLT
If $\mathbf{X}_1, \mathbf{X}_2, \dots$ are i.i.d. random vectors in $\mathbb{R}^d$ with mean $\boldsymbol{\mu}$ and covariance matrix $\Sigma$, then:

$$\sqrt{n}\,(\bar{\mathbf{X}}_n - \boldsymbol{\mu}) \xrightarrow{d} \mathcal{N}(\mathbf{0}, \Sigma).$$
CLT for Dependent Variables
Under various mixing conditions, CLT-type results hold for weakly dependent sequences — essential in time series analysis and ergodic theory.
When the CLT Fails
The CLT requires finite variance. If $\mathrm{Var}(X_i) = \infty$, the theorem fails. For example:
- Cauchy distribution: if $X_i \sim \mathrm{Cauchy}(0, 1)$, then the sample mean $\bar{X}_n$ is still $\mathrm{Cauchy}(0, 1)$ — no convergence to normal.
- Stable distributions: For heavy-tailed distributions with infinite variance, normalized sums converge to non-Gaussian stable laws.
The generalized CLT states that the only possible limits of normalized sums of i.i.d. variables are the $\alpha$-stable distributions with $0 < \alpha \le 2$ (the Gaussian corresponds to $\alpha = 2$).
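The Cauchy failure is easy to see in simulation. In the sketch below (seed, trial count, and threshold are illustrative choices), the probability that the sample mean lands far from zero refuses to shrink as $n$ grows — averaging never concentrates:

```python
import math
import random

random.seed(2)

def cauchy():
    """Standard Cauchy draw via inverse-CDF sampling: tan(pi*(U - 1/2))."""
    return math.tan(math.pi * (random.random() - 0.5))

def tail_freq(n, trials=4000):
    """Fraction of trials in which |sample mean of n Cauchy draws| exceeds 1."""
    hits = sum(abs(sum(cauchy() for _ in range(n)) / n) > 1 for _ in range(trials))
    return hits / trials

# The sample mean of n Cauchy(0,1) draws is again Cauchy(0,1), so
# P(|mean| > 1) = 2 * (1 - F(1)) = 0.5 for every n.
for n in (1, 10, 500):
    print(f"n = {n:4d}   P(|sample mean| > 1) ~ {tail_freq(n):.3f}")
```

For a finite-variance distribution, the same frequency would collapse to zero as $n$ grows; here it stays pinned at $1/2$.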
Applications
Polling and Surveys
If you survey $n$ people, the sample proportion $\hat{p}$ satisfies:

$$\hat{p} \approx \mathcal{N}\!\left(p, \frac{p(1-p)}{n}\right).$$

A 95% confidence interval is $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$, a direct application of the CLT.
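As a minimal sketch of that interval (the poll numbers, 520 of 1000, are invented for illustration):

```python
import math

def proportion_ci(successes, n, z=1.96):
    """CLT-based confidence interval for a proportion (z = 1.96 gives 95%)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 520 of 1000 respondents favor a proposal.
lo, hi = proportion_ci(520, 1000)
print(f"estimate = 0.520, 95% CI = ({lo:.3f}, {hi:.3f})")
```

With $n = 1000$ the margin of error is about $\pm 3$ percentage points, which is why polls of roughly this size are so common.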
Hypothesis Testing
Most standard statistical tests ($z$-test, $t$-test for large $n$) rely on the CLT to justify using normal critical values.
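A minimal one-sample $z$-test illustrating this (the data summary below is invented for illustration):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample z-test; the CLT justifies treating the
    standardized sample mean as approximately N(0, 1)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

# Invented summary: 100 observations, known sigma = 1.5, testing mu = 10.
z, p = z_test(sample_mean=10.3, mu0=10.0, sigma=1.5, n=100)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```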
Finance
The Black-Scholes model assumes log-returns are normally distributed — justified by viewing daily returns as sums of many small, roughly independent shocks. (When the independence or finite-variance assumptions fail, as in financial crises, the model breaks down.)
Physics
The Maxwell-Boltzmann distribution of molecular velocities in a gas arises because each velocity component is a sum of many independent random impulses — the CLT in action.
Historical Development
- De Moivre (1733) proved the CLT for coin flips: the binomial distribution converges to a normal.
- Laplace (1812) extended this to general and recognized the broader principle.
- Chebyshev (1887) and Markov (1898) gave proofs using the method of moments.
- Lyapunov (1901) proved the CLT under his condition (a $(2+\delta)$-th moment condition) using characteristic functions.
- Lindeberg (1922) gave the definitive condition for non-identical variables.
- Feller (1935) proved the Lindeberg condition is also necessary (in a certain sense).
Summary
Sums of many independent random variables with finite variance are approximately normal, regardless of the underlying distribution; the Berry-Esseen theorem quantifies the $O(1/\sqrt{n})$ rate, and when the finite-variance assumption fails, $\alpha$-stable laws take over as the limits.
References
- Billingsley, P., Probability and Measure, 3rd edition, Wiley, 1995.
- Feller, W., An Introduction to Probability Theory and Its Applications, Vol. 2, Wiley, 1971.
- Durrett, R., Probability: Theory and Examples, 5th edition, Cambridge University Press, 2019.
- Wikipedia — Central limit theorem
- 3Blue1Brown — "But what is the Central Limit Theorem?"
- MIT OpenCourseWare — Probability and Statistics