A Physicist's Proof of the Central Limit Theorem
Suppose we are given $n$ independent, identically distributed (i.i.d.) random variables $Y_1, Y_2, \ldots, Y_n$. We are interested in the distribution of their mean,
\[ \bar{Y} = \frac{Y_1 + Y_2 + \ldots + Y_n}{n}. \]
Suppose the probability density function (p.d.f.) of each $Y_i$ is $f: \mathbb{R} \to \mathbb{R}$ and the p.d.f. of $\bar{Y}$ is $g: \mathbb{R} \to \mathbb{R}$. Then,
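\[ g(\bar{y}; n) = P[\bar{Y} = \bar{y}] = \int_{-\infty}^{\infty} dy_1\, dy_2 \ldots dy_n\, f(y_1) f(y_2) \ldots f(y_n)\, \delta\Big(\bar{y} - \frac{y_1 + y_2 + \ldots + y_n}{n}\Big) \tag{2} \]
where $\delta$ is the Dirac delta function defined as
\[ \delta(x - x') = \begin{cases} \infty & x = x' \\ 0 & \text{else} \end{cases} \]
such that, for any region $S$ containing $x'$,
\[ \int_{x' \in S} dx\, \delta(x - x') = 1. \]
A Dirac delta exists to be integrated over, i.e.
\[ \int_{x' \in S} dx\, \delta(x - x') f(x) = f(x'). \]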
If we were to use this property to carry out one of the integrals in equation 2, say the one over $y_1$, we would lose the symmetry between the $y_i$ terms. To preserve that symmetry, we'll use another representation of the delta function.
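As motivation, consider the Fourier transform of a function $f$:
\[ \tilde{f}(k) = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}}\, e^{-ikx} f(x). \tag{4} \]
The inverse Fourier transform is given by
\[ f(x) = \int_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi}}\, e^{ikx} \tilde{f}(k). \tag{5} \]
We can plug the definition of the Fourier transform in equation 4 into equation 5 to get
\[ f(x) = \int_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi}}\, e^{ikx} \tilde{f}(k) = \int\!\!\int \frac{dk}{\sqrt{2\pi}} \frac{dx'}{\sqrt{2\pi}}\, e^{ikx} e^{-ikx'} f(x') = \int dx'\, \delta(x - x') f(x') \]
where
\[ \delta(x - x') = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ik(x - x')} \]
is exactly the Dirac delta we defined earlier.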
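The Fourier representation of the delta function can be checked numerically. The sketch below is an illustration only: it cuts the $k$ integral off at a finite `cutoff` (which gives the kernel $\sin(\Lambda u)/(\pi u)$), and the Gaussian test function, grid spacing and the names `truncated_delta` and `sift` are assumptions made for this example.

```python
import numpy as np

def truncated_delta(u, cutoff):
    """(1/2pi) * integral of e^{iku} over |k| <= cutoff, i.e. sin(cutoff*u)/(pi*u)."""
    # np.sinc(z) = sin(pi z)/(pi z), which also handles u = 0 without dividing by zero.
    return (cutoff / np.pi) * np.sinc(cutoff * u / np.pi)

def sift(test_function, x_prime, cutoff=10.0, half_width=10.0, dx=1e-3):
    """Riemann-sum approximation of the integral of delta(x - x') f(x) over x."""
    x = np.arange(-half_width, half_width + dx, dx)
    kernel = truncated_delta(x - x_prime, cutoff)
    return float(np.sum(kernel * test_function(x)) * dx)

def gaussian_bump(x):
    """Smooth, rapidly decaying test function (an arbitrary choice)."""
    return np.exp(-x**2 / 2)

x_prime = 0.3
approx = sift(gaussian_bump, x_prime)
exact = float(gaussian_bump(x_prime))
print(approx, exact)  # the two values should agree closely
```

Raising `cutoff` sharpens the kernel; for a smooth test function like this one, a modest cutoff already reproduces $f(x')$ to high accuracy.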
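We will now use this representation of the delta function in equation 2. Limits on integrals are from $-\infty$ to $\infty$ and won't be explicitly written.
\[
\begin{split}
g(\bar{y}; n) &= P[\bar{Y} = \bar{y}] \\
&= \int dy_1\, dy_2 \ldots dy_n\, f(y_1) f(y_2) \ldots f(y_n)\, \delta\Big(\bar{y} - \frac{y_1 + y_2 + \ldots + y_n}{n}\Big) \\
&= \int dy_1\, dy_2 \ldots dy_n\, \frac{dk}{2\pi}\, f(y_1) f(y_2) \ldots f(y_n)\, e^{ik\big(\bar{y} - \frac{y_1 + y_2 + \ldots + y_n}{n}\big)} \\
&= \int \frac{dk}{2\pi}\, e^{ik\bar{y}} \int dy_1\, f(y_1)\, e^{-iky_1/n} \int dy_2\, f(y_2)\, e^{-iky_2/n} \ldots \int dy_n\, f(y_n)\, e^{-iky_n/n} \\
&= \int \frac{dk}{2\pi}\, e^{ik\bar{y}} \Big[\int dy_1\, f(y_1)\, e^{-iky_1/n}\Big]^n
\end{split}
\]
Effectively, apart from scalings and some normalization terms, this states that the p.d.f. of the mean is the inverse Fourier transform of the $n$th power of the Fourier transform of the p.d.f. of the individual $Y_i$s.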
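As a sanity check of this result, here is a minimal numerical sketch for one concrete choice of $f$. The exponential density $f(y) = e^{-y}$ for $y \ge 0$ is an assumption made purely for illustration: for it, $\int dy\, f(y)\, e^{-iky/n} = 1/(1 + ik/n)$, and the mean of $n$ such variables is known to follow a Gamma density, which gives something to compare against. The function names, the cutoff `k_max` and the step `dk` are also choices made for this example.

```python
import math
import numpy as np

def mean_pdf_via_fourier(y_bar, n, k_max=2000.0, dk=0.01):
    """Evaluate (1/2pi) * integral dk e^{ik y_bar} [1/(1 + ik/n)]^n by a Riemann sum."""
    k = np.arange(-k_max, k_max + dk, dk)
    single = 1.0 / (1.0 + 1j * k / n)      # integral of f(y) e^{-iky/n} for f(y) = e^{-y}
    integrand = np.exp(1j * k * y_bar) * single**n
    # The imaginary parts cancel between k and -k, so only the real part survives.
    return float(np.real(np.sum(integrand)) * dk / (2 * np.pi))

def mean_pdf_exact(y_bar, n):
    """Gamma density of the mean of n independent Exp(1) variables."""
    return n**n * y_bar**(n - 1) * math.exp(-n * y_bar) / math.factorial(n - 1)

n = 5
for y_bar in (0.5, 1.0, 1.5):
    print(y_bar, mean_pdf_via_fourier(y_bar, n), mean_pdf_exact(y_bar, n))
```

The finite cutoff is harmless here because the integrand decays like $|k/n|^{-n}$; for small $n$ a larger `k_max` would be needed.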
Of course, one can choose explicit functional forms for $f(y)$, i.e. the p.d.f. of the $Y_i$s, but we want a more general solution. One possible approach is to expand the Fourier transform of $f(y)$ in a series that can then be truncated for large $n$.
In general, given $f$, we can compute all the moments
\[ \big<X^n\big> = \int dx\, x^n f(x) \]
as well as the central moments
\[ \big<X^n\big>_c = \int dx\, (x - \mu)^n f(x) \]
where $\mu = \big<X\big>$ is the first moment, assumed to be finite. Note that $\big<X\big>_c = 0$.
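On the other hand, suppose we are given all the moments. Can we compute $f$ then? The answer to this question is positive, as shown below.
Suppose we are given all the moments $\big<X^n\big>$ defined above. We can rewrite them as
\[
\begin{split}
\big<X^n\big> &= \int dx\, x^n f(x) \\
&= \int dx\, \frac{d^n}{d(-ik)^n}\Big|_{k=0} e^{-ikx} f(x) \\
&= \frac{d^n}{d(-ik)^n}\Big|_{k=0} \int dx\, e^{-ikx} f(x) \\
&= \frac{d^n}{d(-ik)^n}\Big|_{k=0} \sqrt{2\pi}\, \tilde{f}(k)
\end{split}
\]
In other words, knowing the moments is equivalent to knowing the derivatives of the Fourier transform of the p.d.f. at $k = 0$:
\[ \frac{(-i)^n}{\sqrt{2\pi}} \big<X^n\big> = \frac{d^n \tilde{f}}{dk^n}(0). \]
So, we know the Taylor expansion of $\tilde{f}$:
\[
\begin{split}
\tilde{f}(k) &= \tilde{f}(0) + \frac{d\tilde{f}(0)}{dk} k + \frac{d^2\tilde{f}(0)}{dk^2} \frac{k^2}{2!} + \ldots + \frac{d^m\tilde{f}(0)}{dk^m} \frac{k^m}{m!} + \ldots \\
&= \frac{1}{\sqrt{2\pi}} - \frac{i\big<X\big>}{\sqrt{2\pi}} k - \frac{\big<X^2\big>}{\sqrt{2\pi}} \frac{k^2}{2!} + \ldots + \frac{(-i)^m \big<X^m\big>}{\sqrt{2\pi}} \frac{k^m}{m!} + \ldots
\end{split}
\]
which can be inverted to give the original p.d.f. $f$. We have used the fact that $\tilde{f}(0) = \int \frac{dx}{\sqrt{2\pi}} f(x) = \frac{1}{\sqrt{2\pi}}$ since $f$ is a p.d.f.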
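The link between moments and derivatives of the Fourier transform at $k = 0$ can be illustrated with finite differences. In this sketch the exponential density $f(x) = e^{-x}$ for $x \ge 0$ is an assumed example, chosen because $\sqrt{2\pi}\, \tilde{f}(k) = 1/(1 + ik)$ in closed form and its first two moments are $1$ and $2$; the step size `h` and the name `phi` are likewise illustrative.

```python
def phi(k):
    """sqrt(2*pi) * f_tilde(k) = integral dx e^{-ikx} f(x) for f(x) = e^{-x}, x >= 0."""
    return 1.0 / (1.0 + 1j * k)

h = 1e-3  # finite-difference step

# Central differences for the first and second derivatives at k = 0.
d1 = (phi(h) - phi(-h)) / (2 * h)
d2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2

# <X^n> = (1 / (-i)^n) * d^n phi / dk^n at k = 0.
moment_1 = (d1 / (-1j)).real
moment_2 = (d2 / (-1j) ** 2).real

print(moment_1, moment_2)  # should be close to 1 and 2 for this density
```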
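We can also do the same expansion using the central moments instead:
\[
\begin{split}
\big<X^n\big>_c &= \int dx\, (x - \mu)^n f(x) \\
&= \int dx\, \frac{d^n}{d(-ik)^n}\Big|_{k=0} e^{-ik(x - \mu)} f(x) \\
&= \frac{d^n}{d(-ik)^n}\Big|_{k=0} \int dx\, e^{-ik(x - \mu)} f(x) \\
&= \frac{d^n}{d(-ik)^n}\Big|_{k=0} e^{ik\mu} \int dx\, e^{-ikx} f(x) \\
&= \frac{d^n}{d(-ik)^n}\Big|_{k=0} e^{ik\mu} \sqrt{2\pi}\, \tilde{f}(k)
\end{split}
\]
In other words, the central moments give us the Taylor series terms in the expansion of $e^{ik\mu} \tilde{f}(k)$ instead of just $\tilde{f}(k)$ as before.
So, define $\tilde{g}(k) = e^{ik\mu} \tilde{f}(k)$ (an auxiliary function, not the Fourier transform of the p.d.f. $g$ of the mean). Then
\[
\begin{split}
\tilde{g}(k) &= \tilde{g}(0) + \frac{d\tilde{g}(0)}{dk} k + \frac{d^2\tilde{g}(0)}{dk^2} \frac{k^2}{2!} + \ldots + \frac{d^m\tilde{g}(0)}{dk^m} \frac{k^m}{m!} + \ldots \\
&= \frac{1}{\sqrt{2\pi}} - \frac{i\big<X\big>_c}{\sqrt{2\pi}} k - \frac{\big<X^2\big>_c}{\sqrt{2\pi}} \frac{k^2}{2!} + \ldots + \frac{(-i)^m \big<X^m\big>_c}{\sqrt{2\pi}} \frac{k^m}{m!} + \ldots
\end{split}
\]
using $\tilde{g}(0) = \tilde{f}(0) = \frac{1}{\sqrt{2\pi}}$.
Finally,
\[ \tilde{f}(k) = e^{-ik\mu} \Big(\frac{1}{\sqrt{2\pi}} - \frac{i\big<X\big>_c}{\sqrt{2\pi}} k - \frac{\big<X^2\big>_c}{\sqrt{2\pi}} \frac{k^2}{2!} + \ldots + \frac{(-i)^m \big<X^m\big>_c}{\sqrt{2\pi}} \frac{k^m}{m!} + \ldots\Big) \]
We will use central moments from here on, but the same result can be derived by using the non-central moments. We can use this result in equation 2: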
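\[
\begin{split}
g(\bar{y}; n) &= \int \frac{dk}{2\pi}\, e^{ik\bar{y}} \Big[\int dy_1\, f(y_1)\, e^{-iky_1/n}\Big]^n \\
&= \int \frac{dk}{2\pi}\, e^{ik\bar{y}} \Big[\sqrt{2\pi}\, \tilde{f}\Big(\frac{k}{n}\Big)\Big]^n \\
&= \int \frac{dk}{2\pi}\, e^{ik\bar{y}} \Big[e^{-ik\mu/n} \sqrt{2\pi} \Big(\frac{1}{\sqrt{2\pi}} - \frac{i\big<X\big>_c}{\sqrt{2\pi}} \frac{k}{n} - \frac{\big<X^2\big>_c}{\sqrt{2\pi}} \frac{1}{2!} \Big(\frac{k}{n}\Big)^2 + \ldots + \frac{(-i)^m \big<X^m\big>_c}{\sqrt{2\pi}} \frac{1}{m!} \Big(\frac{k}{n}\Big)^m + \ldots\Big)\Big]^n \\
&= \frac{1}{2\pi} \int dk\, e^{ik(\bar{y} - \mu)} \Big(1 - i\big<X\big>_c \frac{k}{n} - \frac{\big<X^2\big>_c}{2} \Big(\frac{k}{n}\Big)^2 + \frac{i\big<X^3\big>_c}{3!} \Big(\frac{k}{n}\Big)^3 + \frac{\big<X^4\big>_c}{4!} \Big(\frac{k}{n}\Big)^4 \ldots\Big)^n
\end{split}
\]
More precisely, we want the limit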
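\[
\begin{split}
g(\bar{y}; n) &= \lim_{n \to \infty} \lim_{\Lambda \to \infty} \frac{1}{2\pi} \int_{-\Lambda}^{\Lambda} dk\, e^{ik(\bar{y} - \mu)} \Big(1 - i\big<X\big>_c \frac{k}{n} - \frac{\big<X^2\big>_c}{2} \Big(\frac{k}{n}\Big)^2 + \frac{i\big<X^3\big>_c}{3!} \Big(\frac{k}{n}\Big)^3 + \frac{\big<X^4\big>_c}{4!} \Big(\frac{k}{n}\Big)^4 \ldots\Big)^n \\
&= \lim_{\Lambda \to \infty} \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\Lambda}^{\Lambda} dk\, e^{ik(\bar{y} - \mu)} \Big(1 - i\big<X\big>_c \frac{k}{n} - \frac{\big<X^2\big>_c}{2} \Big(\frac{k}{n}\Big)^2 + \frac{i\big<X^3\big>_c}{3!} \Big(\frac{k}{n}\Big)^3 + \frac{\big<X^4\big>_c}{4!} \Big(\frac{k}{n}\Big)^4 \ldots\Big)^n \\
&= \lim_{\Lambda \to \infty} \frac{1}{2\pi} \int_{-\Lambda}^{\Lambda} \lim_{n \to \infty} dk\, e^{ik(\bar{y} - \mu)} \Big(1 - i\big<X\big>_c \frac{k}{n} - \frac{\big<X^2\big>_c}{2} \Big(\frac{k}{n}\Big)^2 + \frac{i\big<X^3\big>_c}{3!} \Big(\frac{k}{n}\Big)^3 + \frac{\big<X^4\big>_c}{4!} \Big(\frac{k}{n}\Big)^4 \ldots\Big)^n
\end{split}
\]
In particular, $\frac{k}{n}$ can be made arbitrarily small this way. For large enough $n$, $|k/n| \ll 1$.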
Noting that $\big<X\big>_c = 0$ and defining $\big<X^2\big>_c \equiv \sigma^2$,
\[ g(\bar{y}) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik(\bar{y}-\mu)}\Big[1 - \frac{\sigma^2}{2} \Big(\frac{k}{n}\Big)^2+ \mathcal{O}\Big(\Big(\frac{k}{n}\Big)^3\Big)\Big]^n \]
Using $1 - x \approx e^{-x}$ for small $x$, so that $\big[1 - \frac{\sigma^2 k^2}{2n^2}\big]^n \approx e^{-\frac{\sigma^2 k^2}{2n}}$, we get
\[ g(\bar{y}) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik\bar{y}}e^{-ik\mu}e^{-\frac{\sigma^2 k^2}{2n}} \]
which is exactly the inverse Fourier transform of the Fourier transform of a Gaussian with mean $\mu$ and variance $\frac{\sigma^2}{n}$.
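To see how good the truncation is, one can compare the exact $n$th power with the Gaussian factor $e^{-ik\mu} e^{-\sigma^2 k^2 / 2n}$ for a specific density. The sketch below again assumes, purely for illustration, the exponential density $f(y) = e^{-y}$ for $y \ge 0$ (so $\mu = \sigma = 1$ and $\int dy\, f(y)\, e^{-iky/n} = 1/(1 + ik/n)$); the range of $k$ and the values of $n$ are arbitrary choices.

```python
import numpy as np

mu, sigma = 1.0, 1.0  # mean and standard deviation of the assumed density e^{-y}

def exact_power(k, n):
    """[integral dy f(y) e^{-iky/n}]^n for the exponential density."""
    return (1.0 + 1j * k / n) ** (-n)

def gaussian_limit(k, n):
    """Large-n approximation e^{-ik mu} e^{-sigma^2 k^2 / (2n)}."""
    return np.exp(-1j * k * mu) * np.exp(-sigma**2 * k**2 / (2 * n))

k = np.linspace(-20.0, 20.0, 401)
max_error = {
    n: float(np.max(np.abs(exact_power(k, n) - gaussian_limit(k, n))))
    for n in (10, 100, 1000)
}
for n, err in max_error.items():
    print(n, err)  # the discrepancy should shrink as n grows
```

The leading discrepancy comes from the neglected cubic term, which is why it falls off as $n$ increases at fixed $k$.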
The argument of the exponential in the integral is:
\[
\begin{split}
\text{arg} &= ik\bar{y}-ik\mu-\frac{\sigma^2 k^2}{2n} \\
& = -\frac{\sigma^2 k^2}{2n} + ik(\bar{y}-\mu) \\
& = -\frac{\sigma^2}{2n} \big[k^2 + \frac{i2n(\mu-\bar{y})}{\sigma^2}k\big] \\
& = -\frac{\sigma^2}{2n} \big[\big(k + \frac{in(\mu-\bar{y})}{\sigma^2}\big)^2 - \frac{i^2n^2(\mu-\bar{y})^2}{\sigma^4}\big] \\
& = -\frac{\sigma^2}{2n} \big(k + \frac{in(\mu-\bar{y})}{\sigma^2}\big)^2 -\frac{n(\bar{y}-\mu)^2}{2\sigma^2}
\end{split}
\]
So, the integral is:
\[
\begin{split}
g(\bar{y}) &= \frac{1}{2\pi} \int dk\, e^{ik\bar{y}}e^{-ik\mu}e^{-\frac{\sigma^2 k^2}{2n}} \\
& = \frac{1}{2\pi} \int dk\, e^{-\frac{\sigma^2}{2n} \big(k + \frac{in(\mu-\bar{y})}{\sigma^2}\big)^2} e^{-\frac{n(\bar{y}-\mu)^2}{2\sigma^2}} \\
& = e^{-\frac{n(\bar{y}-\mu)^2}{2\sigma^2}} \frac{1}{2\pi} \int dk\, e^{-\frac{\sigma^2}{2n} \big(k + \frac{in(\mu-\bar{y})}{\sigma^2}\big)^2}
\end{split}
\]
The remaining integral is a simple Gaussian integral that evaluates to $\sqrt{\frac{\pi}{\sigma^2 / 2n}}$. So, we get
\[ g(\bar{y}) = e^{-\frac{n(\bar{y}-\mu)^2}{2\sigma^2}} \frac{1}{2\pi} \sqrt{\frac{\pi}{\sigma^2 / 2n}} = \frac{1}{\sqrt{2\pi\sigma^2 / n}} e^{-\frac{(\bar{y}-\mu)^2}{2\sigma^2 / n}} \]
which is exactly a Gaussian p.d.f. with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
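The final result can also be checked by simulation. The following is a rough Monte Carlo sketch, not part of the derivation: the exponential distribution (with $\mu = \sigma = 1$), the sample sizes and the random seed are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n, trials = 200, 20_000   # variables per mean, number of simulated means
mu, sigma = 1.0, 1.0      # mean and standard deviation of Exp(1)

# Each row holds n i.i.d. draws; averaging along the row gives one sample of the mean.
samples = rng.exponential(scale=1.0, size=(trials, n))
means = samples.mean(axis=1)

predicted_std = sigma / np.sqrt(n)
z = (means - mu) / predicted_std

# For a Gaussian, about 68% of the mass lies within one standard deviation of the mean.
fraction_within_one_std = float(np.mean(np.abs(z) < 1.0))

print(means.mean(), means.std(), predicted_std, fraction_within_one_std)
```

Even though the underlying exponential density is strongly skewed, the sample means should cluster around $\mu$ with spread close to $\sigma / \sqrt{n}$.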