Lesson 1.11: Masterclass on Continuous MGFs

This lesson completes our single-variable toolkit by applying the Moment Generating Function (MGF) to the continuous domain. We provide full analytical derivations for the MGFs of the Uniform, Exponential, and the all-important Normal distribution. We will then use these 'fingerprints' to generate their key moments, laying the rigorous mathematical bedrock for the Central Limit Theorem and advanced risk modeling.

Part 1: Upgrading the MGF for the Continuous World

The Core Principle Remains the Same

The definition of the MGF is universal: M_X(t) = E[e^{tX}]. The only thing that changes is the tool we use to calculate the expectation. We swap summation for integration.

Definition: MGF for Continuous R.V.s

For a continuous random variable X with PDF f(x), the MGF is:

M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x) \, dx

The Moment Generation Rule

The magic of the MGF is that once the (often difficult) integration is done, finding moments becomes a simple act of differentiation. The core property is unchanged:

E[X^k] = M^{(k)}_X(0) = \left. \frac{d^k}{dt^k} M_X(t) \right|_{t=0}
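
To make the rule concrete, here is a minimal SymPy sketch that differentiates an MGF symbolically and evaluates the result at t = 0. As its example it uses the Exponential MGF \lambda/(\lambda - t), which we derive in Part 2 below; the helper name `moment` is our own.

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)

def moment(M, k):
    """k-th raw moment: differentiate the MGF k times, then let t -> 0."""
    # A limit (rather than direct substitution) also copes with removable
    # singularities such as the 1/t factor in the Uniform MGF.
    return sp.limit(sp.diff(M, t, k), t, 0)

M_exp = lam / (lam - t)                                       # Exponential MGF, derived below
print(moment(M_exp, 1))                                       # 1/lam
print(sp.simplify(moment(M_exp, 2) - moment(M_exp, 1)**2))    # 1/lam**2
```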

Part 2: Rigorous Derivations of Foundational MGFs

1. The Uniform(a, b) Distribution (The Warm-up)
The PDF is f(x) = 1/(b-a) for x \in [a, b].

Uniform MGF Derivation

M_X(t) = \int_a^b e^{tx} \frac{1}{b-a} \, dx = \frac{1}{b-a} \left[ \frac{e^{tx}}{t} \right]_a^b
M_X(t) = \frac{e^{tb} - e^{ta}}{t(b-a)}
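
As a quick numerical sanity check of this closed form, we can integrate e^{tx} f(x) directly with SciPy; the values of a, b, and t below are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import quad

a, b, t = -1.0, 3.0, 0.7      # arbitrary illustrative values

# Numerically integrate e^{tx} * 1/(b-a) over [a, b] ...
numeric, _ = quad(lambda x: np.exp(t * x) / (b - a), a, b)
# ... and compare with the closed-form MGF.
closed_form = (np.exp(t * b) - np.exp(t * a)) / (t * (b - a))
print(numeric, closed_form)   # the two agree to numerical precision
```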

Uniform Moment Derivations

Note that the closed form above applies for t \neq 0 (with M_X(0) = 1), so evaluating its derivatives at t = 0 requires taking limits, e.g. via L'Hôpital's rule or a Taylor expansion. Because that is tedious, we simply state the moments, which were derived more easily in Lesson 1.9; a symbolic check follows the list.

Mean: E[X] = (a+b)/2

Variance: \text{Var}(X) = (b-a)^2/12

Skewness: 0 (the distribution is perfectly symmetric).

Excess Kurtosis: -6/5 (it has "thinner" tails than a Normal distribution).
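
Here is a minimal SymPy sketch of the limit-based check mentioned above: differentiate the Uniform MGF and let t \to 0 to recover the mean and variance. This is a sketch, assuming SymPy can evaluate these limits with symbolic a and b (it typically can, via series expansion).

```python
import sympy as sp

t, a, b = sp.symbols('t a b', real=True)
M_unif = (sp.exp(t * b) - sp.exp(t * a)) / (t * (b - a))

mean = sp.limit(sp.diff(M_unif, t), t, 0)         # should give (a + b)/2
second = sp.limit(sp.diff(M_unif, t, 2), t, 0)    # raw second moment
variance = sp.factor(second - mean**2)            # should give (a - b)**2/12
print(mean)
print(variance)
```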

2. The Exponential(\lambda) Distribution (The Workhorse)
The PDF is f(x) = \lambda e^{-\lambda x} for x \ge 0.

Exponential MGF Derivation

M_X(t) = \int_0^\infty e^{tx} (\lambda e^{-\lambda x}) \, dx = \lambda \int_0^\infty e^{(t-\lambda)x} \, dx
= \lambda \left[ \frac{e^{(t-\lambda)x}}{t-\lambda} \right]_0^\infty

For this integral to converge, the exponent must be negative, so we require t - \lambda < 0 \implies t < \lambda.

= \frac{\lambda}{t-\lambda} (0 - e^0) = \frac{\lambda}{t-\lambda}(-1)
M_X(t) = \frac{\lambda}{\lambda - t}
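
A quick numerical check with SciPy, keeping t < \lambda so the integral converges; \lambda and the t values are arbitrary illustrative choices.

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0
pdf = lambda x: lam * np.exp(-lam * x)

for t in (0.5, 1.5):   # both satisfy t < lam; for t >= lam the integral diverges
    numeric, _ = quad(lambda x: np.exp(t * x) * pdf(x), 0, np.inf)
    print(numeric, lam / (lam - t))   # numerical integral vs. closed form
```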

Exponential Moment Derivations

Let's find the first two moments by differentiation. We'll write M(t) = \lambda(\lambda-t)^{-1}.

Mean:

M'_X(t) = (-1)\lambda(\lambda-t)^{-2}(-1) = \lambda(\lambda-t)^{-2}
E[X] = M'_X(0) = \lambda(\lambda)^{-2} = 1/\lambda

Variance:

M''_X(t) = (-2)\lambda(\lambda-t)^{-3}(-1) = 2\lambda(\lambda-t)^{-3}
E[X^2] = M''_X(0) = 2\lambda(\lambda)^{-3} = 2/\lambda^2
\text{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}
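
These results are easy to confirm by simulation; here is a small Monte Carlo sketch (note that NumPy parameterizes the Exponential by its scale, 1/\lambda).

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
x = rng.exponential(scale=1/lam, size=1_000_000)   # NumPy uses scale = 1/lambda

print(x.mean(), 1/lam)      # sample mean     vs. 1/lambda   (~0.5)
print(x.var(), 1/lam**2)    # sample variance vs. 1/lambda^2 (~0.25)
```
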
3. The Normal(\mu, \sigma^2) Distribution (The Main Event)
This is the most important derivation for a Quant or ML professional.

Normal MGF Derivation (No-Skip Masterclass)

We start with the MGF of a standard normal Z \sim N(0,1). Its PDF is \phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}.

M_Z(t) = E[e^{tZ}] = \int_{-\infty}^\infty e^{tz} \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz

Step 1: Combine the exponents.

= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \exp\left(-\frac{1}{2}(z^2 - 2tz)\right) dz

Step 2: The "Completing the Square" Trick. We want to make the exponent look like -(z-t)^2/2 = -(z^2 - 2tz + t^2)/2. To do this, we add and subtract t^2 inside the parentheses.

= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \exp\left(-\frac{1}{2}(z^2 - 2tz + t^2 - t^2)\right) dz
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \exp\left(-\frac{(z-t)^2}{2} + \frac{t^2}{2}\right) dz

Step 3: Separate the exponential terms and factor out the constant. The e^{t^2/2} term is a constant with respect to z.

= e^{t^2/2} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(z-t)^2}{2}\right) dz

Step 4: Recognize the remaining integral. The integrand that remains is exactly the PDF of a Normal distribution with mean t and variance 1, and the total area under any PDF is exactly 1.

M_Z(t) = e^{t^2/2} \cdot (1) = e^{t^2/2}
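
A Monte Carlo sanity check of this result: estimate E[e^{tZ}] from simulated standard normals and compare with e^{t^2/2}. The t values are arbitrary; agreement gets noisier for large t because e^{tZ} then has very high variance.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)

for t in (0.5, 1.0, 2.0):
    print(np.exp(t * z).mean(), np.exp(t**2 / 2))   # MC estimate vs. closed form
```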

Step 5: Generalize to X \sim N(\mu, \sigma^2). We use the representation X = \mu + \sigma Z and the property M_{aX+b}(t) = e^{tb} M_X(at).

M_X(t) = M_{\mu + \sigma Z}(t) = e^{t\mu} M_Z(\sigma t) = e^{t\mu} e^{(\sigma t)^2/2}
M_X(t) = e^{\mu t + \sigma^2 t^2 / 2}
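
Two of the algebraic steps above (completing the square in Step 2, and the location-scale step here) can be verified symbolically; a minimal SymPy sketch:

```python
import sympy as sp

z, t, mu, sigma = sp.symbols('z t mu sigma', real=True)

# Step 2: completing the square leaves the exponent unchanged.
lhs = -(z**2 - 2*t*z) / 2
rhs = -(z - t)**2 / 2 + t**2 / 2
print(sp.simplify(lhs - rhs))          # 0

# Step 5: e^{t*mu} * M_Z(sigma*t) matches the stated N(mu, sigma^2) MGF.
M_Z = sp.exp(t**2 / 2)
M_X = sp.exp(t * mu) * M_Z.subs(t, sigma * t)
print(sp.simplify(M_X - sp.exp(mu*t + sigma**2 * t**2 / 2)))   # 0
```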

Normal Moment Derivations

Let M(t) = e^{\mu t + \sigma^2 t^2 / 2}.

Mean:

M'(t) = M(t) \cdot (\mu + \sigma^2 t)
E[X] = M'(0) = e^0(\mu + 0) = \mu

Variance:

M''(t) = M'(t)(\mu + \sigma^2 t) + M(t)(\sigma^2)
E[X^2] = M''(0) = (\mu)(\mu) + (1)(\sigma^2) = \mu^2 + \sigma^2
\text{Var}(X) = (\mu^2 + \sigma^2) - (\mu)^2 = \sigma^2
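
The same two derivatives, done symbolically as a check (a sketch using SymPy):

```python
import sympy as sp

t, mu = sp.symbols('t mu', real=True)
sigma = sp.Symbol('sigma', positive=True)
M = sp.exp(mu*t + sigma**2 * t**2 / 2)

mean = sp.diff(M, t).subs(t, 0)              # mu
second = sp.diff(M, t, 2).subs(t, 0)         # mu**2 + sigma**2
print(mean, sp.simplify(second - mean**2))   # mu, sigma**2
```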

Skewness: The distribution is perfectly symmetric, so its standardized skewness is **0**.

Kurtosis: The fourth central moment is E[(X-\mu)^4] = 3\sigma^4. This gives a kurtosis of 3, and an **excess kurtosis of 0**. This is the benchmark against which all other distributions' "tail-heaviness" is measured.
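
One way to confirm the 3\sigma^4 figure with the MGF machinery: the centered variable X - \mu has MGF e^{-\mu t} M_X(t) = e^{\sigma^2 t^2/2}, so its derivatives at t = 0 give central moments directly. A small SymPy check:

```python
import sympy as sp

t = sp.Symbol('t', real=True)
sigma = sp.Symbol('sigma', positive=True)

M_centered = sp.exp(sigma**2 * t**2 / 2)               # MGF of X - mu
fourth_central = sp.diff(M_centered, t, 4).subs(t, 0)
print(fourth_central)                                   # 3*sigma**4
print(sp.simplify(fourth_central / sigma**4))           # kurtosis of 3
```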

A Note on 'Fat-Tailed' Distributions

You might wonder about distributions like the Student's t or the Lognormal. A fascinating and critical feature of these distributions is that their MGFs **do not exist** in the traditional sense, because the defining integral E[e^{tX}] diverges (goes to infinity) for any t > 0. For the Student's t, higher moments can themselves be infinite; the Lognormal has all of its moments finite, yet its MGF still fails to exist. This tail-driven divergence is the very reason such distributions are useful for modeling extreme financial risk: it is the mathematical signature of a "fat tail." We will explore these in Module 2; a small numerical illustration follows below.
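
As an illustrative sketch (using the standard Lognormal, where \ln X \sim N(0,1), as an assumed example): the integrand e^{tx} f(x) eventually grows without bound as x increases, so the defining integral cannot converge for any t > 0.

```python
import numpy as np

def lognormal_mgf_integrand(x, t=0.1):
    """e^{tx} * f(x) for the standard Lognormal PDF (ln X ~ N(0, 1))."""
    pdf = np.exp(-np.log(x)**2 / 2) / (x * np.sqrt(2 * np.pi))
    return np.exp(t * x) * pdf

for x in (100.0, 500.0, 1_000.0):
    # The integrand eventually explodes, so E[e^{tX}] diverges for any t > 0.
    print(x, lognormal_mgf_integrand(x))
```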

What's Next? From One to Many

We have reached the summit of single-variable probability theory. We have a complete toolbox of discrete and continuous distributions and a master tool (the MGF) for analyzing them with calculus.

But the real world is not about one variable; it's about how multiple variables interact. How does one stock's return relate to another's? This is the domain of multi-variable probability, and it's the gateway to understanding correlation, covariance, and regression.