Lesson 6.3: Fixing σ - Stochastic Volatility (The Heston Model)

Welcome to Lesson 6.3. In our last two lessons, we learned about Risk-Neutral Valuation and how to use the Monte Carlo 'computer way' to price any option.

However, both of these methods still relied on the Geometric Brownian Motion (GBM) model from Lesson 1.4:

$$dS_t = r S_t \, dt + \sigma S_t \, dW_t^{\mathbb{Q}}$$

The "Achilles' heel" of this model is that we assume $\sigma$ (volatility) is a constant, known number (e.g., 20% forever).

In the real world, this is wrong. Volatility is itself random and "panicky": it is not a constant; it is its own random process. In this lesson, we will look at the market evidence against this assumption and then build a more advanced, realistic model to "fix" it.

Part 1: The "Why" (The "Volatility Smile" Problem)

How do we know the Black-Scholes-Merton (BSM) model's assumption of constant $\sigma$ is wrong?

The Evidence: The "Volatility Smile."

If the BSM model were 100% correct, then every option on a single stock (e.g., all Apple options) for the same expiration date would be priced with the same volatility, $\sigma$ .

This is not what we see in the real market.

Instead, we see a "smile." If we take the actual market prices of options and use the Black-Scholes formula "in reverse" to see what $\sigma$ the market is using (this backed-out number is called the implied volatility), we find:

"Out-of-the-money" put options (low strike prices $K$ ) are priced with a very high $\sigma$ .
"At-the-money" options (where $S_t \approx K$ ) are priced with the lowest $\sigma$ .
"Out-of-the-money" call options (high strike prices $K$ ) are priced with a slightly higher $\sigma$ .

If you plot "Implied Volatility ( $\sigma$ )" vs. "Strike Price ( $K$ )," you don't get a flat line. You get a "smile" or, more commonly, a "skew" or "smirk."
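To make "using the formula in reverse" concrete, here is a minimal sketch in Python. It assumes a European call on a stock that pays no dividends, and the function names (`bs_call_price`, `implied_vol`) and the bisection search are illustrative choices, not part of the lesson's earlier code. Because the Black-Scholes call price rises as $\sigma$ rises, we can simply search for the $\sigma$ that reproduces a given price.

```python
import math

def bs_call_price(S, K, T, r, sigma):
    """Black-Scholes price of a European call (assumes no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    # Standard normal CDF written with the error function
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection search for the sigma whose Black-Scholes price matches `price`."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call_price(S, K, T, r, mid) > price:
            hi = mid   # model price too high, so sigma must be smaller
        else:
            lo = mid   # model price too low, so sigma must be larger
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip with made-up inputs: price a call at sigma = 25%, then recover sigma.
price = bs_call_price(S=100.0, K=110.0, T=1.0, r=0.05, sigma=0.25)
print(implied_vol(price, S=100.0, K=110.0, T=1.0, r=0.05))
```

Running `implied_vol` on real market prices across many strikes $K$, and plotting the results, is how the smile picture is produced. The inputs above are invented for illustration only.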

The Physical Meaning (Why the Smile Exists):

This "smile" is just a financial map of fear.

Why are low-strike puts so expensive (high $\sigma$ )? A low-strike put is a "crash insurance" bet. Traders are terrified of a 2008-style market crash (a big down move). They are willing to overpay for this insurance, which drives its implied volatility up.
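Why are high-strike calls cheaper than low-strike puts (lower $\sigma$ than the puts)? A high-strike call is a "jackpot" bet on a massive rally. Traders consider a sudden rally of that size less likely than a crash of the same size, so they pay less of a premium for it.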

The BSM model drives log-returns with a "perfectly symmetric" $\mathcal{N}(0, 1)$ bell curve, so a shock of a given size is just as likely to be down as up. This is wrong. The market knows that "stocks take the stairs up and the elevator down."

Conclusion: The BSM model's assumption of constant $\sigma$ is a critical bug. It fails to capture the "volatility smile," which is a real, persistent feature of the market.

Part 2: The "Fix" (Volatility is a Random Process)

If $\sigma$ isn't a constant number, what is it?

It's a random process itself. The "jiggle rate" is also "jiggling."

This means our simple, one-equation GBM model is not enough. We need a two-equation model to capture reality:

One SDE for the Stock Price ( $S_t$ ), which is "jiggled" by volatility.
A second SDE for the Volatility ( $\sigma_t$ ) itself, which is also jiggling.

This is the concept of Stochastic Volatility.

The most famous and widely-used stochastic volatility model is the Heston Model (1993).

Part 3: The "How" (The Heston Model SDEs)

(Note: For mathematical reasons, the Heston model tracks the variance, $v_t = \sigma_t^2$ , instead of the volatility $\sigma_t$ . The volatility is then recovered as $\sigma_t = \sqrt{v_t}$ , which can never be negative as long as $v_t \ge 0$ ; the variance SDE in Equation 2 is built so that $v_t$ stays non-negative.)

Equation 1: The Stock Price ( $S_t$ ) SDE

This is our "magic" risk-neutral GBM, but with one change. We replace the *constant* $\sigma$ with the *random process* $\sqrt{v_t}$ .

$$dS_t = r S_t \, dt + \sqrt{v_t} \, S_t \, dW_t^1$$

$r S_t dt$ : The risk-neutral drift (same as before).
$\sqrt{v_t} S_t dW_t^1$ : The new diffusion term. The "jiggle" is now driven by the *current* level of variance, $v_t$ .
$dW_t^1$ : This is our *first* source of randomness ("stock price randomness").

Equation 2: The Variance ( $v_t$ ) SDE (The "Thermostat")

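This is the "new engine." What kind of SDE should $v_t$ follow? It must be "mean-reverting": volatility spikes in a panic and then calms down, so it keeps getting pulled back toward a long-run average level, the way a thermostat pulls a room back to its set temperature. This "pull" is modeled by an SDE: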

$$dv_t = \kappa (\theta - v_t) \, dt + \xi \sqrt{v_t} \, dW_t^2$$

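$\kappa (\theta - v_t) dt$ (The "Drift" / "The Pull"): $\theta$ is the long-term average variance, $v_t$ is the current variance, and $\kappa$ is the speed of mean-reversion. When $v_t$ is above $\theta$ the drift is negative and pulls the variance down; when $v_t$ is below $\theta$ it pulls the variance back up.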
$\xi \sqrt{v_t} dW_t^2$ (The "Diffusion" / "Vol of Vol"): $\xi$ is the "volatility of volatility," and $dW_t^2$ is our *second* source of randomness.
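To watch the "thermostat" at work, here is a minimal sketch that simulates only the variance SDE. It assumes a simple Euler time-stepping scheme with "full truncation" (any negative value produced by the discrete steps is treated as zero inside the formula); that is one common choice, not the only one. The function name `simulate_variance` and all parameter values are made up for illustration.

```python
import numpy as np

def simulate_variance(v0, kappa, theta, xi, T, n_steps, n_paths, seed=0):
    """Euler steps for dv = kappa*(theta - v) dt + xi*sqrt(v) dW (full truncation)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    v = np.full(n_paths, v0, dtype=float)
    for _ in range(n_steps):
        # Discrete steps can overshoot below zero, so clip before using v in the formula
        v_pos = np.maximum(v, 0.0)
        dW = rng.standard_normal(n_paths) * np.sqrt(dt)
        v = v + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos) * dW
    return np.maximum(v, 0.0)

# Start "in a panic" (variance 0.09, i.e. 30% vol) with a long-run level of 0.04 (20% vol).
v_T = simulate_variance(v0=0.09, kappa=2.0, theta=0.04, xi=0.3,
                        T=5.0, n_steps=500, n_paths=20_000)
print(v_T.mean())  # should land close to theta
```

For this process the expected variance is known to decay toward the long-run level as $E[v_t] = \theta + (v_0 - \theta) e^{-\kappa t}$ , so after a few multiples of $1/\kappa$ the average of the simulated paths should sit near $\theta$ . A larger $\kappa$ gets there faster.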

Part 4: The "Secret Sauce" (Correlation ρ)

We now have two random processes driven by two random engines: $dW_t^1$ and $dW_t^2$ . What's the relationship between them? Are they independent?

No. This is the secret of the Heston model. They are correlated.

The Correlation Parameter (ρ)

$$dW_t^1 \cdot dW_t^2 = \rho \, dt$$

$\rho$ (rho) is the correlation coefficient (between -1 and 1). In real markets, for equities, $\rho$ is negative (e.g., -0.7).
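In code, two correlated random "engines" are usually built from two independent ones. The sketch below uses the standard trick $Z_2 = \rho Z_1 + \sqrt{1 - \rho^2} \, Z_{\perp}$ , where $Z_1$ and $Z_{\perp}$ are independent standard normals. The function name `correlated_increments` and the sample values are illustrative assumptions.

```python
import numpy as np

def correlated_increments(rho, dt, n, seed=0):
    """Return n pairs (dW1, dW2) of Brownian increments with correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)        # drives the stock price
    z_indep = rng.standard_normal(n)   # independent helper noise
    # Mix the two so that corr(z1, z2) = rho while z2 keeps unit variance
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * z_indep
    return np.sqrt(dt) * z1, np.sqrt(dt) * z2

dW1, dW2 = correlated_increments(rho=-0.7, dt=0.01, n=200_000)
print(np.corrcoef(dW1, dW2)[0, 1])  # sample correlation, should be near rho
```

With $\rho = -0.7$ , a negative draw for the stock's noise tends to come with a positive draw for the variance's noise, which is exactly the pairing described next.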

This is the "Leverage Effect" and the "Aha!" Moment:

A negative $\rho$ means:

When the stock price $S_t$ suddenly **goes DOWN** (bad news, $dW_t^1$ is negative)...
...the volatility $v_t$ tends to **SPIKE UP** (panic, $dW_t^2$ is positive).

This single, non-zero parameter $\rho$ is what *mathematically creates* the **"skew"** or "smirk" in the volatility smile. It's the "bug fix" for the BSM model. It *agrees* with the market's fear that "down-moves are scarier and more violent than up-moves."

Part 5: The "So What?" (Pros vs. Cons)

Pro: The Heston model can be fitted to the real-world "volatility smile" data, which BSM cannot. This means its prices for out-of-the-money options are much closer to what the market actually charges.
Con: It's *incredibly* complex. The pricing PDE now has two state variables ( $S$ and $v$ ), and there are five parameters to calibrate to market prices ( $\kappa, \theta, \xi, \rho, v_0$ ).
How we solve it: While a semi-closed-form formula (built on the characteristic function) *does* exist for European options, Heston is a perfect candidate for the **Monte Carlo Method (Lesson 6.2)**. Your code just simulates two correlated SDEs instead of one.
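Putting the pieces together, a Heston Monte Carlo pricer for a European call might look like the sketch below. Treat it as one possible implementation under stated assumptions, not the definitive one: it uses plain Euler steps with full truncation for the variance, it steps the log-price $\ln S_t$ (whose drift is $r - v_t/2$ by Itô's lemma) rather than $S_t$ itself so the simulated price stays positive, and the function name `heston_call_mc` and every parameter value are invented for illustration rather than calibrated to any market.

```python
import numpy as np

def heston_call_mc(S0, K, T, r, v0, kappa, theta, xi, rho,
                   n_steps=200, n_paths=50_000, seed=0):
    """Monte Carlo price of a European call under Heston (Euler, full truncation)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    log_S = np.full(n_paths, np.log(S0))
    v = np.full(n_paths, v0, dtype=float)
    for _ in range(n_steps):
        # Two standard normals with correlation rho
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        v_pos = np.maximum(v, 0.0)  # treat negative variance as zero
        # Stock: d(ln S) = (r - v/2) dt + sqrt(v) dW1
        log_S += (r - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z1
        # Variance: dv = kappa*(theta - v) dt + xi*sqrt(v) dW2
        v += kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
    payoff = np.maximum(np.exp(log_S) - K, 0.0)
    # Discount the average payoff at the risk-free rate
    return np.exp(-r * T) * payoff.mean()

price = heston_call_mc(S0=100.0, K=100.0, T=1.0, r=0.05,
                       v0=0.04, kappa=2.0, theta=0.04, xi=0.3, rho=-0.7,
                       n_steps=100, n_paths=20_000)
print(price)
```

A handy sanity check: with $\xi = 0$ and $v_0 = \theta$ the variance never moves, so the model collapses back to constant-$\sigma$ GBM and the result should match the Black-Scholes price up to Monte Carlo noise.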

What's Next? (The 'Hook')

The Heston model "fixes" the constant $\sigma$ assumption. But it *still* assumes the paths of $S_t$ and $v_t$ are **continuous**. It has no "teleporting" (from Lesson 1.2).

But what about a 2008-style crash, a pandemic, or a sudden fraud announcement? The price doesn't "wiggle" down; it **"jumps"** down 20% in an instant.

This is not captured by $W_t$ . We need to add a *new* kind of randomness to our SDE: a "jump process."

This leads to Lesson 6.4: Fixing "No Jumps" - Jump-Diffusion (The Merton Model).

Up Next: Lesson 6.4: Jump-Diffusion Models