Lesson 6.4: Classical Models II - Volatility (ARCH & GARCH)

We now move beyond modeling the mean to model the variance. This lesson introduces the groundbreaking ARCH and GARCH models, the industry standard for forecasting volatility. These models are essential for risk management, options pricing, and portfolio optimization.

Part 1: The Stylized Fact of Volatility Clustering

The ARIMA models we've studied assume a constant variance of the error term (homoskedasticity). This assumption fails for most financial data. Instead, financial returns exhibit **volatility clustering**: large changes tend to be followed by large changes, and small changes by small changes.

This means volatility is **autocorrelated** and, therefore, predictable. We need a model for the conditional variance.
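A quick way to see this in data is to compare the autocorrelation of returns with the autocorrelation of squared returns (a rough proxy for variance). The sketch below assumes your returns are loaded as a pandas Series; the placeholder data here is i.i.d., so on real financial returns you should see much larger autocorrelations in the squared series.

```python
# Checking for volatility clustering: a minimal sketch.
# `returns` is a placeholder; substitute a real return series.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(42)
returns = pd.Series(rng.standard_t(df=5, size=1000))  # hypothetical data

# Raw returns usually show little autocorrelation...
print("ACF of returns:        ", np.round(acf(returns, nlags=5)[1:], 3))
# ...but squared returns are strongly autocorrelated when volatility
# clusters (not with this i.i.d. placeholder, but with real data).
print("ACF of squared returns:", np.round(acf(returns**2, nlags=5)[1:], 3))
```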

Part 2: The ARCH Model - Modeling Shocks

The **Autoregressive Conditional Heteroskedasticity (ARCH)** model, developed by Robert Engle in 1982, was the first to formalize this idea. It models today's variance as a function of past squared shocks (residuals).

The ARCH(q) Model

The conditional variance, $\sigma_t^2$, is modeled as:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2$$

This means if yesterday had a large shock (a large $\epsilon_{t-1}^2$), we forecast a higher variance for today.
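To make the mechanics concrete, here is a minimal simulation of an ARCH(1) process. The parameter values are illustrative, not estimates from data.

```python
# Simulating an ARCH(1) process: a large shock at t-1 mechanically
# raises the conditional variance at t, producing clustering.
import numpy as np

rng = np.random.default_rng(0)
alpha0, alpha1 = 0.1, 0.6      # illustrative; alpha1 < 1 for stationarity
n = 1000
eps = np.zeros(n)              # shocks (residuals)
sigma2 = np.zeros(n)           # conditional variance
sigma2[0] = alpha0 / (1 - alpha1)  # start at the unconditional variance

for t in range(1, n):
    sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2       # ARCH(1) recursion
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal() # draw today's shock

# A large |eps[t-1]| produces a large sigma2[t]: volatility clustering.
```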

Part 3: The GARCH Model - The Workhorse

In practice, ARCH models often require many lags (`q`). The **Generalized ARCH (GARCH)** model is a more parsimonious extension that adds a "memory" of past variance itself.

The GARCH(p,q) Model

The conditional variance is now a function of past shocks AND past variances:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$

The most common model in finance is the **GARCH(1,1)**:

$$\sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$$

Today's variance is a weighted combination of a constant baseline ($\alpha_0$), yesterday's squared shock, and yesterday's variance. The persistence of volatility is captured by the sum $\alpha_1 + \beta_1$, which is typically close to (but below) 1 for financial returns; $\beta_1$ alone is usually large (e.g., > 0.85).
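In Python, the `arch` package is the standard implementation. The sketch below fits a GARCH(1,1) to a placeholder series; in practice you would pass real percent returns (scaling to percent helps the optimizer converge). Note that `arch` uses `p` for the shock (ARCH) order and `q` for the variance (GARCH) order, the reverse of the notation above, though for GARCH(1,1) the distinction doesn't matter.

```python
# Fitting a GARCH(1,1): a minimal sketch using the `arch` package
# (pip install arch). The returns here are a hypothetical placeholder.
import numpy as np
import pandas as pd
from arch import arch_model

rng = np.random.default_rng(1)
returns = pd.Series(rng.standard_normal(1000))  # substitute real percent returns

model = arch_model(returns, mean="Constant", vol="GARCH", p=1, q=1)
result = model.fit(disp="off")
print(result.summary())  # look for alpha[1] + beta[1] near 1 on real data

# One-step-ahead conditional variance forecast
forecast = result.forecast(horizon=1)
print(forecast.variance.iloc[-1])
```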

What's Next? Machine Learning for Time Series

ARIMA and GARCH are powerful classical models. But they are linear in structure and rest on strong distributional assumptions. Can we apply our more flexible machine learning models, like Random Forest or XGBoost, to time series forecasting?

Yes, but it requires careful **feature engineering**. In the next lesson, we will learn how to create lagged and rolling window features to transform a time series problem into a standard supervised learning problem that our ML models can solve.