Lesson 6.3: Classical Models I - The ARIMA Family
This lesson synthesizes everything we've learned so far. We will formally define the Autoregressive (AR), Moving Average (MA), Integrated (I), and combined ARMA/ARIMA models. We'll see how the ACF/PACF plots guide our choice of model orders `p` and `q`, and how the ADF test guides our choice of differencing order `d`.
Part 1: The 'Memory' Models - AR and MA
The Core Analogy: Rear-View Mirror vs. Ripples
Autoregressive (AR) Model
The Rear-View Mirror. It predicts the future based on past *values* of the series: $y_t$ is a function of $y_{t-1}, y_{t-2}, \dots$.
Moving Average (MA) Model
Ripples in a Pond. It predicts the future based on past *forecast errors* (shocks): $y_t$ is a function of $\varepsilon_{t-1}, \varepsilon_{t-2}, \dots$.
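To make the analogy concrete, here is a minimal sketch that simulates one series of each kind and compares their sample autocorrelations. The coefficient 0.7, the sample size, the seed, and the helper names (`simulate_ar1`, `simulate_ma1`, `sample_acf`) are all assumptions chosen for illustration, not part of the lesson's material.

```python
import numpy as np


def simulate_ar1(phi, n, rng):
    """AR(1): each value depends on the previous value plus a new shock."""
    eps = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y


def simulate_ma1(theta, n, rng):
    """MA(1): each value depends on the current and the previous shock only."""
    eps = rng.standard_normal(n + 1)
    return eps[1:] + theta * eps[:-1]


def sample_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


rng = np.random.default_rng(0)  # illustrative seed
ar = simulate_ar1(0.7, 5000, rng)
ma = simulate_ma1(0.7, 5000, rng)

acf_ar = sample_acf(ar, 4)
acf_ma = sample_acf(ma, 4)

print("AR(1) ACF, lags 0-4:", np.round(acf_ar, 2))
print("MA(1) ACF, lags 0-4:", np.round(acf_ma, 2))
```

In theory the AR(1) autocorrelations decay geometrically (0.7, 0.49, 0.343, ...), because a past value keeps echoing through every later value. The MA(1) autocorrelation is 0.7 / (1 + 0.7²) ≈ 0.47 at lag 1 and zero beyond it, because a shock's ripple dies out after one period. The sample values printed above should sit close to those figures.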
Part 2: The ARIMA(p,d,q) Framework
The ARIMA(p,d,q) Model
An Autoregressive Integrated Moving Average model combines all three components into a single, powerful framework. It is the workhorse of classical time series forecasting.
- `p`: The AR order. The number of lagged observations of the series to include. Suggested by the **PACF plot**: for a pure AR(p) process, the PACF cuts off after lag `p`.
- `d`: The degree of differencing. The number of times the data must be differenced to become stationary. Guided by the **ADF test**: difference until the test rejects a unit root.
- `q`: The MA order. The number of lagged forecast errors to include. Suggested by the **ACF plot**: for a pure MA(q) process, the ACF cuts off after lag `q`.
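The three orders can be written in one equation. The notation here is an assumption, since the lesson does not fix its symbols: $y_t$ is the observed series, $\varepsilon_t$ the white-noise shock, $B$ the backshift operator ($B y_t = y_{t-1}$), $c$ a constant, and $\phi_i$, $\theta_j$ the AR and MA coefficients. First difference the series $d$ times, then model the result as an ARMA(p,q) process:

$$
w_t = (1 - B)^d \, y_t
$$

$$
w_t = c + \sum_{i=1}^{p} \phi_i \, w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
$$

Setting $d = 0$ and $q = 0$ recovers a pure AR(p) model; setting $d = 0$ and $p = 0$ recovers a pure MA(q) model.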
The Modeling Process (Box-Jenkins Methodology):
- Identification: Use the ADF test to find `d`. Then, use ACF/PACF plots on the differenced data to find candidate values for `p` and `q`.
- Estimation: Fit the candidate ARIMA(p,d,q) models to the data.
- Diagnostic Checking: Check whether the residuals of your fitted model are white noise. If not, return to the Identification step and try a different model.
What's Next? Modeling Volatility
The ARIMA framework gives us a toolkit for modeling and forecasting the **conditional mean** of a time series.
However, it is built on a crucial assumption that is almost always violated in financial markets: that the variance of the error term, $\varepsilon_t$, is constant. This is called **homoskedasticity**.
In reality, financial markets exhibit **volatility clustering**: periods of high volatility tend to be followed by more high volatility, and calm periods by more calm. The variance is not constant; it varies over time and is partly predictable.
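A quick way to see what volatility clustering looks like in data is to compare the autocorrelation of a series with the autocorrelation of its squares. The sketch below uses synthetic returns, not market data. The alternating calm and turbulent regimes, the standard deviations 0.5 and 3.0, the block length, and the helper `lag1_autocorr` are all assumptions made for illustration.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


rng = np.random.default_rng(1)  # illustrative seed

# Hypothetical regimes: alternate calm (std 0.5) and turbulent (std 3.0)
# blocks of 250 observations each, for 4000 observations in total.
block_std = np.tile(np.repeat([0.5, 3.0], 250), 8)
returns = block_std * rng.standard_normal(block_std.size)

rho_returns = lag1_autocorr(returns)       # level of the series
rho_squared = lag1_autocorr(returns ** 2)  # proxy for its variance

print("lag-1 autocorrelation of returns:", round(rho_returns, 3))
print("lag-1 autocorrelation of squared returns:", round(rho_squared, 3))
```

The returns themselves should show almost no autocorrelation, so a model of the conditional mean has little to work with. The squared returns should be clearly positively autocorrelated: a large move today makes a large move tomorrow more likely, which is exactly the structure a constant-variance assumption ignores.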
In the next lesson, we will introduce a new class of models, **ARCH and GARCH**, designed specifically to model this conditional heteroskedasticity.