Lesson 2.5: The Box-Jenkins Methodology
A systematic, iterative process for identifying, estimating, and validating ARIMA models.
The Four Phases of Modeling
The Core Analogy: A Detective Solving a Case
- Identification (Gathering Clues): Examine the data, test for stationarity (ADF test), difference if necessary, and then use ACF/PACF plots on the stationary series to form an initial hypothesis about the model's order (p, d, q).
- Estimation (Building a Profile): Fit several candidate ARIMA models (e.g., ARIMA(1,1,1), ARIMA(2,1,0)) to the data using Maximum Likelihood Estimation. Compare their AIC/BIC scores, where lower values indicate a better trade-off between fit and model complexity, to choose the preferred candidate.
- Diagnostic Checking (Verifying the Theory): Examine the residuals of your best model. They should be indistinguishable from white noise. Check the ACF plot of the residuals for any significant spikes and use a formal test such as the Ljung-Box test. If the residuals still have structure, your model is misspecified and you must return to the Identification phase.
- Forecasting (Predicting the Next Move): Once the model is validated, use it to make out-of-sample forecasts.
What's Next? Modeling Volatility
The ARIMA framework is a complete toolkit for modeling and forecasting the **conditional mean** of a time series.
However, it is built on a crucial assumption that is almost always violated in financial markets: that the variance of the error term, $\epsilon_t$, is constant.
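One way to write this assumption down, using $\epsilon_t$ for the error term and $\mathcal{F}_{t-1}$ for the information available at time $t-1$ (both are notational assumptions, since the lesson does not fix symbols here):

$$
\operatorname{Var}(\epsilon_t \mid \mathcal{F}_{t-1}) = \sigma^2 \quad \text{for all } t.
$$

When the assumption fails, the conditional variance changes over time instead:

$$
\operatorname{Var}(\epsilon_t \mid \mathcal{F}_{t-1}) = \sigma_t^2 .
$$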
In the next module, we will introduce a new class of models, **ARCH and GARCH**, designed specifically to model this conditional heteroskedasticity, or volatility clustering.