Lesson 6.2: Testing for Stationarity: The ADF Test
Visual inspection is useful, but quants need rigor. This lesson introduces the Augmented Dickey-Fuller (ADF) test, the standard statistical procedure for formally testing whether a time series is stationary or whether it contains a 'unit root' (like a random walk) and therefore needs to be differenced.
Part 1: The Problem of the Random Walk
As we've discussed, most time series models require stationarity. The most common form of non-stationarity in finance is the **random walk**:

$$y_t = y_{t-1} + \epsilon_t$$

This can be rewritten as an AR(1) model, $y_t = \phi y_{t-1} + \epsilon_t$, where the coefficient $\phi$ is exactly 1. When this coefficient is 1, we say the process has a **unit root**. A unit root process has "infinite memory"—the impact of a shock never dies out—which is the source of its non-stationarity.
The job of a unit root test is to determine if this coefficient is statistically distinguishable from 1.
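To see what "infinite memory" means concretely, consider the impulse response of an AR(1) process: a one-unit shock at time 0 still contributes $\phi^h$ to the series $h$ periods later. Here is a minimal sketch (the helper name `impulse_response` is illustrative) comparing a stationary coefficient with a unit root:

```python
import numpy as np

# The effect of a unit shock at time 0 on y_h in an AR(1) process
# y_t = phi * y_{t-1} + eps_t is simply phi**h.
def impulse_response(phi, horizon=10):
    return np.array([phi ** h for h in range(horizon)])

print(impulse_response(0.5))  # 1, 0.5, 0.25, ...  -> the shock is forgotten
print(impulse_response(1.0))  # 1, 1, 1, ...       -> "infinite memory"
```

For $\phi = 0.5$ the shock's contribution halves every period; for $\phi = 1$ it never decays, which is exactly why a unit root process wanders without reverting to a mean.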
Part 2: The Dickey-Fuller Test
The original Dickey-Fuller test makes this explicit by transforming the AR(1) equation. Subtract $y_{t-1}$ from both sides:

$$\Delta y_t = (\phi - 1)\, y_{t-1} + \epsilon_t = \gamma\, y_{t-1} + \epsilon_t$$

where $\gamma = \phi - 1$. Now, testing if $\phi = 1$ is the same as testing if $\gamma = 0$.
The Dickey-Fuller Hypotheses
- Null Hypothesis (H₀): $\gamma = 0$. A unit root is present. The series is **non-stationary**.
- Alternative Hypothesis (H₁): $\gamma < 0$. No unit root. The series is **stationary**.
We can estimate this regression using OLS and perform a t-test on $\hat{\gamma}$. However, under the null hypothesis, the t-statistic does *not* follow a standard t-distribution. It follows a special "Dickey-Fuller distribution," so we must compare our statistic to special critical values.
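To make the mechanics concrete, here is a minimal sketch of running the (non-augmented) Dickey-Fuller regression by hand with `statsmodels` OLS. The simulated random walk and variable names are illustrative; the point is that the t-statistic it produces must be judged against Dickey-Fuller critical values, not a standard t-table.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulate a random walk (illustrative data).
np.random.seed(0)
y = pd.Series(np.random.randn(500).cumsum(), name='y')

# Dickey-Fuller regression: delta_y_t = alpha + gamma * y_{t-1} + eps_t
dy = y.diff().dropna()                          # delta y_t
y_lag = y.shift(1).loc[dy.index].rename('y_lag')  # y_{t-1}, aligned with dy
X = sm.add_constant(y_lag)                      # include a constant (drift) term

df_ols = sm.OLS(dy, X).fit()
t_stat = df_ols.tvalues['y_lag']                # t-statistic on gamma_hat
print(f"t-statistic on gamma_hat: {t_stat:.3f}")
# Compare this to Dickey-Fuller critical values (roughly -2.87 at the 5%
# level with a constant), not to the usual t-table.
```

In practice you would not do this manually; `adfuller` handles the regression, the lag augmentation, and the correct critical values for you, as the next parts show.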
Part 3: The 'Augmented' Dickey-Fuller (ADF) Test
The basic Dickey-Fuller test assumes the error term is white noise. In reality, it might be serially correlated. The **Augmented** Dickey-Fuller (ADF) test accounts for this by adding lagged values of the dependent variable ($\Delta y_{t-1}, \dots, \Delta y_{t-p}$) to the regression.
The ADF Test Regression

$$\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \epsilon_t$$

The hypothesis test is still on $\gamma$; the lagged-difference terms simply absorb the serial correlation so the test statistic remains valid.
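You rarely choose the lag order $p$ by hand. As a minimal sketch (the simulated series is illustrative), `adfuller` in `statsmodels` searches up to `maxlag` lagged differences and, with the default `autolag='AIC'`, keeps the lag order that minimizes the AIC:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Illustrative data: a simulated random walk.
np.random.seed(1)
y = pd.Series(np.random.randn(500).cumsum())

# adfuller runs the ADF regression and picks p automatically.
stat, p_value, lags_used, n_obs, crit_values, icbest = adfuller(y, maxlag=12, autolag='AIC')
print(f"ADF statistic: {stat:.4f}")
print(f"p-value:       {p_value:.4f}")
print(f"lagged differences included (p): {lags_used}")
```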
The User's Guide to the ADF Test
In practice, you rarely need to work through these distributional details yourself. You just need to know how to call the test function and interpret its output.
- Run the Test: Use a library function like `adfuller` from `statsmodels`.
- Examine the p-value: This is the only number you really need to look at.
- Apply the Decision Rule (a compact helper implementing it is sketched after this list):
- If the **p-value > 0.05**: You **fail to reject** the null hypothesis. Your data has a unit root and is non-stationary. You must difference it.
- If the **p-value <= 0.05**: You **reject** the null hypothesis. Your data is stationary. You are on solid ground and can proceed with modeling.
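If you only need a yes/no answer inside a pipeline, the decision rule collapses to a single check on the p-value. Here is a minimal sketch (the helper name `is_stationary` and the 5% threshold are illustrative choices); a fuller, printed walk-through follows in Part 4.

```python
from statsmodels.tsa.stattools import adfuller

def is_stationary(series, alpha=0.05):
    """Return True if the ADF test rejects the unit-root null at level alpha."""
    p_value = adfuller(series.dropna())[1]  # second element of the result is the p-value
    return p_value <= alpha
```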
Part 4: Python Implementation
```python
import pandas as pd
import numpy as np
from statsmodels.tsa.stattools import adfuller

# --- Create a non-stationary random walk ---
np.random.seed(42)
non_stationary_series = pd.Series(np.random.randn(500).cumsum(), name='Random Walk')

# --- Create a stationary series (the first difference) ---
stationary_series = non_stationary_series.diff().dropna()

def perform_adf_test(series, name):
    print(f"--- ADF Test Results for: {name} ---")
    result = adfuller(series)
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print('Critical Values:')
    for key, value in result[4].items():
        print(f'   {key}: {value:.4f}')
    if result[1] <= 0.05:
        print("=> Conclusion: Reject the null hypothesis. The series is stationary.")
    else:
        print("=> Conclusion: Fail to reject the null hypothesis. The series is non-stationary.")

# Test the non-stationary series
perform_adf_test(non_stationary_series, 'Original Random Walk')

print("\n" + "="*50 + "\n")

# Test the stationary series
perform_adf_test(stationary_series, 'Differenced Random Walk')
```
What's Next? Building the Models
We now have a rigorous, formal procedure for ensuring our data is stationary. We are finally ready to start building forecasting models.
The next lesson, **Classical Models I**, will introduce the full family of ARIMA models. We will see how the ACF and PACF plots we learned about earlier guide our choice of model structure (the `p` and `q` orders), while the ADF test guides our choice of the differencing order (`d`).