Lesson 5.1: The Bedrock of Time Series: Stationarity

This is the most important lesson in this module. We introduce the concept of Stationarity—the assumption that a time series' statistical properties do not change over time. Understanding and verifying stationarity is the mandatory first step for almost every time series forecasting model. Without it, our models are built on quicksand.

Part 1: The Quicksand Problem - Why Most Models Fail on Time Series Data

The statistical tools we have mastered so far, like Ordinary Least Squares (OLS), were designed for cross-sectional data. A core, often unstated, assumption of these models is that the data points are independent and identically distributed (i.i.d.). They assume the "rules of the game" are the same for every observation.

Time series data is fundamentally different. It consists of a sequence of observations on a single entity over multiple time periods. Think of the daily stock price of Apple for the last 10 years. Now, the order is not just important—it is the most crucial piece of information. What happened yesterday directly influences what might happen today.

The Core Analogy: Predicting on Solid Ground vs. Quicksand

Imagine trying to build a forecast model.

  • A **non-stationary** time series is like **quicksand**. The ground is constantly shifting. The average level is changing, and the volatility (the ground's shakiness) is changing. Any patterns you learn today might be completely irrelevant tomorrow. A model built on quicksand will make terrible, unreliable predictions.
  • A **stationary** time series is like **solid bedrock**. The ground is stable. The average level is constant, and the shakiness is consistent. Patterns you learn from the past are likely to be relevant for the future. A model built on bedrock is robust and reliable.

Our primary job as time series analysts is to first test if we are on quicksand. If we are, we must find a way to transform our data until we are standing on solid bedrock before we even think about building a forecasting model.

Part 2: What is Stationarity? The Two Levels of 'Sameness'

Stationarity is the formal property of a time series being "solid bedrock." It means that the statistical properties of the series are not a function of the time at which they are observed. There are two levels of rigor for this definition: Strict and Weak.

2.1 Strict Stationarity: The Impossible Ideal

A time series is **strictly stationary** if its entire joint probability distribution is invariant under a shift in time. This means that for any set of time points $t_1, t_2, \dots, t_k$ and any time shift $h$, the joint distribution of $(Y_{t_1}, \dots, Y_{t_k})$ is the same as the joint distribution of $(Y_{t_1+h}, \dots, Y_{t_k+h})$.

In simple terms: The entire statistical behavior of the process is identical, no matter when you look at it. The probability of observing any sequence of values is the same in the 1950s as it is in the 2020s. This is an incredibly strong condition. A series of i.i.d. random variables (like rolls of a fair die) is strictly stationary, but almost no real-world financial or economic series meets this criterion.

2.2 Weak Stationarity: The Practical Workhorse

For practical purposes, we almost always use a more relaxed definition called **weak-sense stationarity** (or covariance stationarity). We don't need the entire distribution to be constant, only its first two moments (its mean and its variance/covariance structure).

The Three Conditions for Weak Stationarity

A time series $\{Y_t\}$ is weakly stationary if it satisfies these three conditions:

  1. Constant Mean: The expected value of the series is constant for all $t$.
    $$E[Y_t] = \mu \quad (\text{for all } t)$$

    This means the series has no trend. It fluctuates around a constant level.

  2. Constant Variance: The variance of the series is constant and finite for all $t$.
    $$\text{Var}(Y_t) = E[(Y_t - \mu)^2] = \sigma^2 < \infty \quad (\text{for all } t)$$

    This means the series has a constant level of volatility. The fluctuations do not get wider or narrower over time.

  3. Constant Autocovariance: The covariance between any two observations depends only on the time lag $k$ between them, not on the time $t$ at which the covariance is calculated.
    $$\text{Cov}(Y_t, Y_{t-k}) = \gamma_k \quad (\text{for all } t \text{ and any lag } k)$$

    This is the most subtle but important condition. It means the relationship between an observation and its "neighbor" 2 periods ago is the same today as it was 50 years ago. The internal dynamics of the series are stable.

From this point forward in the course, when we say "stationarity," we will be referring to this practical, testable definition of weak stationarity.
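Before turning to formal tests, here is an informal illustration of the three conditions (a minimal sketch; the white-noise series, seed, and half-sample split are arbitrary illustrative choices, not part of the demonstration later in this lesson). For a weakly stationary series, the mean, variance, and lag-1 autocovariance computed on the two halves of the sample should come out roughly the same.

import numpy as np

np.random.seed(1)
y = np.random.randn(1000)  # white noise: weakly stationary by construction

# Split the sample in half and compare the moments across the two halves
first, second = y[:500], y[500:]

# Conditions 1 & 2: the mean and variance should not depend on which half we look at
print(f"Means:     {first.mean():.3f} vs {second.mean():.3f}")
print(f"Variances: {first.var():.3f} vs {second.var():.3f}")

# Condition 3: the lag-1 autocovariance should also be roughly the same in both halves
def lag1_autocov(x):
    x = x - x.mean()
    return np.mean(x[1:] * x[:-1])

print(f"Lag-1 autocovariances: {lag1_autocov(first):.3f} vs {lag1_autocov(second):.3f}")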

Part 3: The Practitioner's Toolkit - How to Test for Stationarity

We have three primary methods for diagnosing stationarity, moving from informal visual checks to formal statistical tests.

Method 1: Visual Inspection (The "Eyeball Test")

The first step is always to plot your data over time. You are looking for obvious violations of the three conditions:

  • Is there a Trend? Does the series appear to be moving consistently upward or downward? If so, the mean is not constant. (Non-stationary)
  • Is there Seasonality? Are there predictable, repeating patterns or cycles? If so, the mean is not constant. (Non-stationary)
  • Is the Variance Changing? Does the "tube" or "band" of fluctuations get wider or narrower? This phenomenon (called heteroskedasticity) means the variance is not constant. (Non-stationary)

Method 2: Summary Statistics (The Rolling Window Test)

We can quantify the visual test by calculating the mean and standard deviation on a rolling window of the data and plotting them. If the series is stationary, these lines should be roughly horizontal.

Method 3: Statistical Hypothesis Tests (The Gold Standard)

A formal test is needed to be rigorous. The most common test is the **Augmented Dickey-Fuller (ADF) test**. The ADF test looks for the presence of a **unit root**. A unit root is the statistical signature of certain types of non-stationary processes, most famously the **Random Walk** (like a stock price).

The Augmented Dickey-Fuller (ADF) Test

The hypotheses for the ADF test are often counter-intuitive:

  • Null Hypothesis (H₀): A unit root is present in the series. The series is **non-stationary**.
  • Alternative Hypothesis (H₁): No unit root is present. The series is **stationary**.

Our goal is to reject the null hypothesis.

The test outputs a p-value. The decision rule is the same as always: "If the p-value is low, the null must go." If we get a p-value below our significance level (e.g., 0.05), we reject H₀ and conclude that our series is stationary.

Part 4: Python Implementation - A Practical Demonstration

Testing for Stationarity in Python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

# --- Create two time series ---
np.random.seed(42)
# 1. A stationary series (white noise)
stationary_series = pd.Series(np.random.randn(500))

# 2. A non-stationary series (a random walk)
non_stationary_series = pd.Series(np.random.randn(500).cumsum())

# --- Define our testing function ---
def stationarity_test(series, series_name=''):
    print(f'--- Stationarity Test for {series_name} ---')
    
    # Calculate rolling statistics
    rolling_mean = series.rolling(window=30).mean()
    rolling_std = series.rolling(window=30).std()

    # Plot rolling statistics
    plt.figure(figsize=(12, 6))
    plt.plot(series, color='blue', label='Original Series')
    plt.plot(rolling_mean, color='red', label='Rolling Mean')
    plt.plot(rolling_std, color='black', label='Rolling Std')
    plt.legend(loc='best')
    plt.title(f'Rolling Mean & Standard Deviation for {series_name}')
    plt.show()

    # Perform Augmented Dickey-Fuller test
    print('Results of Augmented Dickey-Fuller Test:')
    adf_test = adfuller(series, autolag='AIC')
    adf_output = pd.Series(adf_test[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
    for key, value in adf_test[4].items():
        adf_output[f'Critical Value ({key})'] = value
    print(adf_output)
    
    if adf_output['p-value'] <= 0.05:
        print("\n=> Conclusion: The series is likely stationary (p-value is low).")
    else:
        print("\n=> Conclusion: The series is likely non-stationary (p-value is high).")

# --- Run the tests ---
stationarity_test(stationary_series, 'Stationary Series (White Noise)')
print("\n" + "="*50 + "\n")
stationarity_test(non_stationary_series, 'Non-Stationary Series (Random Walk)')
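
When you run these tests, the white-noise series should produce a large negative test statistic and a p-value very close to zero (reject H₀: the series is stationary), while the random walk's test statistic should sit above the critical values with a large p-value (fail to reject H₀: the series is non-stationary). The exact numbers depend on the simulated draw.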

Part 5: The Payoff - Consequences and Connections

5.1 The Horror of Spurious Regression

What happens if you ignore stationarity and run a regression with non-stationary data? You fall into the trap of **spurious regression**. This is when you find a statistically significant relationship (high R², low p-values on coefficients) between two variables that are, in reality, completely unrelated.

This happens because two variables that are independently trending upwards will appear to be correlated, simply because they are both moving in the same direction over time. The OLS model mistakenly treats the trend in one variable as an explanation for the trend in the other. The results are meaningless, as the sketch below illustrates.
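
A minimal sketch of the trap (the seed and sample size are arbitrary, illustrative choices): regress one simulated random walk on another, completely independent one, and inspect the OLS output.

import numpy as np
import statsmodels.api as sm

np.random.seed(0)
n = 500

# Two independent random walks: by construction they have no true relationship
x = np.random.randn(n).cumsum()
y = np.random.randn(n).cumsum()

# Regress one level on the other with OLS
results = sm.OLS(y, sm.add_constant(x)).fit()

print(f"R-squared:        {results.rsquared:.3f}")
print(f"p-value on slope: {results.pvalues[1]:.2e}")

In most runs the slope looks highly "significant" and the R-squared is far from zero, even though the two series were generated independently. This is exactly the spurious-regression trap described above.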

5.2 The Solution: Differencing

The most common method to make a non-stationary series stationary is **differencing**. Instead of modeling the variable's level, we model its change from one period to the next.

$$\Delta Y_t = Y_t - Y_{t-1}$$

For example, while stock *prices* are non-stationary (they follow a random walk), stock *returns* (their period-to-period differences) are generally stationary. This simple transformation is the foundation of nearly all modern financial time series modeling. This is the "I" (Integrated) in the famous ARIMA model we will study soon.
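
Applied to the Part 4 example, this is a one-line transformation (a sketch that reuses the non_stationary_series and stationarity_test defined above):

# First difference of the random walk, dropping the initial NaN
differenced_series = non_stationary_series.diff().dropna()

# Re-run the diagnostics on the differenced series
stationarity_test(differenced_series, 'Differenced Random Walk')

Because the random walk was built by cumulatively summing white noise, its first difference is just that white noise again, so the ADF test should now reject the unit-root null.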

5.3 Connections to Quant Finance & Machine Learning

Quantitative Finance: Stationarity is non-negotiable. All factor models, risk models (like VaR), and pairs trading strategies require stationary inputs. A quant who builds a model on non-stationary price data will create a model that is unstable, unreliable, and likely to lose money once the underlying trend changes.

Machine Learning: While some tree-based models can be more resilient to non-stationarity, most ML models (from linear regression to neural networks) implicitly assume that the relationships learned from the training set will hold in the future. If you train a model on a trending feature, the model will simply learn the trend. It will be excellent at "predicting" the past but will fail completely when the trend inevitably changes. Feature engineering for time series in ML is often a process of transforming non-stationary features into stationary ones.

Summary: Stationarity is Your License to Model
    • Stationarity means a series' statistical properties (mean, variance, autocovariance) are constant over time. It is the assumption of "solid bedrock."
    • Weak Stationarity is the practical definition we use, requiring constant mean, constant variance, and time-independent autocovariance.
    • Detection Toolkit: We use visual inspection, rolling statistics, and the formal **Augmented Dickey-Fuller (ADF) test**.
    • ADF Test Rule: The null hypothesis is non-stationarity. A low p-value (e.g., below 0.05) means you can reject the null and conclude the series is stationary.
    • Consequences of Ignoring: Running regressions on non-stationary data leads to **spurious regression**—finding fake relationships.
    • The Fix: The most common way to achieve stationarity is through **differencing** (e.g., converting prices to returns).

What's Next? Exploring the Structure of Memory

We now have the tools to ensure our time series data is on "solid ground." We can confidently work with stationary series.

The next question is: what is the *structure* of this stationary series? How is an observation today related to an observation yesterday, or the day before? How long does the "memory" of a shock persist?

In the next lesson, we will learn about the two most important diagnostic tools for understanding this internal structure: the **Autocorrelation Function (ACF)** and the **Partial Autocorrelation Function (PACF)**.

Up Next: The Detective's Tools: ACF and PACF