Lesson 1.1: Introduction to Time Series
Decomposing a time series into its core components to understand its underlying structure.
Part 1: What Makes Time Series Data Special?
In all previous modules, we worked with cross-sectional data. Imagine a snapshot of 1,000 different companies on a single day. The order of the rows in our dataset doesn't matter. Company A is independent of Company B.
Time series data is fundamentally different. It consists of a sequence of observations on a single entity over multiple time periods. Think of the daily stock price of Apple for the last 10 years. Now, the order is the most crucial piece of information. What happened yesterday directly influences what might happen today.
The Defining Characteristic: Temporal Dependence
The core feature that separates time series analysis from other statistical fields is **temporal dependence** (also known as autocorrelation). The value of the series at one point in time is statistically related to its past values.
This "memory" is both a challenge and an opportunity:
- The Challenge: It violates the classical assumption of independent observations, meaning we cannot use standard OLS regression without careful consideration.
- The Opportunity: If the past influences the future, we can build models to forecast it. The "memory" is the signal we will learn to model.
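We can see this "memory" directly with pandas. The sketch below is illustrative (the AR(1) series and the 0.8 coefficient are invented for the demo, not from the lesson's data): a series with temporal dependence has a large lag-1 autocorrelation, while independent noise does not.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 500

# An AR(1) process: each value is 0.8 * (previous value) + fresh noise,
# so the series "remembers" its recent past.
noise = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.8 * ar1[t - 1] + noise[t]

dependent = pd.Series(ar1)
independent = pd.Series(rng.normal(size=n))  # i.i.d. noise for contrast

# Series.autocorr(lag=1): correlation of the series with itself shifted one step.
print(f"AR(1) lag-1 autocorrelation:       {dependent.autocorr(lag=1):.2f}")
print(f"White-noise lag-1 autocorrelation: {independent.autocorr(lag=1):.2f}")
```

The first number lands near the 0.8 coefficient we built in, while the second hovers near zero. That gap is exactly the "signal" the models in this module will exploit.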
Part 2: The Anatomy of a Time Series
Any time series can be thought of as a combination of four components:
- **Trend:** The long-term, underlying direction.
- **Seasonality:** Predictable patterns at fixed intervals (e.g., yearly, weekly).
- **Cyclicality:** Long-term waves with no fixed period (e.g., business cycles).
- **Residual:** The random, unpredictable noise.
We typically assume these components combine in one of two ways:
Additive vs. Multiplicative Models
1. **Additive Model** (`Y_t = T_t + S_t + R_t`): Used when the magnitude of the seasonal fluctuations is roughly constant over time.
2. **Multiplicative Model** (`Y_t = T_t × S_t × R_t`): Used when the magnitude of the seasonal fluctuations grows or shrinks as the trend level rises or falls. This is very common in financial data.
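The difference is easy to see on synthetic data (everything below is invented for illustration): combine the same trend and seasonal pattern both ways and compare the size of the seasonal swing in the first versus the last year.

```python
import numpy as np

t = np.arange(120)                           # 10 years of monthly observations
trend = 100 + 0.5 * t                        # steadily rising level
seasonal = 10 * np.sin(2 * np.pi * t / 12)   # fixed 12-month cycle

additive = trend + seasonal                    # Y_t = T_t + S_t: swings stay the same size
multiplicative = trend * (1 + seasonal / 100)  # Y_t = T_t * S_t: swings scale with the level

# Peak-to-trough range of the first year vs the last year.
print(f"Additive swing:       {np.ptp(additive[:12]):.1f} -> {np.ptp(additive[-12:]):.1f}")
print(f"Multiplicative swing: {np.ptp(multiplicative[:12]):.1f} -> {np.ptp(multiplicative[-12:]):.1f}")
```

In the additive series the yearly swing is identical at the start and the end; in the multiplicative series it widens as the trend climbs. That visual cue (fluctuations fanning out with the level) is the usual signal to pick a multiplicative model.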
Part 3: Practical Decomposition in Python
Example: Decomposing CO2 Data
```python
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Load the dataset (weekly Mauna Loa CO2 readings bundled with statsmodels)
data = sm.datasets.co2.load_pandas().data

# Resample to monthly frequency and forward-fill missing values
y = data['co2'].resample('MS').mean().ffill()

# Perform seasonal decomposition.
# The CO2 seasonal swings stay roughly constant as the level rises,
# so an additive model is the natural choice here.
decomposition = sm.tsa.seasonal_decompose(y, model='additive')

# Plot the results
fig = decomposition.plot()
fig.set_size_inches(10, 8)
plt.suptitle('Additive Decomposition of CO2 Data', y=1.02)
plt.show()
```

This code produces a plot with four panels showing the original series and its extracted Trend, Seasonal, and Residual components.
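Under the hood, `seasonal_decompose` performs classical decomposition: estimate the trend with a centered moving average, average the detrended values by season, and call whatever is left the residual. The sketch below reproduces that recipe on a synthetic monthly series using only pandas (simplified: statsmodels uses a weighted 2×12 average for even periods, while this version uses a plain 12-month window):

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend + fixed 12-month seasonal cycle
idx = pd.date_range("2000-01-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(50 + 0.3 * t + 5 * np.sin(2 * np.pi * t / 12), index=idx)

# Step 1: estimate the trend with a centered 12-month moving average.
trend = y.rolling(window=12, center=True).mean()

# Step 2: detrend, then average the detrended values by calendar month
# to get one seasonal effect per month.
detrended = y - trend
seasonal_means = detrended.groupby(detrended.index.month).mean()
seasonal = pd.Series(seasonal_means[idx.month].to_numpy(), index=idx)

# Step 3: whatever remains is the residual.
resid = y - trend - seasonal

# Sanity check: the components add back up to the original series
# (wherever the rolling trend is defined).
recon = (trend + seasonal + resid).dropna()
print(np.allclose(recon, y.loc[recon.index]))
```

By construction the three components sum exactly back to the original (additive) series; with a multiplicative model the same recipe is applied after dividing by the trend instead of subtracting it.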
What's Next? The Quest for Stability
We have successfully decomposed a time series into its predictable parts and its random part. This is the essential first step of any serious time series analysis.
Why did we do this? Because most of the powerful time series models we are about to learn (like ARMA and ARIMA) have a strict prerequisite: they can only be applied to data that is **stable** and has no trend or seasonality. Such a series is called **stationary**.
In the next lesson, we will formally define and test for Stationarity, the single most important concept in all of time series modeling.