The workhorse of quantitative analysis: modeling the relationship between variables.
Linear regression is a technique used to model the relationship between a dependent variable (like a stock's return) and one or more independent variables (like the overall market's return).
Y = β₀ + β₁X + ε
Y is the dependent variable (what you're trying to predict).
X is the independent variable (the predictor).
β₁ (Beta 1) is the slope: how much Y is expected to change for a one-unit change in X.
β₀ (Beta 0) is the intercept: the expected value of Y when X is 0.
ε (epsilon) is the error term (residual): the random noise or unexplained part.
The "best fit" line isn't just an eyeball estimate. It's found using a method called Ordinary Least Squares (OLS). The goal of OLS is to find the specific values for the slope (β₁) and intercept (β₀) that minimize the sum of the squared residuals.
A residual is the vertical distance between an actual data point and the regression line: the error for that specific point. We square these errors so that positive and negative errors don't cancel each other out, and to give more weight to larger errors. As long as X takes at least two distinct values, exactly one line makes this total squared error as small as possible, and OLS finds it.
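To make the least-squares idea concrete, here is a minimal sketch in plain Python. It uses the standard closed-form solution for a single predictor: the slope is the sum of cross-deviations of X and Y divided by the sum of squared deviations of X, and the intercept follows from the two means. The function names (`fit_ols`, `sum_squared_residuals`) and the data points are made up for illustration.

```python
from statistics import fmean


def fit_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = fmean(xs)
    y_bar = fmean(ys)
    # Sum of squared deviations of X; zero means X never varies.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must take at least two distinct values")
    # Sum of cross-deviations of X and Y.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Total squared vertical distance between the points and a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative, made-up data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
# Any other line, such as one with a slightly steeper slope, fits worse.
nudged = sum_squared_residuals(xs, ys, b0, b1 + 0.1)

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"squared error at OLS fit: {best:.4f}, after nudging slope: {nudged:.4f}")
```

Comparing the two totals shows the defining property of OLS: moving the slope away from the fitted value can only increase the sum of squared residuals.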
The Capital Asset Pricing Model (CAPM) models a stock's excess return (its return above the risk-free rate) as a function of the overall market's excess return. The slope of this regression line, known as "Beta" (β), measures the stock's systematic risk. A Beta greater than 1 means the stock is more volatile than the market; a Beta less than 1 means it's less volatile. The intercept, "Alpha" (α), theoretically represents the excess return the stock earns that isn't explained by the market. A positive alpha is the holy grail for portfolio managers.
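As a rough sketch of how Beta and Alpha could be estimated, the snippet below runs the same one-predictor least-squares fit on excess returns. The function name `estimate_capm`, the return series, and the constant per-period risk-free rate are all hypothetical values chosen for illustration. In practice the risk-free rate varies over time and far more observations would be used.

```python
from statistics import fmean


def estimate_capm(stock_returns, market_returns, risk_free_rate):
    """Regress stock excess returns on market excess returns.

    Returns (alpha, beta), where beta is the slope and alpha the intercept.
    Assumes a single constant risk-free rate per period (a simplification).
    """
    if len(stock_returns) != len(market_returns) or len(stock_returns) < 2:
        raise ValueError("need two equal-length return series with at least 2 points")
    stock_excess = [r - risk_free_rate for r in stock_returns]
    market_excess = [r - risk_free_rate for r in market_returns]
    m_bar = fmean(market_excess)
    s_bar = fmean(stock_excess)
    var_m = sum((m - m_bar) ** 2 for m in market_excess)
    if var_m == 0:
        raise ValueError("market returns must vary")
    cov_sm = sum(
        (m - m_bar) * (s - s_bar) for m, s in zip(market_excess, stock_excess)
    )
    beta = cov_sm / var_m            # slope: sensitivity to the market
    alpha = s_bar - beta * m_bar     # intercept: return not explained by the market
    return alpha, beta


# Made-up monthly returns, for illustration only.
market = [0.020, -0.010, 0.030, 0.015, -0.020, 0.010]
stock = [0.031, -0.012, 0.043, 0.020, -0.033, 0.018]
rf = 0.001  # hypothetical constant monthly risk-free rate

alpha, beta = estimate_capm(stock, market, rf)
print(f"alpha = {alpha:.4f} per month, beta = {beta:.2f}")
```

With these invented numbers the estimated Beta comes out above 1, so this hypothetical stock amplifies market moves. With so few observations the Alpha estimate would not be statistically meaningful.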