Lesson 1.7: Measuring Regression Error (MSE, RMSE, R²)

In this final lesson of our foundational module, we complete our evaluation toolkit by mastering the metrics for regression. We'll learn to quantify how 'close' our predictions are using MSE, RMSE, MAE, and R-Squared, with a practical home price prediction example and a full Python implementation.

Part 1: The Goal - Quantifying 'Closeness'

For classification, our metrics revolved around a confusion matrix of "correct" vs. "incorrect" predictions. But for regression, this is meaningless. If a house sold for $500,000 and our model predicts $499,500, is that "wrong"? Technically, yes. But is it useful? Absolutely.

The goal of regression metrics is not to measure if we are right, but to quantify the **magnitude of our error**. We want to know, on average, how far off our predictions are from the true values.

To make this tangible, let's use a simple example: predicting a house's price based on its square footage (`sqft`).

Part 2: The Error-Based Metrics - MSE, RMSE, and MAE

Mean Squared Error (MSE): The Punisher
Business Question: "What is the average of the squared errors of my predictions?"
\text{MSE} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2

How it works: For each house in our test set, it calculates the difference between the actual price ($y_i$) and the predicted price ($\hat{y}_i$), squares that difference, and then averages all these squared differences.

Pros & Cons:

  • Pro: By squaring the error, it heavily penalizes large mistakes. A model that is off by $100k on one house is punished far more than a model that is off by $10k on ten houses. This is often desirable.
  • Con: The units are not interpretable. If our prices are in dollars, the MSE is in "dollars squared," which has no intuitive meaning.
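
To make the squaring concrete, here is a minimal sketch that computes MSE by hand for three hypothetical houses (the prices are invented purely for illustration):

import numpy as np

# Hypothetical actual and predicted prices for three houses
y_true = np.array([500_000, 350_000, 420_000])
y_pred = np.array([480_000, 360_000, 450_000])

errors = y_true - y_pred          # [20000, -10000, -30000]
squared_errors = errors ** 2      # squaring makes every term positive and inflates big misses
mse = squared_errors.mean()       # (4e8 + 1e8 + 9e8) / 3 ≈ 4.67e8 "dollars squared"
print(f"MSE: {mse:,.0f}")
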
Root Mean Squared Error (RMSE): The Interpreter
Business Question: "On average, how many dollars is my model's price prediction off by?"
\text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}

How it works: It's simply the square root of the MSE.

Pros & Cons:

  • Pro: This is the **most popular regression metric** because its units are the same as the target variable. An RMSE of 50,000 on our housing data means a typical prediction is off by roughly $50,000. This is highly interpretable for business stakeholders.
  • Pro: It still retains the property of penalizing large errors due to the underlying squaring.
  • Con: It is more sensitive to outliers than the Mean Absolute Error (MAE).

Mean Absolute Error (MAE): The Robust One
Business Question: "What is the average absolute difference between my predictions and the real prices?"
\text{MAE} = \frac{1}{n}\sum_{i=1}^n |y_i - \hat{y}_i|

How it works: It takes the absolute value of the error for each prediction and averages them.

Pros & Cons:

  • Pro: Like RMSE, its units are interpretable (e.g., an error in dollars).
  • Pro: It is more **robust to outliers** than RMSE. Because it doesn't square the errors, a single massive prediction error will not dominate the metric as much. This is useful if your dataset has extreme, rare outliers that you don't want to overly influence your model's evaluation, as the sketch below demonstrates.
  • Con: The flip side of that robustness is that MAE won't flag a model that makes occasional huge mistakes; if large errors are disproportionately costly to your business, RMSE is the safer choice.
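
A quick way to see this robustness difference is to corrupt one prediction with a huge error and watch how each metric reacts. This sketch uses invented prices, not our housing dataset:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([500_000, 350_000, 420_000, 610_000])
good_pred = np.array([480_000, 360_000, 450_000, 600_000])  # errors of $10k-$30k
bad_pred = good_pred.copy()
bad_pred[-1] = 300_000                                      # one catastrophic $310k miss

for name, pred in [("no outlier ", good_pred), ("one outlier", bad_pred)]:
    rmse = np.sqrt(mean_squared_error(y_true, pred))
    mae = mean_absolute_error(y_true, pred)
    print(f"{name}: RMSE = ${rmse:>10,.0f} | MAE = ${mae:>10,.0f}")

In this toy example, the single bad prediction multiplies the RMSE roughly eightfold (about $19k to $156k) but the MAE only about fivefold (about $17.5k to $92.5k): the squaring inside RMSE amplifies the outlier.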

Part 3: The Proportional Metric - R-Squared (R²)

The metrics above give us an absolute measure of error. But what if we want to know how much better our model is than a really simple baseline? R-Squared, also called the **coefficient of determination**, gives us this proportional measure.

Understanding R-Squared

Business Question: "What percentage of the variation in house prices is explained by my model (using square footage)?"

R^2 = 1 - \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\sum_{i=1}^n (y_i - \bar{y})^2} = 1 - \frac{\text{Model Error (MSE)}}{\text{Baseline Error (Variance)}}

How it works: It compares the error of our sophisticated model (`Model Error`) to the error of a naive "dummy" model that just predicts the average house price (`Baseline Error`) for every single house.

  • An R² of **0.80** means our model has explained 80% of the variance in the house prices: its squared errors are 80% smaller than those of the naive average-guessing model.
  • An R² of **0.0** means our model is no better than just guessing the average price every time.
  • An R² that is **negative** means our model is actively worse than just guessing the average. This is a sign of a very poor model fit.
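
To connect the formula to this baseline intuition, here is a from-scratch R² computation on toy numbers, checked against scikit-learn's r2_score:

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([500_000.0, 350_000.0, 420_000.0, 610_000.0])
y_pred = np.array([480_000.0, 360_000.0, 450_000.0, 600_000.0])

ss_model = np.sum((y_true - y_pred) ** 2)            # squared error of our model
ss_baseline = np.sum((y_true - y_true.mean()) ** 2)  # squared error of always guessing the mean
r2_manual = 1 - ss_model / ss_baseline

print(f"Manual R²:  {r2_manual:.4f}")
print(f"sklearn R²: {r2_score(y_true, y_pred):.4f}")  # matches the manual calculation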

Part 4: Regression Metrics in Python

Let's calculate these metrics for our house price prediction model.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# 1. Create realistic house price data
np.random.seed(42)
# Feature: Square Footage (1000-3500 sqft)
X = np.random.rand(200, 1) * 2500 + 1000
# Label: Price = 50k + $150/sqft + noise
y = 50000 + 150 * X.flatten() + np.random.randn(200) * 40000

# 2. Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# 4. Make predictions
y_pred = model.predict(X_test)

# --- Calculate and Interpret Metrics ---
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("--- Regression Model Evaluation ---")
print(f"Mean Squared Error (MSE): \{mse:,.2f\} (in 'dollars squared')")
print(f"Root Mean Squared Error (RMSE): $\{rmse:,.2f\}")
print(f"Mean Absolute Error (MAE): $\{mae:,.2f\}")
print(f"R-Squared (R²): \{r2:.2f\}")

print("\n--- Interpretation ---")
print(f"On average, our model's price predictions are off by about $\{rmse:,.2f\} (the RMSE).")
print(f"The feature 'Square Footage' explains about \{r2:.0%\} of the variation in house prices in our test set.")

Congratulations! You Have Completed Module 1

You have now built a complete foundational toolkit. This was a challenging but essential module that took you from zero to a competent beginner data scientist.

You have mastered the core concepts and the rules of fair evaluation, practiced the art of data preparation, and built your first classification and regression models; now you have a professional system for scoring them.

What's Next in Your Journey?

We have only scratched the surface of our linear regression model. We've used a single feature, but the real world is complex. To build powerful financial models, we need to incorporate many features at once. It's time to open up the black box.

In **Module 2: Linear Models - The Workhorses of Quant Finance**, we will begin by upgrading our model to **Multiple Linear Regression** and diving deep into the engine of learning itself: **Gradient Descent**.