Orthogonality

The profound implications of being at right angles.

In the last lesson, we discovered the dot product and its connection to the angle between two vectors. We saw that a large positive dot product means two vectors are "aligned," while a large negative one means they are "opposed."

But what about the most interesting case? What happens when the dot product is exactly zero?

From our geometric formula, $v \cdot w = \|v\| \|w\| \cos(\theta)$, the only way for the dot product to be zero (assuming non-zero vectors) is if $\cos(\theta) = 0$. This happens when the angle $\theta$ is exactly 90 degrees.

Vectors that are at a 90-degree angle to each other are called orthogonal. This is the precise mathematical term for "perpendicular."

This simple geometric idea—being at right angles—has profound implications in the world of data. In linear algebra, orthogonality is the mathematical embodiment of independence and non-redundancy.

The Power of Zero: Why Orthogonality Matters

When two vectors are orthogonal, moving along one of them has absolutely no effect on your position relative to the other.

Think of the cardinal directions: North, South, East, and West.

  • The direction "East" is orthogonal to the direction "North."
  • If you walk 10 miles East, how much progress have you made in the Northerly direction? Zero.

The two directions are completely independent. Your movement in one has no component, no "shadow," in the other.

This is what a zero dot product signifies. The projection of one vector onto the other is zero. They don't overlap in any meaningful way.
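
To make the "no shadow" idea concrete, here is a minimal NumPy sketch (the mile count and the choice of axes are purely illustrative) that projects an eastward walk onto the north direction and gets nothing back:

```python
import numpy as np

east = np.array([1.0, 0.0])    # unit vector pointing East
north = np.array([0.0, 1.0])   # unit vector pointing North

walk = 10 * east               # walk 10 miles East

# Projection of the walk onto the North direction:
# proj_north(walk) = (walk . north / ||north||^2) * north
proj = (walk @ north) / (north @ north) * north

print(walk @ north)  # 0.0 -> no progress in the Northerly direction
print(proj)          # [0. 0.] -> the "shadow" on North is the zero vector
```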

The Orthogonality Test
To check if two vectors $v$ and $w$ are orthogonal, simply compute their dot product.

If $v \cdot w = 0$, they are orthogonal.

Let's test $v = [2, 3]$ and $w = [-3, 2]$.

$v \cdot w = (2 \times -3) + (3 \times 2) = -6 + 6 = 0$

Yes, they are orthogonal! Even though it's not obvious from the numbers, these two vectors form a perfect right angle.
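
If you'd rather check this numerically, a quick NumPy version of the same test might look like the sketch below (the helper name `is_orthogonal` is just for illustration); with real floating-point data you rarely land on exactly zero, so a small tolerance is used.

```python
import numpy as np

def is_orthogonal(v, w, tol=1e-10):
    """Return True if v and w are (numerically) orthogonal."""
    return abs(np.dot(v, w)) < tol

v = np.array([2, 3])
w = np.array([-3, 2])

print(np.dot(v, w))         # 0
print(is_orthogonal(v, w))  # True
```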

Orthogonality in Data: Non-Redundant Features
Now, let's translate this geometric idea into the language of data science.

Imagine you're building a machine learning model to predict house prices. You have two features:

  • $f_1$: Size of the house in square feet.
  • $f_2$: Size of the house in square meters.

These two features are almost perfectly correlated. They are redundant. If you know one, you practically know the other. As vectors in "feature space," they would point in almost the exact same direction: the angle between them would be nearly zero and their normalized dot product (cosine similarity) close to 1.

Now consider two different features:

  • $f_1$: Size of the house (sq. feet).
  • $f_3$: Number of parks within a 1-mile radius.

These two features are likely to be far more independent. Knowing the size of a house tells you very little about the number of nearby parks. They provide unique, non-overlapping information. As vectors, they would be close to orthogonal.
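
To see the same contrast in code, here is a rough sketch with invented numbers for five houses: the cosine of the angle between mean-centered feature vectors is exactly 1 for the redundant pair and much closer to 0 for the more independent pair.

```python
import numpy as np

# Toy data for five houses (the numbers are made up purely for illustration)
sq_feet   = np.array([1500., 2200., 900., 3100., 1800.])
sq_meters = sq_feet * 0.0929                 # same information in different units
parks     = np.array([2., 3., 3., 2., 0.])   # parks within a 1-mile radius

def cosine(a, b):
    """Cosine of the angle between two mean-centered feature vectors."""
    a, b = a - a.mean(), b - b.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(sq_feet, sq_meters))  # 1.0    -> same direction, fully redundant
print(cosine(sq_feet, parks))      # ~ -0.12 -> close to orthogonal, mostly new information
```

Mean-centering the features first is what connects "close to orthogonal" to the familiar idea of low correlation: the Pearson correlation of two features is exactly the cosine of the angle between their centered vectors.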

Why is this important for Quants and ML?

Models, especially regression models, can become unstable and unreliable when their input features are highly correlated (redundant). This problem is called multicollinearity.

By transforming our data into a set of orthogonal basis vectors, we can create a set of perfectly non-redundant, independent features. This is the entire goal of powerful techniques like Principal Component Analysis (PCA). PCA finds a new coordinate system for your data where the axes (the new features) are all orthogonal to each other.
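
The sketch below uses NumPy's SVD as a stand-in for a library PCA implementation, on a fabricated dataset: the principal axes it recovers are mutually orthogonal unit vectors, which you can confirm by checking that the matrix of axes times its transpose is the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated dataset: 100 houses, three correlated features
sq_feet   = rng.normal(1800, 400, size=100)
sq_meters = sq_feet * 0.0929 + rng.normal(0, 5, size=100)  # nearly redundant with sq_feet
parks     = rng.integers(0, 6, size=100).astype(float)

X = np.column_stack([sq_feet, sq_meters, parks])
X_centered = X - X.mean(axis=0)

# The principal axes are the right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Each axis has unit length and every pair is orthogonal,
# so Vt @ Vt.T is (numerically) the 3x3 identity matrix.
print(np.round(Vt @ Vt.T, 10))
```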

Orthonormal Bases: The Ideal Coordinate System

We can take the idea of orthogonality one step further. What if we have a set of vectors that are all orthogonal to each other, and they all have a length (L2 norm) of exactly 1?

Such a set of vectors is called an orthonormal set.

The standard basis vectors in 2D, $i = [1, 0]$ and $j = [0, 1]$, are the perfect example.

  • They are orthogonal: $[1, 0] \cdot [0, 1] = (1 \times 0) + (0 \times 1) = 0$.
  • They have a length of 1: $\|[1, 0]\| = 1$ and $\|[0, 1]\| = 1$.

Working with an orthonormal basis is the ideal scenario in linear algebra. Calculations become incredibly simple and stable. The process of taking a regular basis and turning it into an orthonormal one is called the Gram-Schmidt process, a key algorithm we will visit later.
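
As a small preview of that algorithm (not the full treatment), here is a bare-bones Gram-Schmidt sketch in NumPy; the function name `gram_schmidt` and the example vectors are just for illustration:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal set.

    Bare-bones sketch: subtract each vector's components along the
    already-built basis, then normalize what is left.
    """
    basis = []
    for v in vectors:
        w = v.astype(float)
        for q in basis:
            w -= (w @ q) * q          # remove the part of w that lies along q
        basis.append(w / np.linalg.norm(w))
    return basis

q1, q2 = gram_schmidt([np.array([2., 3.]), np.array([1., 0.])])
print(q1 @ q2)                                  # ~0  -> orthogonal
print(np.linalg.norm(q1), np.linalg.norm(q2))   # 1.0, 1.0 -> unit length
```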

Summary: The Power of Perpendicular
  • Orthogonal Vectors: Two vectors are orthogonal if they are at a 90° angle to each other.
  • The Test: Their dot product is exactly zero ($v \cdot w = 0$).
  • The Meaning: Orthogonality represents independence and non-redundancy. The vectors provide unique information and do not overlap.
  • The Goal: Many advanced algorithms (like PCA and QR Decomposition) are fundamentally about transforming a problem into an orthonormal basis, because it makes the math simpler and the solutions more stable.

Understanding orthogonality is not just about geometry; it's about understanding the structure of your data and the relationships between your features.