Orthogonality

The profound implications of being at right angles.

In the last lesson, we discovered the dot product and its connection to the angle between two vectors. We saw that a large positive dot product means two vectors are "aligned," while a large negative one means they are "opposed."

But what about the most interesting case? What happens when the dot product is exactly zero?

From our geometric formula, $v \cdot w = \|v\| \|w\| \cos(\theta)$, the only way for the dot product to be zero (assuming non-zero vectors) is if $\cos(\theta) = 0$. This happens when the angle $\theta$ is exactly 90 degrees.

Vectors that are at a 90-degree angle to each other are called orthogonal. This is the precise mathematical term for "perpendicular."

This simple geometric idea—being at right angles—has profound implications in the world of data. In linear algebra, orthogonality is the mathematical embodiment of independence and non-redundancy.

The Power of Zero: Why Orthogonality Matters

When two vectors are orthogonal, moving along one of them has absolutely no effect on your position relative to the other.

Think of the cardinal directions: North, South, East, and West.

  • The direction "East" is orthogonal to the direction "North."
  • If you walk 10 miles East, how much progress have you made in the Northerly direction? Zero.

The two directions are completely independent. Your movement in one has no component, no "shadow," in the other.

This is what a zero dot product signifies. The projection of one vector onto the other is zero. They don't overlap in any meaningful way.
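
To make the "no shadow" idea concrete, here is a minimal NumPy sketch (the mile count and the choice of axes are purely illustrative) that projects an eastward walk onto the north direction and gets nothing back:

```python
import numpy as np

east = np.array([1.0, 0.0])    # unit vector pointing East
north = np.array([0.0, 1.0])   # unit vector pointing North

walk = 10 * east               # walk 10 miles East

# Projection of the walk onto the North direction:
# proj_north(walk) = (walk . north / ||north||^2) * north
proj = (walk @ north) / (north @ north) * north

print(walk @ north)  # 0.0 -> no progress in the Northerly direction
print(proj)          # [0. 0.] -> the "shadow" on North is the zero vector
```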

The Orthogonality Test
To check if two vectors $v$ and $w$ are orthogonal, simply compute their dot product.

If $v \cdot w = 0$, they are orthogonal.

Let's test $v = [2, 3]$ and $w = [-3, 2]$.

$v \cdot w = (2 \times -3) + (3 \times 2) = -6 + 6 = 0$

Yes, they are orthogonal! Even though it's not obvious from the numbers, these two vectors form a perfect right angle.
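
If you'd rather check this numerically, a quick NumPy version of the same test might look like the sketch below (the helper name `is_orthogonal` is just for illustration); with real floating-point data you rarely land on exactly zero, so a small tolerance is used.

```python
import numpy as np

def is_orthogonal(v, w, tol=1e-10):
    """Return True if v and w are (numerically) orthogonal."""
    return abs(np.dot(v, w)) < tol

v = np.array([2, 3])
w = np.array([-3, 2])

print(np.dot(v, w))         # 0
print(is_orthogonal(v, w))  # True
```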

Orthogonality in Data: Non-Redundant Features
Now, let's translate this geometric idea into the language of data science.

Imagine you're building a machine learning model to predict house prices. You have two features:

  • $f_1$: Size of the house in square feet.
  • $f_2$: Size of the house in square meters.

These two features are almost perfectly correlated. They are redundant. If you know one, you practically know the other. As vectors in "feature space," they would point in almost the exact same direction: the angle between them would be nearly zero and their normalized dot product (cosine similarity) close to 1.

Now consider two different features:

  • $f_1$: Size of the house (sq. feet).
  • $f_3$: Number of parks within a 1-mile radius.

These two features are likely to be far more independent. Knowing the size of a house tells you very little about the number of nearby parks. They provide unique, non-overlapping information. As vectors, they would be close to orthogonal.
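
To see the same contrast in code, here is a rough sketch with invented numbers for five houses: the cosine of the angle between mean-centered feature vectors is exactly 1 for the redundant pair and much closer to 0 for the more independent pair.

```python
import numpy as np

# Toy data for five houses (the numbers are made up purely for illustration)
sq_feet   = np.array([1500., 2200., 900., 3100., 1800.])
sq_meters = sq_feet * 0.0929                 # same information in different units
parks     = np.array([2., 3., 3., 2., 0.])   # parks within a 1-mile radius

def cosine(a, b):
    """Cosine of the angle between two mean-centered feature vectors."""
    a, b = a - a.mean(), b - b.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(sq_feet, sq_meters))  # 1.0    -> same direction, fully redundant
print(cosine(sq_feet, parks))      # ~ -0.12 -> close to orthogonal, mostly new information
```

Mean-centering the features first is what connects "close to orthogonal" to the familiar idea of low correlation: the Pearson correlation of two features is exactly the cosine of the angle between their centered vectors.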

Why is this important for Quants and ML?

Models, especially regression models, can become unstable and unreliable when their input features are highly correlated (redundant). This problem is called multicollinearity.

By transforming our data into a set of orthogonal basis vectors, we can create a set of perfectly non-redundant, independent features. This is the entire goal of powerful techniques like Principal Component Analysis (PCA). PCA finds a new coordinate system for your data where the axes (the new features) are all orthogonal to each other.
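
The sketch below uses NumPy's SVD as a stand-in for a library PCA implementation, on a fabricated dataset: the principal axes it recovers are mutually orthogonal unit vectors, which you can confirm by checking that the matrix of axes times its transpose is the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated dataset: 100 houses, three correlated features
sq_feet   = rng.normal(1800, 400, size=100)
sq_meters = sq_feet * 0.0929 + rng.normal(0, 5, size=100)  # nearly redundant with sq_feet
parks     = rng.integers(0, 6, size=100).astype(float)

X = np.column_stack([sq_feet, sq_meters, parks])
X_centered = X - X.mean(axis=0)

# The principal axes are the right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Each axis has unit length and every pair is orthogonal,
# so Vt @ Vt.T is (numerically) the 3x3 identity matrix.
print(np.round(Vt @ Vt.T, 10))
```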

Orthonormal Bases: The Ideal Coordinate System

We can take the idea of orthogonality one step further. What if we have a set of vectors that are all orthogonal to each other, and they all have a length (L2 norm) of exactly 1?

Such a set of vectors is called an orthonormal set.

The standard basis vectors in 2D, $i = [1, 0]$ and $j = [0, 1]$, are the perfect example.

  • They are orthogonal: $[1, 0] \cdot [0, 1] = (1 \times 0) + (0 \times 1) = 0$.
  • They have a length of 1: $\|[1, 0]\| = 1$ and $\|[0, 1]\| = 1$.

Working with an orthonormal basis is the ideal scenario in linear algebra. Calculations become incredibly simple and stable. The process of taking a regular basis and turning it into an orthonormal one is called the Gram-Schmidt process, a key algorithm we will visit later.
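
As a small preview of that algorithm (not the full treatment), here is a bare-bones Gram-Schmidt sketch in NumPy; the function name `gram_schmidt` and the example vectors are just for illustration:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal set.

    Bare-bones sketch: subtract each vector's components along the
    already-built basis, then normalize what is left.
    """
    basis = []
    for v in vectors:
        w = v.astype(float)
        for q in basis:
            w -= (w @ q) * q          # remove the part of w that lies along q
        basis.append(w / np.linalg.norm(w))
    return basis

q1, q2 = gram_schmidt([np.array([2., 3.]), np.array([1., 0.])])
print(q1 @ q2)                                  # ~0  -> orthogonal
print(np.linalg.norm(q1), np.linalg.norm(q2))   # 1.0, 1.0 -> unit length
```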

Summary: The Power of Perpendicular
  • Orthogonal Vectors: Two vectors are orthogonal if they are at a 90° angle to each other.
  • The Test: Their dot product is exactly zero ($v \cdot w = 0$).
  • The Meaning: Orthogonality represents independence and non-redundancy. The vectors provide unique information and do not overlap.
  • The Goal: Many advanced algorithms (like PCA and QR Decomposition) are fundamentally about transforming a problem into an orthonormal basis, because it makes the math simpler and the solutions more stable.

Understanding orthogonality is not just about geometry; it's about understanding the structure of your data and the relationships between your features.