In our last lesson, we discovered the potential dangers of the Normal Equations. Computing `AᵀA` squares the condition number of `A`, which can lead to numerical instability, especially when the columns of our matrix `A` are nearly parallel.
The root of the problem is that the columns of `A` can be "badly behaved." They can point in similar directions, creating a skewed and unstable coordinate system.
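To see how quickly nearly parallel columns cause trouble, here is a small sketch in plain Python. The matrix, the value of `eps`, and the helper `dot` are illustrative choices of ours, not something from the previous lesson. The two columns differ by only a tiny tilt, and forming `AᵀA` loses that tilt entirely to rounding.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

# Two nearly parallel columns of a 2x2 matrix A (illustrative values).
eps = 1e-9
a1 = [1.0, 0.0]
a2 = [1.0, eps]

# A itself is invertible: its determinant is small but nonzero.
det_A = a1[0] * a2[1] - a2[0] * a1[1]

# Entries of the 2x2 matrix AᵀA are dot products of the columns.
g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)

# eps**2 = 1e-18 is below double precision, so g22 rounds to exactly 1.0
# and the computed AᵀA looks singular.
det_gram = g11 * g22 - g12 * g12

print(det_A)     # 1e-09
print(det_gram)  # 0.0
```

The information that distinguished the two columns survived in `A` but vanished in `AᵀA`, so any solver working from `AᵀA` alone can no longer tell them apart.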
What if we could fix this? What if we could take any set of basis vectors (like the columns of `A`) and convert them into a perfect basis: one where every vector is orthogonal to every other vector and has a length of 1?
This is the goal of the Gram-Schmidt Process. It is a beautiful, constructive algorithm that takes a "bad" basis and systematically straightens it out into a "good" orthonormal basis.
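As a preview, here is a minimal sketch of the classical form of Gram-Schmidt in plain Python. The function names `dot` and `gram_schmidt` and the tolerance `tol` are hypothetical choices for illustration, and the sketch assumes the input vectors are linearly independent. Each new vector has its components along the already-built directions subtracted, and what remains is scaled to length 1.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Classical Gram-Schmidt: return an orthonormal list spanning `vectors`.

    `tol` is an assumed cutoff for detecting (near) linear dependence.
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the part of v that lies along each direction found so far.
        for q in basis:
            coeff = dot(v, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < tol:
            raise ValueError("vectors are (nearly) linearly dependent")
        # Scale the remainder to unit length.
        basis.append([wi / norm for wi in w])
    return basis

# Example: two non-orthogonal vectors in the plane.
q1, q2 = gram_schmidt([[3.0, 4.0], [1.0, 0.0]])
print(q1, q2)
```

For this input, `q1` comes out as approximately `[0.6, 0.8]` and `q2` as approximately `[0.8, -0.6]`: both have length 1 and their dot product is zero up to rounding.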