Lesson 1.15: The Ultimate Separation: Statistical Independence

This capstone lesson for Module 1 defines the strongest possible form of non-relationship between two variables. We will prove that independence implies zero correlation, but crucially, demonstrate why zero correlation does NOT imply independence. This distinction is vital for avoiding common fallacies in risk management and machine learning.

Part 1: Defining Independence for Random Variables

We've discussed correlation, a measure of *linear* association. But what if two variables have no association of *any* kind, linear or non-linear? This is statistical independence.

The Core Idea: Two random variables, X and Y, are independent if knowing the value of one gives you absolutely no information about the value of the other. The joint distribution simply becomes the product of the marginal distributions.

Definition: Independence of Random Variables

Two random variables $X$ and $Y$ are independent if and only if their joint distribution factors into the product of their marginals.

  • For Discrete RVs: The JPMF must factor for ALL $x, y$:
    $$p_{X,Y}(x, y) = p_X(x) \cdot p_Y(y)$$
  • For Continuous RVs: The JPDF must factor for ALL $x, y$:
    $$f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)$$
  • The Universal Definition (CDF): The most general definition is that the Joint CDF factors into the product of the marginal CDFs:
    $$F_{X,Y}(x, y) = F_X(x) \cdot F_Y(y)$$
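
To make the discrete version concrete, here is a minimal Python sketch (the joint table is a made-up illustrative example, deliberately constructed so it factors) that checks whether a joint PMF equals the product of its marginals for every cell:

```python
import numpy as np

# Hypothetical joint PMF of two discrete RVs: X indexes rows, Y indexes columns.
# This table was built as an outer product, so it should factor exactly.
joint = np.array([
    [0.06, 0.14, 0.20],
    [0.09, 0.21, 0.30],
])

# Marginals: sum out the other variable.
p_x = joint.sum(axis=1)   # P(X = x)
p_y = joint.sum(axis=0)   # P(Y = y)

# Independence requires p(x, y) = p_X(x) * p_Y(y) for EVERY (x, y) pair.
product_of_marginals = np.outer(p_x, p_y)
independent = np.allclose(joint, product_of_marginals)

print("Joint PMF:\n", joint)
print("Product of marginals:\n", product_of_marginals)
print("Factors for all (x, y)?", independent)   # True for this table
```

Changing any single cell (while keeping the table a valid PMF) breaks the factorization, and the check returns False.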

Part 2: The Critical Relationship: Independence vs. Correlation

Independence ⇒ Zero Correlation
If two variables are independent, their covariance (and thus correlation) MUST be zero. Let's prove it.

Proof: Independence Implies Zero Covariance

Step 1: Start with the covariance formula. Our goal is to show that $\text{Cov}(X,Y) = 0$.

$$\text{Cov}(X,Y) = E[XY] - E[X]E[Y]$$

Step 2: Show that $E[XY] = E[X]E[Y]$ if X and Y are independent. Let's use the continuous case (the discrete proof is analogous with sums).

$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy \cdot f_{X,Y}(x,y) \, dx \, dy$$

Step 3: Apply the definition of independence. If they are independent, $f_{X,Y}(x,y) = f_X(x) f_Y(y)$.

$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy \cdot f_X(x) f_Y(y) \, dx \, dy$$

Step 4: Separate the integrals. We can group the terms involving x and y separately.

$$E[XY] = \left( \int_{-\infty}^{\infty} x f_X(x) \, dx \right) \left( \int_{-\infty}^{\infty} y f_Y(y) \, dy \right)$$

Step 5: Recognize the definitions of E[X] and E[Y].

$$E[XY] = E[X] \cdot E[Y]$$

Step 6: Conclude. Substitute this back into the covariance formula:

$$\text{Cov}(X,Y) = E[X]E[Y] - E[X]E[Y] = 0$$

Since the covariance is 0, the correlation must also be 0.
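
As a quick numerical sanity check (a simulation sketch, not part of the proof; the chosen distributions and sample size are arbitrary), drawing two independent samples and computing their sample covariance should give a value very close to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independently generated samples: X ~ N(0, 1), Y ~ Exponential(1).
x = rng.standard_normal(n)
y = rng.exponential(scale=1.0, size=n)

# Because X and Y were generated independently, both the sample covariance
# and the sample correlation should be near 0 (and shrink as n grows).
cov_xy = np.cov(x, y)[0, 1]
corr_xy = np.corrcoef(x, y)[0, 1]

print(f"sample Cov(X, Y)  ≈ {cov_xy:+.5f}")
print(f"sample Corr(X, Y) ≈ {corr_xy:+.5f}")
```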

Zero Correlation ⇏ Independence (The Converse is FALSE!)
This is one of the most important distinctions in all of statistics. We will now disprove the converse with a counterexample: two variables can be perfectly dependent yet have zero correlation.

The Classic Counterexample: Y = X²

Step 1: Define two variables with a perfect non-linear relationship.

Let $X$ be a random variable with a symmetric distribution around 0, such that $E[X] = 0$ and $E[X^3] = 0$. A standard Normal variable, $X \sim N(0,1)$, is a perfect example.

Now, let $Y = X^2$. Clearly, Y is completely dependent on X. If you know X, you know Y with 100% certainty.

Step 2: Calculate their covariance.

$$\text{Cov}(X,Y) = E[XY] - E[X]E[Y] = E[X \cdot X^2] - E[X]E[X^2] = E[X^3] - E[X]E[X^2]$$

Step 3: Substitute the known expected values.

Since our distribution for X is symmetric around 0, we know $E[X] = 0$ and $E[X^3] = 0$.

$$\text{Cov}(X,Y) = 0 - (0) \cdot E[X^2] = 0$$

Conclusion: We have two variables that are perfectly dependent, yet their covariance (and correlation) is zero. This proves that zero correlation only tells us about the absence of a *linear* relationship, not *any* relationship.
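
You can verify this numerically with the short simulation sketch below (assuming $X \sim N(0,1)$, as in Step 1): the sample correlation between X and $Y = X^2$ is essentially zero, even though Y is a deterministic function of X.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

x = rng.standard_normal(n)   # symmetric around 0, so E[X] = E[X^3] = 0
y = x ** 2                   # Y is completely determined by X

# Linear association: essentially zero.
print(f"Corr(X, Y)   ≈ {np.corrcoef(x, y)[0, 1]:+.4f}")

# The dependence is still there, just non-linear: comparing X^2 with Y
# gives a correlation of exactly 1, since Y = X^2 by construction.
print(f"Corr(X^2, Y) ≈ {np.corrcoef(x**2, y)[0, 1]:+.4f}")
```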

The Payoff: Why This Distinction Matters
    • Quant Finance (i.i.d. Assumption): The most common assumption in financial modeling is that asset returns are "i.i.d." - **independent** and identically distributed. This is a very strong assumption. It implies returns are uncorrelated, but also that there are no non-linear dependencies (like volatility clustering, where a big market move is often followed by another big move). Assuming independence when returns are merely uncorrelated can lead to a severe underestimation of risk (see the simulation sketch after this list).
    • Machine Learning (Naive Bayes): The "Naive" in the Naive Bayes classifier is the assumption that all input features are **independent** given the class. This allows the joint probability to be simplified into a product of marginal probabilities, making computation tractable. This assumption is almost always false in the real world, but the algorithm often works well anyway. Understanding this assumption is key to knowing the model's theoretical limitations.
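
The sketch below illustrates the finance point with a toy ARCH(1)-style volatility-clustering model (the parameters and the model itself are assumed for illustration, not calibrated to any market): the simulated returns are essentially uncorrelated from one day to the next, yet their squared values are clearly correlated, so the returns are uncorrelated but not independent.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Toy ARCH(1)-style returns: today's variance depends on yesterday's squared
# return, so returns are serially uncorrelated but NOT independent.
omega, alpha = 0.1, 0.5
r = np.zeros(n)
var = omega / (1 - alpha)              # start at the unconditional variance
for t in range(1, n):
    var = omega + alpha * r[t - 1] ** 2
    r[t] = np.sqrt(var) * rng.standard_normal()

def lag1_corr(a):
    """Correlation between consecutive elements of a series."""
    return np.corrcoef(a[:-1], a[1:])[0, 1]

print(f"Corr(r_t, r_(t-1))     ≈ {lag1_corr(r):+.4f}")      # ~ 0: uncorrelated
print(f"Corr(r_t^2, r_(t-1)^2) ≈ {lag1_corr(r**2):+.4f}")    # clearly > 0: dependent
```

A risk model that treats these returns as independent because they are uncorrelated would miss exactly this clustering of large moves.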

Congratulations! You Have Completed Module 1

Take a moment to look back at how far you've come. You started with the simple idea of a set and have built a complete, rigorous foundation in probability theory.

You have mastered the language of uncertainty: Events, Axioms, Conditional Probability, Random Variables, PMFs, PDFs, CDFs, Expected Value, Variance, MGFs, Joint Distributions, Covariance, Correlation, and finally, Independence.

What's Next in Your Journey?

You have learned the grammar and the rules. Now it's time to meet the famous characters of this language. In **Module 2: Key Distributions & Asymptotic Theory**, we will take a deep dive into the specific distributions that run the world of statistics—the Normal, Chi-Squared, Student's t, and F distributions—and learn the magic of the Central Limit Theorem.