Lesson 1.13: Slicing the Probability Landscape
We've built our 3D 'probability map' with joint distributions. Now, we learn how to extract meaningful insights from it. We'll master taking 'shadows' of the map to get Marginal Distributions and taking 'slices' through it to get Conditional Distributions. This is the formal mathematical engine behind all predictive models.
Part 1: The Setup - Our Probability Map
Let's revisit our joint probability table from the last lesson. This table is our entire universe for this example. It represents a credit risk model with two variables: $X$, the borrower's risk category, and $Y$, the outcome of the loan.
| Joint PMF | Y = 0 (Default) | Y = 1 (Repaid) | Marginal p(x) |
|---|---|---|---|
| X = 0 (High Risk) | 0.15 | 0.25 | 0.40 |
| X = 1 (Low Risk) | 0.05 | 0.55 | 0.60 |
| Marginal p(y) | 0.20 | 0.80 | 1.00 |
Our goal is to understand how to formally extract the marginal and conditional information from this joint distribution.
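To make the table concrete, here is a minimal sketch of it as a NumPy array. The name `joint_pmf` and the row/column ordering are choices for this illustration, not part of the lesson:

```python
import numpy as np

# Joint PMF from the table above: rows index X (0 = High Risk, 1 = Low Risk),
# columns index Y (0 = Default, 1 = Repaid).
joint_pmf = np.array([
    [0.15, 0.25],   # X = 0 (High Risk)
    [0.05, 0.55],   # X = 1 (Low Risk)
])

# Sanity check: a valid joint PMF sums to 1 over all (x, y) pairs.
assert np.isclose(joint_pmf.sum(), 1.0)
```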
Part 2: Marginal Distributions (The Shadows)
Definition: Marginal Distribution (Review)
To find the marginal of X, we sum the joint probabilities across all values of Y:

$$p_X(x) = \sum_y p_{X,Y}(x, y)$$

From our table: the marginal distribution for the Risk Category ($X$) is simply the column of row totals on the right: $p_X(0) = 0.40$ and $p_X(1) = 0.60$.
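Continuing the sketch above, the marginals are just row and column sums of the hypothetical `joint_pmf` array:

```python
# Marginal of X: sum over Y (columns) -> the "shadow" onto the X axis.
p_x = joint_pmf.sum(axis=1)   # array([0.40, 0.60])

# Marginal of Y: sum over X (rows) -> the "shadow" onto the Y axis.
p_y = joint_pmf.sum(axis=0)   # array([0.20, 0.80])
```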
Part 3: Conditional Distributions (The Slices)
Remember the fundamental definition of conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$. The conditional distribution is a direct extension of this idea.
Definition: Conditional Distribution
The conditional distribution of Y given X = x is the joint distribution divided by the marginal distribution of X:

$$p_{Y|X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}$$

For continuous variables, it's the same concept, with densities in place of probabilities:

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
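In code, the discrete formula is just "take the slice, then renormalize." Here is a rough sketch building on the hypothetical `joint_pmf` array above (the helper name is an illustrative choice):

```python
def conditional_y_given_x(joint_pmf, x):
    """p(Y | X = x): take the slice (row) at X = x and renormalize it."""
    row = joint_pmf[x]        # joint probabilities p(X = x, Y = y) for each y
    return row / row.sum()    # divide by the marginal p(X = x)
```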
Example: Calculating the Conditional Distribution of Default
Let's find the distribution of Loan Outcome (Y), GIVEN we know the borrower is High Risk ($X = 0$).
- Isolate the Slice: We only look at the first row of our table, where X=0. The joint probabilities are 0.15 (for Y=0) and 0.25 (for Y=1).
- Find the New Universe: The total probability of this slice is the marginal probability $p_X(0) = 0.40$. This is our new denominator.
- Normalize: We divide each joint probability in the slice by the marginal probability of the slice.
The Conditional Distribution: $p_{Y|X}(0 \mid 0) = \frac{0.15}{0.40} = 0.375$ and $p_{Y|X}(1 \mid 0) = \frac{0.25}{0.40} = 0.625$.
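The same three steps with the hypothetical helper sketched earlier:

```python
p_y_given_high_risk = conditional_y_given_x(joint_pmf, x=0)
print(p_y_given_high_risk)   # [0.375 0.625] -- matches the hand calculation
```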
Part 4: The Payoff - Conditional Expectation
Why do we care about conditional distributions? Because they allow us to calculate **conditional expectations**. This is the mathematical definition of a prediction.
The conditional expectation $E[Y \mid X = x]$ asks: "What is the average value of Y, within the slice where X is fixed at the value x?"
Definition: Conditional Expectation
It's the same expected value formula, but we use the conditional distribution as our probability:

$$E[Y \mid X = x] = \sum_y y \cdot p_{Y|X}(y \mid x)$$
Let's calculate the expected loan outcome (0=Default, 1=Repaid) for each risk category:
- For High Risk (X=0): We use the conditional probabilities we just calculated (0.375 and 0.625): $E[Y \mid X = 0] = 0 \cdot 0.375 + 1 \cdot 0.625 = 0.625$.
- For Low Risk (X=1): First, you would calculate $p_{Y|X}(0 \mid 1) = \frac{0.05}{0.60} \approx 0.083$ and $p_{Y|X}(1 \mid 1) = \frac{0.55}{0.60} \approx 0.917$, giving $E[Y \mid X = 1] \approx 0.917$. (A code sketch follows below.)
The expected outcome changes based on the input. This function, $E[Y \mid X = x]$, is precisely what a linear regression model tries to estimate!
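A short sketch of the conditional expectation for each risk category, assuming the `joint_pmf` array and `conditional_y_given_x` helper defined earlier in this lesson's illustrations:

```python
y_values = np.array([0, 1])   # 0 = Default, 1 = Repaid

for x in (0, 1):
    cond = conditional_y_given_x(joint_pmf, x)
    e_y_given_x = np.dot(y_values, cond)   # sum_y y * p(y | x)
    print(f"E[Y | X = {x}] = {e_y_given_x:.3f}")

# Expected output:
# E[Y | X = 0] = 0.625
# E[Y | X = 1] = 0.917
```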
| Type | Key Question | Analogy |
|---|---|---|
| Joint | What is the probability of (X=x AND Y=y)? | The entire 3D landscape. |
| Marginal | What is the overall probability of (X=x)? | The 2D shadow of the landscape. |
| Conditional | Given X=x, what is the probability of (Y=y)? | A 2D slice through the landscape. |
What's Next? Quantifying the Relationship
We can now fully describe the relationship between two variables using distributions. But it would be useful to have a single number that summarizes the strength and direction of their linear relationship.
The next lesson introduces **Covariance**, a measure of the joint variability of two random variables. It is the crucial ingredient needed to calculate the correlation coefficient and the slope of a regression line.