Lesson 3.8: Applying the Recipe: CIs for Mean and Variance
We now apply the general 'pivotal method' to construct the two most common confidence intervals. We'll see why the t-distribution is the correct tool for the job when estimating a mean, and why the Chi-squared distribution is necessary when estimating variance.
Part 1: Confidence Interval for a Population Mean (μ)
The Real-World Problem: σ is Unknown
In our general derivation, we used the Z-statistic as our pivot. But this pivot, $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, has a fatal flaw: we almost never know the true population standard deviation, $\sigma$.
The only practical solution is to replace $\sigma$ with our sample estimate, $s$. As we learned in Module 2, this substitution changes the distribution of our pivot.
Choosing the Right Tool: The t-distribution
When we use the sample standard deviation in our pivot, we introduce extra uncertainty into our calculation. The t-distribution, with its "fatter tails," is the distribution designed to account for precisely this extra uncertainty.
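The correct pivot for the mean when $\sigma$ is unknown is the **t-statistic**:
$$T = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$$
We follow the exact same algebraic inversion from the previous lesson, but we use the t-pivot and its critical values, $t_{\alpha/2,\,n-1}$, which are larger than the corresponding Z-values (noticeably so for small samples) and therefore give a wider interval.
Starting with $P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{X} - \mu}{s / \sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$ and isolating $\mu$ leads directly to the final formula.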
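To see why the critical value has to change once $s$ stands in for $\sigma$, here is a minimal simulation sketch. It assumes normally distributed data, a sample size of $n = 5$, and tabulated 95% critical values (1.960 for Z, 2.776 for t with 4 degrees of freedom); the function name `coverage` and all parameter values are illustrative choices, not part of the lesson.

```python
import random

# Assumed tabulated two-sided 95% critical values for n = 5 (4 degrees of freedom).
Z_CRIT = 1.960  # standard normal
T_CRIT = 2.776  # t-distribution with 4 degrees of freedom


def coverage(crit, n=5, mu=10.0, sigma=2.0, trials=20000, seed=0):
    """Fraction of simulated intervals x_bar +/- crit * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sample standard deviation with the (n - 1) divisor.
        s = (sum((x - x_bar) ** 2 for x in sample) / (n - 1)) ** 0.5
        half_width = crit * s / n ** 0.5
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


z_cov = coverage(Z_CRIT)
t_cov = coverage(T_CRIT)
print(f"coverage with z critical value: {z_cov:.3f}")
print(f"coverage with t critical value: {t_cov:.3f}")
```

Under these assumptions, the intervals built with the Z critical value should capture $\mu$ noticeably less often than 95% of the time, while the t-based intervals should land close to the nominal level.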
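The t-Confidence Interval for the Population Mean μ
A $100(1-\alpha)\%$ confidence interval for $\mu$ is:
$$\bar{X} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$$
This is one of the most widely used confidence intervals in science and industry.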
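As a worked illustration of the t-interval, the sketch below computes a 95% interval for a small, made-up sample of 10 measurements. The data, the helper name `t_interval`, and the tabulated critical value $t_{0.025,\,9} \approx 2.262$ are assumptions for the example.

```python
import math
import statistics


def t_interval(sample, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, (n - 1) divisor
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical data: 10 measurements of some quantity.
sample = [4.9, 5.1, 5.0, 4.8, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9]

# Assumed tabulated value: t_{0.025, 9} for a 95% interval with n - 1 = 9.
T_CRIT_95_DF9 = 2.262

lower, upper = t_interval(sample, T_CRIT_95_DF9)
print(f"95% CI for the mean: ({lower:.4f}, {upper:.4f})")
```

The critical value is hard-coded only to keep the sketch dependency-free. If SciPy is installed, `scipy.stats.t.ppf(0.975, 9)` should give the same number to more decimal places.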
Part 2: Confidence Interval for a Population Variance (σ²)
To build an interval for variance, we need a different pivot, one that relates our sample variance ($s^2$) to the true population variance ($\sigma^2$).
Choosing the Right Tool: The Chi-Squared (χ²) Distribution
As we learned in Module 2, the distribution that governs the behavior of sample variance (under normality) is the Chi-Squared distribution.
The correct pivot for the variance is:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$
A key feature of the $\chi^2$ distribution is that it is **not symmetric**. This makes the derivation a bit trickier.
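The asymmetry is easy to check empirically. The sketch below simulates the quantity $(n-1)s^2/\sigma^2$ for normal samples of size $n = 5$ and compares its mean and median. The function name `simulate_pivot` and the parameter values are illustrative assumptions.

```python
import random
import statistics


def simulate_pivot(n=5, sigma=2.0, trials=20000, seed=1):
    """Simulate (n - 1) * s^2 / sigma^2 for normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance
        values.append((n - 1) * s2 / sigma ** 2)
    return values


pivot_values = simulate_pivot()
pivot_mean = statistics.fmean(pivot_values)
pivot_median = statistics.median(pivot_values)
print(f"mean of pivot:   {pivot_mean:.3f}")
print(f"median of pivot: {pivot_median:.3f}")
```

A Chi-squared distribution with $n - 1$ degrees of freedom has mean $n - 1$, so the simulated mean should sit near 4. The median falling below the mean reflects the long right tail.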
Derivation: The Chi-Squared Interval for σ²
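Step 1: Define the probability region. Because the distribution is skewed, we need two different critical values to chop off $\alpha/2$ from each tail.
- Lower Critical Value: $\chi^2_{1-\alpha/2,\,n-1}$ (the value with area $\alpha/2$ to its left).
- Upper Critical Value: $\chi^2_{\alpha/2,\,n-1}$ (the value with area $\alpha/2$ to its right).
$$P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$$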
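Step 2: Isolate $\sigma^2$. Since $\sigma^2$ is in the denominator, we must invert all parts of the inequality $\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}$. All three parts are positive, so inverting **reverses the direction** of the inequalities.
$$\frac{1}{\chi^2_{1-\alpha/2,\,n-1}} \ge \frac{\sigma^2}{(n-1)s^2} \ge \frac{1}{\chi^2_{\alpha/2,\,n-1}}$$
Step 3: Solve for $\sigma^2$ and reorder. Multiply by $(n-1)s^2$ and then flip the expression to the standard format (lower bound on the left).
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$$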
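The Confidence Interval for the Population Variance σ²
A $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is:
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right]$$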
Important: Notice how the *upper* Chi-squared critical value appears in the *lower* bound of the interval, and the *lower* critical value forms the *upper* bound. This is a direct result of the inversion in Step 2.
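The swap of critical values is easiest to see in a small computation. The sketch below builds a 95% interval for the variance of a made-up sample of 10 measurements, assuming the data are normal. The data, the helper name `variance_interval`, and the tabulated Chi-squared values for 9 degrees of freedom (2.700 and 19.023) are assumptions for the example.

```python
import statistics


def variance_interval(sample, chi2_lower, chi2_upper):
    """Return (lower, upper) bounds for sigma^2 from the Chi-squared pivot.

    chi2_lower: critical value with area alpha/2 to its left.
    chi2_upper: critical value with area alpha/2 to its right.
    """
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance, (n - 1) divisor
    # Note the swap: the upper critical value produces the lower bound.
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


# Hypothetical data: 10 measurements, assumed to come from a normal population.
sample = [4.9, 5.1, 5.0, 4.8, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9]

# Assumed tabulated 95% critical values for n - 1 = 9 degrees of freedom.
CHI2_LOWER_DF9 = 2.700
CHI2_UPPER_DF9 = 19.023

var_lower, var_upper = variance_interval(sample, CHI2_LOWER_DF9, CHI2_UPPER_DF9)
print(f"sample variance: {statistics.variance(sample):.5f}")
print(f"95% CI for the variance: ({var_lower:.5f}, {var_upper:.5f})")
```

Unlike the t-interval, this interval is not centered on the point estimate: the sample variance sits much closer to the lower bound than to the upper one.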
- CIs for Regression Coefficients: Standard OLS regression output typically reports a 95% CI for each coefficient $\beta_j$. That interval has the same form as this lesson's t-interval: $\hat{\beta}_j \pm t_{\alpha/2,\,df} \cdot SE(\hat{\beta}_j)$, where $df$ is the residual degrees of freedom. It tells you the range of plausible values for the true effect of that variable.
- CIs for Financial Volatility ($\sigma$): A risk manager needs to know the plausible range for an asset's true volatility. They use the Chi-squared method to find the CI for the variance ($\sigma^2$), and then simply **take the square root of both ends of the interval** to get the CI for volatility ($\sigma$). This gives a "best case" and "worst case" for risk.
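Both applications can be sketched in a few lines. The example below is illustrative only: the returns and the (x, y) data are invented, the helper names `volatility_interval` and `slope_interval` are hypothetical, the critical values are tabulated 95% values for the stated degrees of freedom, and the regression part assumes a simple one-predictor model with an intercept (so the residual degrees of freedom are $n - 2$).

```python
import math
import statistics

# Assumed tabulated 95% critical values.
CHI2_LOWER_DF9 = 2.700   # Chi-squared, 9 df, area 0.025 to the left
CHI2_UPPER_DF9 = 19.023  # Chi-squared, 9 df, area 0.025 to the right
T_CRIT_95_DF8 = 2.306    # t, 8 df, two-sided 95%


def volatility_interval(returns, chi2_lower, chi2_upper):
    """CI for sigma: square roots of the Chi-squared CI bounds for sigma^2."""
    n = len(returns)
    s2 = statistics.variance(returns)
    var_lower = (n - 1) * s2 / chi2_upper
    var_upper = (n - 1) * s2 / chi2_lower
    # The square root is increasing, so it maps interval ends to interval ends.
    return math.sqrt(var_lower), math.sqrt(var_upper)


def slope_interval(x, y, t_crit):
    """Slope estimate and t-interval for a one-predictor OLS fit with intercept."""
    n = len(x)
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope, slope - t_crit * se_slope, slope + t_crit * se_slope


# Hypothetical daily returns (n = 10, so 9 degrees of freedom).
returns = [0.012, -0.008, 0.005, -0.015, 0.020, -0.003, 0.007, -0.011, 0.009, -0.006]
vol_lower, vol_upper = volatility_interval(returns, CHI2_LOWER_DF9, CHI2_UPPER_DF9)
print(f"95% CI for volatility: ({vol_lower:.4f}, {vol_upper:.4f})")

# Hypothetical regression data (n = 10, so n - 2 = 8 residual degrees of freedom).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.8, 7.15, 8.95, 11.0, 13.1, 14.9, 17.05, 18.85, 21.1]
slope, slope_lower, slope_upper = slope_interval(x, y, T_CRIT_95_DF8)
print(f"slope = {slope:.4f}, 95% CI: ({slope_lower:.4f}, {slope_upper:.4f})")
```

Taking square roots preserves the confidence level because the event "σ² lies between the two variance bounds" is exactly the same event as "σ lies between their square roots".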
What's Next? From Ranges to Decisions
Confidence intervals are a powerful tool for quantifying our uncertainty about an estimate. They give us a range of plausible values for the truth.
But often, we need to make a firm, binary decision. Is this new drug effective, yes or no? Is this factor's beta equal to zero, yes or no? This requires a more formal decision-making framework.
In the next lesson, we will introduce the language and logic of **Hypothesis Testing**.