Lesson 3.7: General Construction of Confidence Intervals (CIs)
We now begin 'Act III' of our module: Inference. A single point estimate is our 'best guess,' but it almost never equals the true value exactly. In this lesson, we learn how to build a Confidence Interval (CI), a range of plausible values for the true parameter, and, crucially, how to interpret it correctly.
Part 1: From a Point to a Range
So far, we have focused on finding a **point estimate**, like $\hat{\theta}$. This is our single best guess for the true parameter $\theta$.
The problem? It gives us no sense of our **precision** or **uncertainty**. Is the true $\theta$ likely between 1.1 and 1.3? Or is it between -5.0 and 7.4? Our point estimate is the same in both cases, but our confidence in it is vastly different. A Confidence Interval solves this by providing a "margin of error" around our best guess.
Definition: Confidence Interval
A $100(1-\alpha)\%$ Confidence Interval for a parameter $\theta$ is a random interval, calculated from the sample, which contains the true (unknown) population parameter in $100(1-\alpha)\%$ of repeated experiments.
The #1 Most Important Interpretation
The meaning of "95% confidence" is the most misunderstood concept in introductory statistics.
WRONG INTERPRETATION:
"There is a 95% probability that the true mean is inside my calculated interval [10, 20]."
CORRECT INTERPRETATION:
"I am 95% confident in the *method* I used to construct this interval. If I were to draw 100 different samples and construct 100 intervals, I expect that about 95 of those intervals would capture the true mean $\mu$."
The "Fishing Net" Analogy:
The true parameter is a fixed, stationary fish in a lake. Your confidence interval is a fishing net. A "95% confidence level" means you have a method of throwing the net that will succeed in catching the fish 95% of the time. The probability is in your *method*, not in the location of the fish.
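The "capture rate" reading can be checked by simulation. The sketch below is illustrative only: it assumes normally distributed data with a known standard deviation and uses the interval "sample mean plus or minus critical value times sigma over root n". The function name `coverage_rate` and all parameter values are made up for this example.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(mu=50.0, sigma=10.0, n=25, reps=10_000, level=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean mu.

    Assumes normal data with known sigma (an assumption of this sketch).
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 when level is 0.95.
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_crit * sigma / math.sqrt(n)

    hits = 0
    for _ in range(reps):
        # Throw the net: draw a fresh sample and build a fresh interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The fish (mu) never moves; only the net (the interval) does.
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


print(f"Share of intervals that captured mu: {coverage_rate():.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or does not; the 95% describes the long-run behaviour of the procedure.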
Part 2: The Engine of CIs: The Pivotal Method
How do we construct an interval with this "95% capture rate" property? We need a special tool called a **pivotal quantity**.
A "pivot" is a function of our data and the unknown parameter whose own probability distribution is known and **does not depend on the parameter**.
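Example: The Z-statistic is the perfect pivot for the mean $\mu$ (when $\sigma$ is known).
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Here $\bar{X}$ is the sample mean of $n$ observations and $\sigma$ is the population standard deviation. This quantity follows a $N(0, 1)$ distribution regardless of the true value of $\mu$ (exactly for normally distributed data, approximately for large $n$). This stability is what allows us to build the interval.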
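To see what "does not depend on the parameter" means in practice, the following sketch simulates the Z-statistic for several very different true means. It assumes normal data with a known standard deviation; `simulate_z` and the chosen values are hypothetical names and numbers used only for illustration.

```python
import math
import random
from statistics import fmean, pstdev


def simulate_z(mu, sigma=2.0, n=16, reps=5_000, seed=1):
    """Simulate Z = (xbar - mu) / (sigma / sqrt(n)) for normal data."""
    rng = random.Random(seed)
    zs = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        zs.append((xbar - mu) / (sigma / math.sqrt(n)))
    return zs


# Very different true means, yet the pivot behaves the same each time.
for mu in (-100.0, 0.0, 37.5):
    zs = simulate_z(mu)
    inside = sum(-1.96 <= z <= 1.96 for z in zs) / len(zs)
    print(f"mu={mu:>7}: mean={fmean(zs):+.3f}, sd={pstdev(zs):.3f}, "
          f"share in [-1.96, 1.96]={inside:.3f}")
```

For every choice of the true mean, the simulated pivot should have mean near 0, standard deviation near 1, and roughly 95% of its values between -1.96 and 1.96.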
Derivation: Building an Interval from the Pivot
The process is a clever algebraic inversion.
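Step 1: Start with a probability statement about the pivot. For a 95% interval, we know 95% of Z-statistics will fall between the critical values -1.96 and +1.96.
$$P(-1.96 \le Z \le 1.96) = 0.95$$
Step 2: Substitute the formula for the pivot, $Z = (\bar{X} - \mu) / (\sigma / \sqrt{n})$.
$$P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le 1.96\right) = 0.95$$
Step 3: Isolate the unknown parameter $\mu$ in the middle of the inequality.
Multiply all parts by the standard error $\sigma / \sqrt{n}$:
$$-1.96 \, \frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le 1.96 \, \frac{\sigma}{\sqrt{n}}$$
Subtract $\bar{X}$ from all parts:
$$-\bar{X} - 1.96 \, \frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X} + 1.96 \, \frac{\sigma}{\sqrt{n}}$$
Multiply by -1 (which flips the direction of the inequalities):
$$\bar{X} + 1.96 \, \frac{\sigma}{\sqrt{n}} \ge \mu \ge \bar{X} - 1.96 \, \frac{\sigma}{\sqrt{n}}$$
Step 4: Rearrange to the standard format.
$$P\left(\bar{X} - 1.96 \, \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96 \, \frac{\sigma}{\sqrt{n}}\right) = 0.95$$
This gives us the lower and upper bounds of our 95% confidence interval.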
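The end result of the derivation can be turned into a few lines of code. This is a minimal sketch, assuming the population standard deviation is known; the function name `z_interval_95`, the data values, and `sigma=0.6` are invented for illustration.

```python
import math
from statistics import fmean


def z_interval_95(sample, sigma):
    """95% interval for the mean when sigma is known: xbar +/- 1.96 * sigma / sqrt(n)."""
    n = len(sample)
    xbar = fmean(sample)
    margin = 1.96 * sigma / math.sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin


# Made-up measurements, for illustration only.
data = [12.1, 11.4, 12.8, 11.9, 12.3, 12.0, 11.7, 12.6, 12.2]
lower, upper = z_interval_95(data, sigma=0.6)
print(f"95% CI for the mean: [{lower:.3f}, {upper:.3f}]")
```

The interval is centered on the sample mean, and its half-width is exactly the quantity that was multiplied through in Step 3.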
Part 3: The General Recipe for a Confidence Interval
The General Formula for a Confidence Interval
The structure is almost always the same:
$$\text{Point Estimate} \pm (\text{Critical Value} \times \text{Standard Error})$$
The Three Ingredients
- Point Estimate ($\hat{\theta}$): Your single best guess for the parameter (e.g., $\bar{X}$, $\hat{p}$). This is the center of your interval.
- Standard Error ($SE$): The estimated standard deviation of your estimator's sampling distribution (e.g., $s / \sqrt{n}$, $\sqrt{\hat{p}(1 - \hat{p}) / n}$). This measures the "shakiness" of your estimate. A smaller standard error leads to a narrower, more precise interval.
- Critical Value ($z_{\alpha/2}$ or $t_{\alpha/2}$): A number from a known distribution (Z or t) that determines your level of confidence. A higher confidence level (e.g., 99% vs 95%) requires a larger critical value, resulting in a wider interval. This is the "confidence dial."
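The three ingredients combine into one small function. The sketch below uses a Z critical value only (a t-based version would need a t quantile, which Python's standard library does not provide); the name `confidence_interval` and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Point estimate +/- (critical value * standard error), using a Z critical value."""
    alpha = 1 - level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    margin = z_crit * std_error
    return estimate - margin, estimate + margin


# Turning the "confidence dial": same estimate and standard error, different levels.
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(estimate=12.0, std_error=0.2, level=level)
    print(f"{level:.0%}: [{low:.3f}, {high:.3f}]  width={high - low:.3f}")
```

With the estimate and standard error held fixed, the interval widens as the confidence level rises, which is the trade-off between confidence and precision.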
What's Next? Applying the Recipe
We've now mastered the general theory of how to build a confidence interval using the pivotal method.
In the next lesson, we will apply this general recipe to the two most important parameters we deal with: the population mean ($\mu$) and the population variance ($\sigma^2$). We will derive their specific CI formulas, paying close attention to which pivot (Z, t, or Chi-Squared) is the right tool for each job.
Up Next: Let's Apply the Recipe: Deriving CIs for Mean and Variance