Lesson 3.9: The Logic of Statistical Decisions

We now move from estimation to decision-making. This lesson introduces the formal 'courtroom' logic of hypothesis testing. We'll define the Null (H₀) and Alternative (H₁) hypotheses, and then explore the two types of errors we can make—Type I (α) and Type II (β)—which represent the fundamental tradeoffs in any decision based on data.

Part 1: The Courtroom of Statistics

The framework of hypothesis testing is a direct parallel to a legal trial. Understanding this analogy is the key to mastering the logic.

The Analogy: Innocent Until Proven Guilty

The Null Hypothesis (H₀): This is the default assumption of "innocence." It's the statement of no effect, no change, no difference. We presume it's true unless overwhelmed by evidence. (e.g., "This new drug has no effect.")
The Alternative Hypothesis (H₁): This is the prosecutor's claim. It's what we are trying to find evidence *for*. It's the statement of a real effect. (e.g., "This new drug is effective.")
The Data: This is the evidence presented in court (DNA, witness testimony, etc.).
The Verdict: The jury never declares the defendant "innocent." They either **"Reject H₀"** (find them guilty) or **"Fail to Reject H₀"** (find them not guilty).

The Asymmetry of Hypothesis Testing

Our goal is to see if we have enough evidence to challenge the default belief (H₀). We never "accept" H₀ or "prove" H₁.

Our conclusion is always one of two things:

Reject H₀: The evidence from our sample is strong enough that data like ours would be very unlikely if the "no effect" theory were true.
Fail to Reject H₀: The evidence was not strong enough. This doesn't mean H₀ is true, just that we couldn't rule it out. (Absence of evidence is not evidence of absence.)

Part 2: The Four Possible Outcomes

Because our decision is based on a random sample, it can be wrong. There are exactly four possible outcomes when we make a decision.

The Confusion Matrix

	H₀ is True (Drug is useless)	H₁ is True (Drug is effective)
Our Decision: Reject H₀	Type I Error (α) "False Positive"	Correct Decision (Power) "True Positive"
Our Decision: Fail to Reject H₀	Correct Decision "True Negative"	Type II Error (β) "False Negative"
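The four cells of the table can be sketched as a small lookup. This is only an illustration: the function name `classify_outcome` and its boolean arguments are hypothetical, and in practice we never observe whether H₀ is actually true.

```python
def classify_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Name the table cell for one (truth, decision) pair."""
    if h0_is_true and reject_h0:
        return "Type I error (false positive)"
    if h0_is_true and not reject_h0:
        return "Correct decision (true negative)"
    if not h0_is_true and reject_h0:
        return "Correct decision (true positive)"
    return "Type II error (false negative)"


# Enumerate all four combinations of truth and decision.
for h0_is_true in (True, False):
    for reject_h0 in (True, False):
        truth = "H0 true" if h0_is_true else "H1 true"
        decision = "reject H0" if reject_h0 else "fail to reject H0"
        print(f"{truth:8} | {decision:18} | {classify_outcome(h0_is_true, reject_h0)}")
```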

Part 3: Defining α, β, and Power

Type I Error: α (alpha)

This is the probability of a "false alarm"—convicting an innocent person.

Significance Level (α)

$$\alpha = P(\text{Reject } H_0 \mid H_0 \text{ is True})$$

The researcher **chooses** $\alpha$ before the experiment (usually 5% or 1%). It sets our tolerance for making a false discovery.
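One way to see what "tolerance for a false discovery" means is to simulate many experiments in which H₀ really is true and count how often we reject it anyway. The sketch below assumes a one-sided z-test of H₀: μ = 0 with a known standard deviation. That setup, the critical value, and all constants (`N`, `SIGMA`, `TRIALS`) are assumptions made for this illustration, not part of the lesson.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05     # significance level, chosen before the experiment
N = 25           # sample size (hypothetical)
SIGMA = 1.0      # known population standard deviation (assumption)
TRIALS = 20_000  # number of simulated experiments

# Reject H0 when the z statistic exceeds this threshold.
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA)


def rejects_h0(sample, sigma=SIGMA, z_crit=Z_CRIT):
    """One-sided z-test of H0: mu = 0 against H1: mu > 0."""
    z = fmean(sample) * sqrt(len(sample)) / sigma
    return z > z_crit


def type_i_error_rate(trials=TRIALS, n=N, seed=0):
    """Fraction of experiments that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    false_alarms = sum(
        rejects_h0([rng.gauss(0.0, SIGMA) for _ in range(n)])
        for _ in range(trials)
    )
    return false_alarms / trials


rate = type_i_error_rate()
print(f"Simulated Type I error rate: {rate:.3f} (chosen alpha = {ALPHA})")
```

Under these assumptions the simulated rate should land close to the chosen $\alpha$, which is the sense in which the researcher controls it.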

Type II Error: β (beta)

This is the probability of a "missed opportunity"—letting a guilty person go free.

Probability of Type II Error (β)

$$\beta = P(\text{Fail to Reject } H_0 \mid H_1 \text{ is True})$$

We don't choose $\beta$ directly. It depends on $\alpha$, the sample size $n$, the true effect size, and the variability of the data.

Statistical Power: 1 - β

High power is what we want from a good statistical test. It's the probability that our test will correctly detect a real effect when there is one.

Definition: Power

$$\text{Power} = 1 - \beta = P(\text{Reject } H_0 \mid H_1 \text{ is True})$$
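As a worked illustration, assume a one-sided z-test of $H_0: \mu = 0$ against $H_1: \mu > 0$ with known standard deviation $\sigma$, sample size $n$, and true mean $\delta > 0$. These symbols and the choice of test are assumptions made for this sketch, not something the lesson has defined. The test rejects $H_0$ when $Z = \bar{X}\sqrt{n}/\sigma$ exceeds the critical value $z_{1-\alpha}$. When the true mean is $\delta$, $Z$ is normal with mean $\delta\sqrt{n}/\sigma$ and variance 1, so:

$$\beta = P(Z \le z_{1-\alpha} \mid \mu = \delta) = \Phi\left(z_{1-\alpha} - \frac{\delta\sqrt{n}}{\sigma}\right)$$

$$\text{Power} = 1 - \beta = 1 - \Phi\left(z_{1-\alpha} - \frac{\delta\sqrt{n}}{\sigma}\right)$$

Here $\Phi$ is the standard normal CDF. For example, with $\alpha = 0.05$ (so $z_{1-\alpha} \approx 1.645$), $n = 25$, $\delta = 0.5$, and $\sigma = 1$, the shift is $\delta\sqrt{n}/\sigma = 2.5$, giving $\beta = \Phi(-0.855) \approx 0.20$ and a power of roughly $0.80$.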

The α / β Tradeoff:

For a fixed sample size and effect size, there is a direct tradeoff. If you lower $\alpha$ (make it harder to convict), you will increase $\beta$ (let more guilty people go free), which reduces the power of your test. The most direct way to reduce both error rates at once is to collect more data ($n \uparrow$); reducing measurement noise has a similar effect.
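The tradeoff can be checked numerically. The sketch below assumes a one-sided z-test with known standard deviation, for which $\beta = \Phi(z_{1-\alpha} - \delta\sqrt{n}/\sigma)$. The test setup, the helper name `beta_one_sided_z`, and the effect size of 0.5 are hypothetical choices for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1
EFFECT = 0.5               # hypothetical true effect size


def beta_one_sided_z(alpha: float, n: int, effect: float, sigma: float = 1.0) -> float:
    """Type II error rate of a one-sided z-test with known sigma (assumed setup)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - effect * sqrt(n) / sigma)


print("Fixed n = 25: lowering alpha raises beta")
for alpha in (0.10, 0.05, 0.01):
    beta = beta_one_sided_z(alpha, n=25, effect=EFFECT)
    print(f"  alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")

print("Fixed alpha = 0.01: raising n lowers beta")
for n in (25, 50, 100):
    beta = beta_one_sided_z(0.01, n=n, effect=EFFECT)
    print(f"  n={n:<4d}  beta={beta:.3f}  power={1 - beta:.3f}")
```

The first loop holds $n$ fixed and tightens $\alpha$; the second holds $\alpha$ fixed and grows $n$, which is the only one of the two moves that lowers $\beta$ without loosening $\alpha$.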

The Payoff: Managing Decision Risk

This framework is how businesses and researchers formally manage the risk of making bad decisions based on data.

A/B Testing: A Type I error means launching a new website feature that doesn't actually work (Cost of False Discovery). A Type II error means failing to launch a feature that would have increased revenue (Cost of Missed Opportunity).
Quantitative Finance: A Type I error means investing real money in a trading strategy that has no real alpha (here "alpha" means excess return, not the significance level), which can lead to large losses. A Type II error means passing on a genuinely profitable strategy (missed profits). Because a Type I error is so costly here, quants often use very strict $\alpha$ levels.

What's Next? The Mechanics of the Verdict

We've set up the courtroom and defined the possible errors. Now, how does the jury actually reach a verdict? How do we quantify the "strength of the evidence" to decide whether to reject H₀?

In the next lesson, we will learn the practical mechanics of hypothesis testing by defining **test statistics, p-values, and critical regions**.
