An Introduction to Hypothesis Testing
A practical guide to deciding if your results are a real breakthrough or just random noise.
The Coffee Experiment
We want to know if a new coffee bean actually makes students more alert.
The Trading Algorithm
We need to decide if a new trading algorithm truly performs better than our current one.
On Trial: The Null Hypothesis (H₀)
This is the "skeptic's view" or the "status quo." It assumes there's no real effect or difference.
- ☕ Coffee (H₀): The new coffee has no effect on alertness. Any difference in scores is just random.
- 📈 Trading (H₀): The new algorithm is not better than the old one. Any difference in returns is just market noise.
The Challenger: The Alternative Hypothesis (Hₐ)
This is the new idea we're testing. It's the claim we are looking for evidence to support.
- ☕ Coffee (Hₐ): The new coffee does increase student alertness.
- 📈 Trading (Hₐ): The new algorithm does generate higher average returns.
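Both alternatives above are directional ("increase", "higher"), so they can be read as one-sided tests. A minimal way to write that down, assuming μ_new and μ_old stand for the true average score (or return) with and without the change; these symbols are illustrative and not part of the original cases:

```latex
H_0:\ \mu_{\text{new}} - \mu_{\text{old}} \le 0
\qquad\text{versus}\qquad
H_a:\ \mu_{\text{new}} - \mu_{\text{old}} > 0
```

In practice the p-value is computed at the boundary case, where the difference is exactly zero.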
Our Goal
We choose how confident we want to be in our conclusion. The standard is 95% confidence. This means we accept a 5% risk of declaring an effect real when it is actually just random noise. This risk is the Significance Level (Alpha, α).
For 95% confidence, α = 1 - 0.95 = 0.05
The Bottom Line: Any result whose p-value is below 0.05, meaning a result at least this extreme would show up less than 5% of the time if H₀ were true, will be considered "statistically significant."
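The decision rule can be sketched in a few lines of Python. The function names here are made up for illustration; the only inputs are a confidence level and a p-value.

```python
def alpha_from_confidence(confidence: float) -> float:
    """Significance level implied by a confidence level, e.g. 0.95 -> 0.05."""
    return 1.0 - confidence


def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """True when the p-value falls below the significance level."""
    return p_value < alpha


alpha = alpha_from_confidence(0.95)
for p_value in (0.02, 0.25):
    verdict = "reject H0" if is_significant(p_value, alpha) else "fail to reject H0"
    print(f"p = {p_value}: {verdict}")
```

A p-value of 0.02 falls under the threshold and 0.25 does not, which is the comparison each case result comes down to.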
Case Result: ☕ The Coffee Experiment
Finding: The "new coffee" group had an average alertness score 10 points higher.
P-Value: 0.02
Verdict: If the coffee had no real effect, there would be only a 2% chance of seeing a difference at least this large. Since 0.02 is **less than** our 0.05 significance level, we have a winner!
Conclusion: We reject H₀. The evidence suggests the new coffee really does increase alertness.
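The case does not say how its p-value was computed. One simple way to get a p-value for a comparison like this is a permutation test: shuffle the group labels many times and count how often chance alone produces a gap at least as large as the one observed. The sketch below assumes that approach, and the alertness scores are made up for illustration; they are not the experiment's data.

```python
import random


def average(values):
    return sum(values) / len(values)


def permutation_p_value(treatment, control, n_permutations=5000, seed=0):
    """One-sided p-value: share of label shuffles whose mean difference
    (treatment minus control) is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = average(treatment) - average(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = average(pooled[:n_treat]) - average(pooled[n_treat:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical alertness scores, for illustration only.
new_coffee = [78, 85, 90, 72, 88, 95, 81, 79]
old_coffee = [70, 75, 82, 68, 77, 80, 73, 71]

p = permutation_p_value(new_coffee, old_coffee)
print(f"p-value = {p:.3f}, reject H0 at alpha = 0.05: {p < 0.05}")
```

The test is one-sided because Hₐ claims an increase, not just a difference.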
Case Result: 📈 The Trading Algorithm
Finding: The new algorithm's average daily return was 0.05% higher.
P-Value: 0.25
Verdict: Even if the new algorithm were no better than the old one, there would be a 25% chance of seeing a difference at least this large. Since 0.25 is much greater than 0.05, the evidence is weak.
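Conclusion: We fail to reject H₀. We don't have enough evidence to justify investing in the new algorithm. This does not prove the new algorithm is no better, only that this test did not show it.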
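For daily returns, one common setup is to take the day-by-day difference between the two algorithms and test whether its average is above zero. The sketch below assumes that setup and uses a normal approximation (a z-test); the case itself does not say which test was used, and the return differences here are simulated, not real trading data.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sided_z_p_value(differences):
    """P-value for H0: mean difference <= 0 versus Ha: mean difference > 0,
    using a normal approximation to the sample mean."""
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    z = mean(differences) / standard_error
    return 1.0 - NormalDist().cdf(z)


# Simulated daily return differences (new minus old, in percentage points).
# The 0.05 average edge and 1.0 volatility are assumptions for illustration.
rng = random.Random(1)
daily_differences = [rng.gauss(0.05, 1.0) for _ in range(250)]

p = one_sided_z_p_value(daily_differences)
print(f"p-value = {p:.3f}, reject H0 at alpha = 0.05: {p < 0.05}")
```

A small average edge is easily drowned out by day-to-day volatility, which is why a difference like 0.05% can fail to reach significance.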
Type I Error: The False Alarm 🚨
This happens when you **reject the null hypothesis when it was actually true**. You claimed something special was happening, but it was just a fluke.
- Coffee: We buy a massive supply of the "miracle" coffee, but it has no real effect.
- Trading: We switch to the new "genius" algorithm and lose money because its past performance was just luck.
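A quick simulation shows why the false alarm rate is tied to α. In the sketch below, both groups are drawn from the same distribution, so H₀ is true by construction and every rejection is a Type I error. The sample sizes, the normal distribution, and the z-test are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def false_alarm_rate(n_experiments=2000, n_per_group=50, alpha=0.05, seed=0):
    """Share of simulated experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the same distribution, so any gap is a fluke.
        new = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        old = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        standard_error = sqrt((variance(new) + variance(old)) / n_per_group)
        z = (mean(new) - mean(old)) / standard_error
        p_value = 1.0 - NormalDist().cdf(z)  # one-sided: Ha says "new is higher"
        if p_value < alpha:
            rejections += 1
    return rejections / n_experiments


print(f"False alarm rate: {false_alarm_rate():.3f}")
```

The printed rate should land close to the chosen α: with a 0.05 threshold, roughly one in twenty no-effect experiments still looks like a win.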
Type II Error: The Missed Opportunity 🤦
This is the opposite: you **fail to reject the null hypothesis when it was actually false**. You missed a real discovery.
- Coffee: We dismiss the new coffee, but it actually worked and we missed out.
- Trading: We don't adopt the new algorithm, but it was genuinely better and we missed out on profits.
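How often a real effect gets missed depends heavily on sample size. The sketch below simulates experiments where the new option truly is better and counts how often the test still fails to reject H₀. The effect size (half a standard deviation), the group sizes, and the z-test are assumptions for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def missed_opportunity_rate(true_effect, n_per_group,
                            n_experiments=1000, alpha=0.05, seed=0):
    """Share of simulated experiments that fail to reject H0
    even though the new group really is better by `true_effect`."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_experiments):
        new = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        old = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        standard_error = sqrt((variance(new) + variance(old)) / n_per_group)
        z = (mean(new) - mean(old)) / standard_error
        p_value = 1.0 - NormalDist().cdf(z)  # one-sided test
        if p_value >= alpha:
            misses += 1  # real effect, but not detected: a Type II error
    return misses / n_experiments


for n in (20, 100):
    rate = missed_opportunity_rate(true_effect=0.5, n_per_group=n)
    print(f"n = {n:>3} per group: missed the real effect in {rate:.0%} of runs")
```

Larger samples shrink the noise around the average, so the same real effect is missed far less often.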