A practical guide to deciding whether your results are a real breakthrough or just random noise.
We want to know if a new coffee bean actually makes students more alert.
We need to decide if a new trading algorithm truly performs better than our current one.
This is the "skeptic's view" or the "status quo." It assumes there's no real effect or difference.
The Alternative Hypothesis (H₁) is the new idea we're testing. It's the claim we want to find evidence for.
We choose how confident we want to be in our conclusion. The standard is 95% confidence, which means we accept a 5% risk of declaring an effect that isn't really there. This risk is the Significance Level (alpha, α).
For 95% confidence, α = 1 - 0.95 = 0.05
The Bottom Line: Any result with less than a 5% probability of occurring by random chance alone (assuming the null hypothesis is true) will be considered "statistically significant."
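Here's a minimal sketch of that decision rule in Python. `is_significant` is a hypothetical helper written for illustration, not part of any library; the p-value would come from whatever statistical test you run (see the worked examples below).

```python
# A minimal sketch of the significance decision rule.
ALPHA = 0.05  # our chosen significance level

def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True when the result clears the significance bar."""
    return p_value < alpha

print(is_significant(0.02))  # True  -> statistically significant
print(is_significant(0.25))  # False -> could easily be random noise
```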
Finding: The "new coffee" group had an average alertness score 10 points higher.
P-Value: 0.02
Verdict: There's only a 2% chance we'd see a difference this large (or larger) if the coffee had no real effect. Since 0.02 is **less than** our 0.05 significance level, we have a winner!
Conclusion: We reject H₀. The evidence suggests the new coffee really does increase alertness.
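To make this concrete, here's a sketch of how such a comparison might be run with SciPy's two-sample t-test. The alertness scores below are simulated stand-ins, not the actual study data, and the group sizes and spread are assumptions chosen just for illustration:

```python
# Sketch: comparing two groups' alertness scores with a t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=70, scale=8, size=50)    # regular coffee group
treatment = rng.normal(loc=80, scale=8, size=50)  # new coffee group, ~10 pts higher

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the new coffee appears to increase alertness.")
else:
    print("Fail to reject H0: no convincing evidence of an effect.")
```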
Finding: The new algorithm's average daily return was 0.05% higher.
P-Value: 0.25
Verdict: There is a 25% chance we'd see a difference this large (or larger) even if the new algorithm were no better than the old one. Since 0.25 is much greater than 0.05, the evidence is weak.
Conclusion: We fail to reject H₀. We don't have enough evidence to invest in the new algorithm.
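A similar sketch for the trading case. Here a paired t-test (`scipy.stats.ttest_rel`) is one reasonable choice, on the assumption that both algorithms are evaluated over the same trading days; the daily returns are simulated placeholders, not real market data:

```python
# Sketch: paired comparison of two algorithms' daily returns.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
days = 250  # roughly one trading year
old_algo = rng.normal(loc=0.04, scale=1.0, size=days)        # daily % returns
new_algo = old_algo + rng.normal(loc=0.05, scale=1.0, size=days)  # tiny edge, lots of noise

t_stat, p_value = stats.ttest_rel(new_algo, old_algo)
print(f"p-value: {p_value:.2f}")
if p_value < 0.05:
    print("Reject H0: the new algorithm looks genuinely better.")
else:
    print("Fail to reject H0: the edge may just be noise.")
```

A small edge buried in noisy returns is exactly the situation where large p-values are common, which is why the verdict above is so cautious.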
A Type I Error (false positive) happens when you **reject the null hypothesis when it was actually true**. You claimed something special was happening, but it was just a fluke.
A Type II Error (false negative) is the opposite: you **fail to reject the null hypothesis when it was actually false**. You missed a real discovery.
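You can watch Type I errors happen with a quick simulation: when the null hypothesis is true by construction (both groups drawn from the same distribution), a test at α = 0.05 still "discovers" an effect about 5% of the time. The sample sizes and trial count below are arbitrary choices for the sketch:

```python
# Simulating the Type I error rate: H0 is true, yet ~5% of tests
# at alpha = 0.05 will still come back "significant."
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, trials, false_positives = 0.05, 10_000, 0

for _ in range(trials):
    a = rng.normal(size=30)
    b = rng.normal(size=30)  # same distribution: no real effect exists
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1  # a fluke that cleared the significance bar

print(f"Type I error rate: {false_positives / trials:.3f}")  # ~0.05
```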