Lesson 1.6: The Quant's Toolbox: Common Discrete Distributions
We now move from general theory to specific, powerful tools. This lesson introduces the three workhorse distributions for discrete data. Think of these as specialized tools in your statistical toolbox, each designed for a specific type of problem involving counts. Mastering them is essential for modeling real-world discrete phenomena in both finance and ML.
Your First Three Tools
Tool #1: The Binomial Distribution - Counting Successes
When to Use the Binomial Tool (The Four Conditions)
Pick up this tool only if your experiment meets ALL four of these conditions:
- A fixed number of trials ($n$).
- Each trial is independent of the others.
- Each trial has only two possible outcomes (e.g., success/failure, up/down, spam/not spam).
- The probability of success ($p$) is constant for each trial.
Binomial PMF: B(n, p)
Calculates the probability of getting exactly $x$ successes in $n$ trials:

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$$

Where $\binom{n}{x} = \frac{n!}{x!(n-x)!}$ is the combinations formula "n choose x", which counts the number of ways to arrange the $x$ successes among the $n$ trials.
Example: A new classification model has a known accuracy of 90% ($p = 0.9$). If we test it on 10 new samples ($n = 10$), what's the probability it gets exactly 8 correct?
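Plugging these values into the PMF:

$$P(X = 8) = \binom{10}{8}(0.9)^8(0.1)^2 = 45 \times 0.4305 \times 0.01 \approx 0.194$$

So even a 90%-accurate model gets exactly 8 out of 10 right only about 19% of the time; most of the remaining probability mass sits on 9 or 10 correct.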
Binomial Moments (The Tool's Specs)
Expected Value: $E[X] = np$
Variance: $\text{Var}(X) = np(1-p)$
For our example, we'd expect $E[X] = 10 \times 0.9 = 9$ correct classifications on average.
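If you want to sanity-check these specs numerically, SciPy can report the mean and variance for you. A quick sketch using scipy.stats.binom.stats with our example's parameters:

```python
from scipy.stats import binom

# Mean and variance of Binomial(n=10, p=0.9)
mean, var = binom.stats(n=10, p=0.9, moments='mv')
print(mean, var)  # 9.0 0.9 -> matches np and np(1-p)
```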
In Python
In practice, you'll rarely calculate this by hand. You'll use a library like SciPy:
```python
from scipy.stats import binom

# P(X=8) for n=10, p=0.9
prob = binom.pmf(k=8, n=10, p=0.9)
```

Tool #2: The Poisson Distribution - Counting Rare Events
When to Use the Poisson Tool
Use this tool when you are counting events that happen independently and at a constant average rate.
- Events occur with a known average rate ($\lambda$).
- Events are independent of each other.
- You are counting the number of events in a fixed interval (e.g., per hour, per page, per km).
Poisson PMF: Pois(λ)
Calculates the probability of exactly $x$ events occurring in an interval, given an average rate of $\lambda$:

$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$$
Example: A high-frequency trading server crashes on average 2 times per year ($\lambda = 2$). What is the probability of having zero crashes in the next year?
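Plugging $x = 0$ into the PMF:

$$P(X = 0) = \frac{2^0 e^{-2}}{0!} = e^{-2} \approx 0.135$$

So there is only about a 13.5% chance of a crash-free year.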
Poisson Moments (The Tool's Specs)
A beautifully simple property: the mean and variance are the same!
Expected Value: $E[X] = \lambda$
Variance: $\text{Var}(X) = \lambda$
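You can see this mean-equals-variance property empirically by drawing a large sample and comparing the two. A quick sketch using scipy.stats.poisson.rvs and NumPy:

```python
import numpy as np
from scipy.stats import poisson

# Simulate 100,000 yearly crash counts with an average rate of 2
samples = poisson.rvs(mu=2, size=100_000, random_state=42)
print(samples.mean(), samples.var())  # both should be close to 2
```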
In Python
```python
from scipy.stats import poisson

# P(X=0) for lambda=2
prob = poisson.pmf(k=0, mu=2)
```

Tool #3: The Geometric Distribution - The Waiting Machine
When to Use the Geometric Tool
Use this tool when the question is about "how long do I have to wait?"
- Trials are independent.
- Each trial has two outcomes (success/failure) with a constant probability of success $p$.
- You are measuring the trial number of the first success.
Geometric PMF: Geo(p)
Calculates the probability that the first success occurs on the $x$-th trial:

$$P(X = x) = (1-p)^{x-1} p$$

This is intuitive: you need $x - 1$ failures in a row, followed by one success.
Example: A startup pitches to venture capitalists. The probability of securing funding from any single pitch is 10% ($p = 0.1$). What is the probability they get their first "yes" on their 5th pitch?
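Working this through by hand:

$$P(X = 5) = (0.9)^4 \times 0.1 = 0.6561 \times 0.1 \approx 0.066$$

Four rejections in a row, then one acceptance: about a 6.6% chance that the very first "yes" lands on pitch number five.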
Geometric Moments (The Tool's Specs)
Expected Value: $E[X] = \frac{1}{p}$
Variance: $\text{Var}(X) = \frac{1-p}{p^2}$
On average, the startup should expect to make $E[X] = 1/0.1 = 10$ pitches to secure its first funding.
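As with the other tools, SciPy can confirm these specs directly. A quick sketch using scipy.stats.geom.stats:

```python
from scipy.stats import geom

# Mean and variance of Geometric(p=0.1)
mean, var = geom.stats(p=0.1, moments='mv')
print(mean, var)  # 10.0 90.0 -> matches 1/p and (1-p)/p^2
```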
In Python
```python
from scipy.stats import geom

# P(X=5) for p=0.1
prob = geom.pmf(k=5, p=0.1)
```

| Distribution | Key Question It Answers | Parameters |
|---|---|---|
| Binomial | "How many successes in n trials?" | $n$ (trials), $p$ (prob. of success) |
| Poisson | "How many events in a fixed interval?" | $\lambda$ (average rate) |
| Geometric | "How many trials until the first success?" | $p$ (prob. of success) |
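Beyond the PMF, SciPy's distribution objects also expose cumulative probabilities, which answer the "at most" and "at least" versions of these questions. A quick sketch using the cdf and sf (survival function) methods with the same example parameters as above:

```python
from scipy.stats import binom, poisson, geom

# P(at most 8 of 10 classifications correct), Binomial(n=10, p=0.9)
p_at_most_8 = binom.cdf(k=8, n=10, p=0.9)

# P(at least 1 server crash in a year), Poisson(lambda=2); sf(k) = P(X > k)
p_at_least_1 = poisson.sf(k=0, mu=2)

# P(first funding "yes" within the first 5 pitches), Geometric(p=0.1)
p_within_5 = geom.cdf(k=5, p=0.1)
```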
What's Next? The Master Tool for Moments
We've learned the formulas for the mean and variance for these distributions. But where did those formulas come from? How would we calculate higher moments like skewness or kurtosis?
In the next lesson, we introduce a powerful mathematical device called the Moment Generating Function (MGF). It's an elegant "master tool" that can generate all the moments of a distribution for us, and it's essential for the theoretical work that comes later.