Lesson 1.6: The Quant's Toolbox: Common Discrete Distributions

We now move from general theory to specific, powerful tools. This lesson introduces the three workhorse distributions for discrete data. Think of these as specialized tools in your statistical toolbox, each designed for a specific type of problem involving counts. Mastering them is essential for modeling real-world discrete phenomena in both finance and ML.

Your First Three Tools

1. The Binomial
The Success Counter
2. The Poisson
The Rare Event Counter
3. The Geometric
The Waiting Machine

Tool #1: The Binomial Distribution - Counting Successes

The Job: Counting successes in a fixed number of trials.

When to Use the Binomial Tool (The Four Conditions)

Pick up this tool only if your experiment meets ALL four of these conditions:

  1. A fixed number of trials (n).
  2. Each trial is independent of the others.
  3. Each trial has only two possible outcomes (e.g., success/failure, up/down, spam/not spam).
  4. The probability of success (p) is constant for each trial.

Binomial PMF: B(n, p)

Calculates the probability of getting exactly x successes in n trials.

P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}

Where \binom{n}{x} is the combinations formula "n choose x", which counts the number of ways to arrange the successes.

Example: A new classification model has a known accuracy of 90% (p = 0.9). If we test it on 10 new samples (n = 10), what's the probability it gets exactly 8 correct?

P(X = 8) = \binom{10}{8} (0.9)^8 (0.1)^{10-8} = 45 \cdot (0.4305) \cdot (0.01) \approx 0.1937
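As a sanity check, the formula can be evaluated term by term with Python's standard library (`math.comb` requires Python 3.8+; the variable names are just for illustration):

```python
from math import comb

n, p, x = 10, 0.9, 8

# Each factor of the binomial PMF, computed explicitly:
ways = comb(n, x)               # C(10, 8) = 45 ways to arrange the successes
p_success = p ** x              # (0.9)^8, probability of the 8 correct answers
p_failure = (1 - p) ** (n - x)  # (0.1)^2, probability of the 2 mistakes

prob = ways * p_success * p_failure
print(round(prob, 4))  # 0.1937
```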

Binomial Moments (The Tool's Specs)

Expected Value: E[X] = np

Variance: Var[X] = np(1-p)

For our example, we'd expect 10 \cdot 0.9 = 9 correct classifications on average.
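You can watch these specs emerge from a quick Monte Carlo simulation. This is a sketch using only the standard library; the sample count and seed are arbitrary choices:

```python
import random
import statistics

random.seed(42)
n, p = 10, 0.9

# Simulate 100,000 runs of "test the model on 10 samples"
# and record the number of successes in each run.
trials = [sum(random.random() < p for _ in range(n)) for _ in range(100_000)]

print(statistics.mean(trials))       # close to np = 9.0
print(statistics.pvariance(trials))  # close to np(1-p) = 0.9
```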

In Python

In practice, you'll never calculate this by hand. You'll use a library like SciPy:

from scipy.stats import binom
# P(X=8) for n=10, p=0.9
prob = binom.pmf(k=8, n=10, p=0.9)

Tool #2: The Poisson Distribution - Counting Rare Events

The Job: Counting events over a fixed interval of time or space.

When to Use the Poisson Tool

Use this tool when you are counting events that happen independently and at a constant average rate.

  1. Events occur with a known average rate (\lambda).
  2. Events are independent of each other.
  3. You are counting the number of events in a fixed interval (e.g., per hour, per page, per km).

Poisson PMF: Pois(λ)

Calculates the probability of x events occurring in an interval, given an average rate of \lambda.

P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}

Example: A high-frequency trading server crashes on average 2 times per year (\lambda = 2). What is the probability of having zero crashes in the next year?

P(X = 0) = \frac{e^{-2} \cdot 2^0}{0!} = e^{-2} \cdot 1 \approx 0.1353
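The PMF translates directly into code. A minimal sketch using only the standard library (the function name is ours, not a library function):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # Direct translation of P(X = x) = e^(-lam) * lam^x / x!
    return exp(-lam) * lam ** x / factorial(x)

# Zero crashes in a year, with an average of 2 crashes per year:
print(round(poisson_pmf(0, 2), 4))  # 0.1353
```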

Poisson Moments (The Tool's Specs)

A beautifully simple property: the mean and variance are the same!

Expected Value: E[X] = \lambda

Variance: Var[X] = \lambda
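The mean-equals-variance property is easy to confirm empirically. A sketch assuming NumPy is available; the sample size and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2

# Draw 200,000 simulated "yearly crash counts" from Pois(2).
samples = rng.poisson(lam, size=200_000)

print(samples.mean())  # close to lambda = 2
print(samples.var())   # also close to lambda = 2
```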

In Python

from scipy.stats import poisson
# P(X=0) for lambda=2
prob = poisson.pmf(k=0, mu=2)

Tool #3: The Geometric Distribution - The Waiting Machine

The Job: Counting the number of trials until the *first* success.

When to Use the Geometric Tool

Use this tool when the question is about "how long do I have to wait?"

  1. Trials are independent.
  2. Each trial has two outcomes (success/failure) with a constant probability of success p.
  3. You are measuring the trial number of the first success.

Geometric PMF: Geo(p)

Calculates the probability that the first success occurs on the x-th trial.

P(X = x) = (1-p)^{x-1} p

This is intuitive: you need x-1 failures, followed by one success.

Example: A startup pitches to venture capitalists. The probability of securing funding from any single pitch is 10% (p = 0.1). What is the probability they get their first "yes" on their 5th pitch?

P(X = 5) = (1-0.1)^{5-1} \cdot 0.1 = (0.9)^4 \cdot 0.1 \approx 0.0656
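Again, the PMF is a one-liner in code. A sketch with the standard library only (the function name is ours):

```python
p = 0.1  # chance of a "yes" on any single pitch

def geometric_pmf(x, p):
    # x - 1 failures, then one success on trial x
    return (1 - p) ** (x - 1) * p

# First "yes" on the 5th pitch:
print(round(geometric_pmf(5, p), 4))  # 0.0656
```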

Geometric Moments (The Tool's Specs)

Expected Value: E[X] = 1/p

Variance: Var[X] = (1-p)/p^2

On average, the startup should expect to make 1/0.1 = 10 pitches to secure their first funding.
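A quick simulation makes this expectation concrete. A sketch; the run count, seed, and function name are arbitrary choices:

```python
import random
import statistics

random.seed(1)
p = 0.1

def pitches_until_yes(p):
    # Count trials up to and including the first success.
    count = 1
    while random.random() >= p:
        count += 1
    return count

waits = [pitches_until_yes(p) for _ in range(100_000)]
print(statistics.mean(waits))  # close to 1/p = 10
```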

In Python

from scipy.stats import geom
# P(X=5) for p=0.1
prob = geom.pmf(k=5, p=0.1)

Toolbox Summary: Which Tool to Use?

    Distribution | Key Question It Answers                    | Parameters
    Binomial     | "How many successes in n trials?"          | n (trials), p (prob. of success)
    Poisson      | "How many events in a fixed interval?"     | \lambda (average rate)
    Geometric    | "How many trials until the first success?" | p (prob. of success)

What's Next? The Master Tool for Moments

We've learned the formulas for the mean and variance for these distributions. But where did those formulas come from? How would we calculate higher moments like skewness or kurtosis?

In the next lesson, we introduce a powerful mathematical device called the Moment Generating Function (MGF). It's an elegant "master tool" that can generate all the moments of a distribution for us, and it's essential for the theoretical work that comes later.