Negative Binomial Distribution

Modeling the number of trials needed to achieve a specified number of successes.

A Generalization of the Geometric Distribution

The Negative Binomial distribution answers the question: "How many trials will it take to get my r-th success?" It is a generalization of the Geometric distribution, which is just the special case where r = 1.

In finance, a trader might use this to model how many trades it will take to achieve 10 winning trades. A venture capitalist could model how many startups they need to fund to get 3 successful exits.
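To make the trading example concrete, here is a minimal Python sketch that simulates how many trades it takes to reach the 10th winner; the 55% win probability and the 10-win target are purely illustrative assumptions, not values from any real strategy.

```python
import random

def trials_until_r_successes(r, p, rng=random):
    """Count Bernoulli(p) trials until the r-th success occurs."""
    trials, successes = 0, 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

# Hypothetical example: a trader who wins 55% of trades, aiming for 10 winners.
random.seed(0)
samples = [trials_until_r_successes(r=10, p=0.55) for _ in range(5)]
print(samples)  # number of trades needed in each of 5 simulated runs
```

Each run of the loop produces one draw from a Negative Binomial distribution with r = 10 and p = 0.55.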

Interactive Negative Binomial Distribution
Adjust the required number of successes (r) and the probability (p) to see how the distribution changes.
Mean (μ): 10.00
Variance (σ²): 10.00

Core Concepts

Probability Mass Function (PMF)
The PMF gives the probability that the r-th success occurs on exactly the k-th trial.
P(X=k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}

For the r-th success to happen on trial k, two things must be true:

  • In the first k-1 trials, there must have been exactly r-1 successes. The number of ways to arrange those successes is \binom{k-1}{r-1}.
  • The k-th trial itself must be a success (with probability p).

Combining these: any particular arrangement of the first k-1 trials has probability p^{r-1} (1-p)^{(k-1)-(r-1)}, there are \binom{k-1}{r-1} such arrangements, and multiplying by the probability p of the final success gives \binom{k-1}{r-1} p^r (1-p)^{k-r}.
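As a sanity check, this PMF is easy to code directly. The sketch below is a minimal Python version; the values r = 3 and p = 0.5 are arbitrary, and the optional SciPy cross-check is commented out because scipy.stats.nbinom counts the number of failures before the r-th success rather than the total number of trials, so its argument is k - r.

```python
from math import comb

def nbinom_trials_pmf(k, r, p):
    """Probability that the r-th success occurs on exactly the k-th trial."""
    if k < r:
        return 0.0
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)

r, p = 3, 0.5
for k in range(r, r + 6):
    print(k, round(nbinom_trials_pmf(k, r, p), 4))

# Optional cross-check (SciPy parameterizes by failures, i.e. k - r):
# from scipy.stats import nbinom
# assert abs(nbinom_trials_pmf(7, r, p) - nbinom.pmf(7 - r, r, p)) < 1e-12
```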

Key Derivations

Deriving the Mean and Variance
The moments are most intuitively derived by viewing the Negative Binomial as a sum of Geometric random variables.

Deriving the Expected Value (Mean)

Step 1: Decompose into Geometric Variables

Let X be the total number of trials needed to get r successes. We can think of X as the sum of r independent random variables, where each Y_i is the number of trials to get the next success after the previous one.

X = Y_1 + Y_2 + \dots + Y_r

Each Y_i follows a Geometric distribution with success probability p. We know from the Geometric distribution page that E[Y_i] = 1/p.
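The sketch below illustrates this decomposition numerically. The parameters r = 4 and p = 0.3 are arbitrary choices, and NumPy's geometric sampler is used because it returns the number of trials up to and including the first success, matching the Y_i defined here.

```python
import numpy as np

rng = np.random.default_rng(42)
r, p, n_sims = 4, 0.3, 100_000

# Method 1: X as the sum of r independent Geometric(p) variables.
as_sum_of_geometrics = rng.geometric(p, size=(n_sims, r)).sum(axis=1)

# Method 2: X simulated directly by counting trials until the r-th success.
def count_trials(r, p, rng):
    trials = successes = 0
    while successes < r:
        trials += 1
        successes += rng.random() < p
    return trials

direct = np.array([count_trials(r, p, rng) for _ in range(n_sims)])

# The two sample means (and full distributions) should agree closely.
print(as_sum_of_geometrics.mean(), direct.mean())
```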

Step 2: Use Linearity of Expectation

The expectation of a sum is the sum of the expectations.

E[X] = E[Y_1 + Y_2 + \dots + Y_r] = E[Y_1] + E[Y_2] + \dots + E[Y_r]

Step 3: Sum the Geometric Means

Since each Y_i has the same mean, we are just adding 1/p to itself r times.

E[X] = \sum_{i=1}^{r} \frac{1}{p} = r \cdot \frac{1}{p}
Final Mean Formula
E[X] = \frac{r}{p}
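A quick Monte Carlo check of this formula, using the same sum-of-geometrics decomposition (the values r = 5 and p = 0.4 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
r, p = 5, 0.4

# Each row is one draw of X = Y_1 + ... + Y_r with Y_i ~ Geometric(p).
samples = rng.geometric(p, size=(200_000, r)).sum(axis=1)

print(samples.mean())  # sample mean, close to the theoretical value
print(r / p)           # E[X] = r/p = 12.5
```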

Deriving the Variance

We use the same decomposition as above. The variance of a sum of independent random variables is the sum of their variances.

Step 1: Sum the Variances of Geometric Variables

The variance of a Geometric distribution is Var(Y_i) = (1-p)/p^2.

Var(X) = Var(Y_1 + \dots + Y_r) = Var(Y_1) + \dots + Var(Y_r)

Step 2: Final Result

We are adding the same variance to itself r times.

Var(X) = \sum_{i=1}^{r} \frac{1-p}{p^2} = r \cdot \frac{1-p}{p^2}
Final Variance Formula
Var(X) = \frac{r(1-p)}{p^2}
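The same simulation gives a quick check of the variance formula (again with arbitrary r = 5 and p = 0.4):

```python
import numpy as np

rng = np.random.default_rng(1)
r, p = 5, 0.4

samples = rng.geometric(p, size=(200_000, r)).sum(axis=1)

print(samples.var())        # sample variance, close to the theoretical value
print(r * (1 - p) / p**2)   # Var(X) = r(1-p)/p^2 = 18.75
```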

Applications