Lesson 1.4: From Outcomes to Numbers: PMF & CDF

We've mastered events, but Quants and ML models need numbers. This lesson introduces the Random Variable—our translator from real-world outcomes to a numerical value. We then learn the two essential tools for describing the probability of these numbers: the PMF and the CDF.

Part 1: Random Variables and Probability Functions

1.1 Definition of a Random Variable

A Random Variable ($X$) is a function that maps every possible outcome in the sample space ($\Omega$) to a real number.

  • Experiment: Flipping two coins. $\Omega = \{\text{HH}, \text{HT}, \text{TH}, \text{TT}\}$
  • Random Variable ($X$): Let $X$ be the "Number of Heads."
  • Mapping: HH → 2, HT → 1, TH → 1, TT → 0.

From now on, we focus on the numerical values (0, 1, 2) rather than the raw outcomes (HH, HT, TH, TT).
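To make the "random variable is a function" idea concrete, here is a minimal Python sketch of the two-coin mapping. The names `sample_space`, `number_of_heads`, and `mapping` are illustrative choices, not part of the lesson.

```python
from itertools import product

# Sample space for two coin flips: every ordered pair of H/T.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

def number_of_heads(outcome: str) -> int:
    """Random variable X: maps an outcome such as 'HT' to its count of heads."""
    return outcome.count("H")

# Apply X to every outcome in the sample space.
mapping = {outcome: number_of_heads(outcome) for outcome in sample_space}
print(mapping)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
```

Note that the function itself contains no randomness; the randomness lives in which outcome the experiment produces.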

1.2 Discrete vs. Continuous

The type of random variable determines which tools we use.

  • Discrete Random Variable: Can take only a finite or countably infinite number of values (e.g., counts).
    • Example: Number of defaults in a bond portfolio ($0, 1, 2, \dots$).
    • Tool: Probability Mass Function (PMF).
  • Continuous Random Variable: Can take any value within a range (e.g., measurements).
    • Example: Daily return of the S&P 500 (for instance, any real value between $-0.10$ and $+0.10$).
    • Tool: Probability Density Function (PDF) (covered in the next lesson).

Part 2: The Probability Mass Function (PMF)

The "Exactly Equal To" Function
The PMF, denoted as $p_X(x)$, answers the question: "What is the probability that our random variable $X$ is exactly equal to some value $x$?"

PMF Definition

$$p_X(x) = P(X = x)$$

Example: Two Coin Flips (Revisited)

Let $X$ be the Number of Heads.

Value $x$ (outcomes) | 0 (TT) | 1 (HT, TH) | 2 (HH)
PMF $p_X(x)$         | 0.25   | 0.50       | 0.25
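The PMF values for the two-coin example can be reproduced by counting outcomes. The sketch below assumes both coins are fair, so all four outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of two fair coin flips (assumption: fair coins).
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# Count how many outcomes map to each value of X = number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# p_X(x) = (number of outcomes with X = x) / (total number of outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p} = {float(p):.2f}")
```

Using `Fraction` keeps the probabilities exact (1/4, 1/2, 1/4) instead of relying on floating-point rounding.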

Properties of a PMF:

  1. $0 \le p_X(x) \le 1$ for all $x$. (Every probability lies between 0 and 1, i.e., between 0% and 100%.)
  2. $\sum_{x} p_X(x) = 1$. (The probabilities of all possible values must sum to 1.)
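The two properties translate directly into a sanity check. This is a minimal sketch, assuming the PMF is stored as a dictionary from values to probabilities; `is_valid_pmf` and its tolerance are hypothetical helpers, not a standard library function.

```python
import math

def is_valid_pmf(pmf: dict, tol: float = 1e-9) -> bool:
    """Check the two PMF properties: each p in [0, 1] and the total equals 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    # Compare with a small tolerance because floats rarely sum to exactly 1.0.
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

two_coin_pmf = {0: 0.25, 1: 0.50, 2: 0.25}
bad_pmf = {0: 0.25, 1: 0.50, 2: 0.30}  # sums to 1.05, so not a PMF

print(is_valid_pmf(two_coin_pmf))  # True
print(is_valid_pmf(bad_pmf))       # False
```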

Part 3: The Cumulative Distribution Function (CDF)

The "Less Than or Equal To" Function
The CDF, denoted $F_X(x)$, answers the question: "What is the probability that our random variable $X$ takes on a value less than or equal to $x$?"

CDF Definition

$$F_X(x) = P(X \le x) = \sum_{t \le x} p_X(t)$$

The CDF is the sum of all the probabilities from the PMF up to and including the value $x$. It's a running total.

Example: CDF for a Fair Die Roll

Let $X$ be the face shown by a fair six-sided die, so $p_X(x) = 1/6$ for $x = 1, 2, \dots, 6$.

  • $F_X(1) = P(X \le 1) = P(X = 1) = 1/6$
  • $F_X(2) = P(X \le 2) = P(X = 1) + P(X = 2) = 2/6$
  • $F_X(3) = P(X \le 3) = 3/6$
  • ...and so on, up to $F_X(6) = 1$.

For a discrete variable, the CDF is a step function: it jumps up by $p_X(x)$ at each possible value $x$ and stays flat in between.
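The "running total" view of the CDF maps naturally onto a cumulative sum. The following sketch builds the fair-die CDF from its PMF with `itertools.accumulate`; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in faces}

# CDF at each face: running total of the PMF values up to and including x.
cdf = dict(zip(faces, accumulate(pmf[x] for x in faces)))

for x, F in cdf.items():
    print(f"F_X({x}) = {F}")
```

The fractions print in lowest terms, so 2/6 appears as 1/3 and 3/6 as 1/2; the last entry is exactly 1.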

3.2 Universal Properties of the CDF

Every CDF satisfies the following four properties, regardless of whether the variable is discrete or continuous:

  1. Range: $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$. (It starts at 0 and ends at 1.)
  2. Monotonically Non-Decreasing: $F_X(x)$ can never go down. If $a < b$, then $P(X \le a) \le P(X \le b)$.
  3. Calculating Intervals: For $a < b$, the probability of $X$ falling in the interval $(a, b]$ is found by subtraction:
    $P(a < X \le b) = F_X(b) - F_X(a)$
  4. Right-Continuous: The function is always continuous from the right, i.e., $\lim_{t \to x^+} F_X(t) = F_X(x)$. (This matters for discrete variables: at each jump the CDF takes the upper value, because $P(X \le x)$ includes the probability at $x$ itself. For continuous variables the CDF is continuous everywhere, so the condition holds automatically.)
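The interval formula and right-continuity can both be checked on the fair-die example. The sketch below defines the CDF for every real x, not only for the six faces; `die_cdf` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

def die_cdf(x: float) -> Fraction:
    """F_X(x) = P(X <= x) for a fair six-sided die, defined for every real x."""
    if x < 1:
        return Fraction(0)      # no face is <= x
    if x >= 6:
        return Fraction(1)      # every face is <= x
    return Fraction(int(x), 6)  # faces 1..floor(x) are <= x

# Property 3: P(2 < X <= 5) = F(5) - F(2), i.e. the faces 3, 4, 5.
print(die_cdf(5) - die_cdf(2))  # 1/2

# Property 4: approaching x = 3 from the right gives F(3); from the left it does not.
print(die_cdf(3))         # 1/2
print(die_cdf(3 + 1e-9))  # 1/2  (matches F(3): right-continuous)
print(die_cdf(3 - 1e-9))  # 1/3  (the jump of size 1/6 happens at x = 3)
```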

What's Next?

We have defined the two critical functions (PMF and CDF) for describing discrete random variables. We are now ready to study the two most important summaries of a distribution: the Mean and the Variance.

In the next lesson, we will formally define and derive the formulas for Expected Value, Variance, and Standard Deviation.