Lesson 1.4: From Outcomes to Numbers: PMF & CDF

We've mastered events, but quants and ML models need numbers. This lesson introduces the Random Variable, our translator from real-world outcomes to numerical values. We then learn the two essential tools for describing the probabilities of those numbers: the PMF and the CDF.

Part 1: Random Variables and Probability Functions

1.1 Definition of a Random Variable

A Random Variable ( $X$ ) is a function that maps every possible outcome in the sample space ( $\Omega$ ) to a real number.

Experiment: Flipping two coins. $\Omega = \{ \text{HH, HT, TH, TT} \}$
Random Variable ( $X$ ): Let $X$ be the "Number of Heads."
Mapping: HH → 2, HT → 1, TH → 1, TT → 0.

From now on, we focus on the numerical values (0, 1, 2) rather than the underlying outcomes (HH, HT, TH, TT).
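To make the "function on the sample space" idea concrete, here is a minimal Python sketch of the two-coin mapping. The names `omega`, `number_of_heads`, and `mapping` are illustrative choices, not notation from this lesson.

```python
from itertools import product

# Sample space for two coin flips; each outcome is a string such as "HT".
omega = ["".join(flips) for flips in product("HT", repeat=2)]

def number_of_heads(outcome: str) -> int:
    """The random variable X: maps one outcome to a real number."""
    return outcome.count("H")

# Apply X to every outcome in the sample space.
mapping = {outcome: number_of_heads(outcome) for outcome in omega}
print(mapping)
```

Note that two different outcomes (HT and TH) map to the same number, which is why the probability of a value can combine several outcomes.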

1.2 Discrete vs. Continuous

The type of random variable determines which tools we use.

Discrete Random Variable: Can take only a finite or countably infinite number of values (e.g., counts).
- Example: Number of defaults in a bond portfolio ( $0, 1, 2, \dots$ ).
- Tool: Probability Mass Function (PMF).
Continuous Random Variable: Can take any value within a range (e.g., measurements).
- Example: Daily return of the S&P 500 (any real value within a range, for example between $-0.10$ and $+0.10$ ).
- Tool: Probability Density Function (PDF) (covered in the next lesson).

Part 2: The Probability Mass Function (PMF)

The "Exactly Equal To" Function

The PMF, denoted $p_X(x)$ , answers the question: "What is the probability that our random variable $X$ is exactly equal to some value $x$ ?"

PMF Definition

$p_X(x) = P(X = x)$

Example: Two Coin Flips (Revisited)

Let $X$ be the Number of Heads.

Outcome ( $x$ ): 0 (TT) | 1 (HT, TH) | 2 (HH)
PMF ( $p_X(x)$ ): 0.25 | 0.50 | 0.25

Properties of a PMF:

$0 \le p_X(x) \le 1$ for all $x$ . (Probabilities are between 0 and 100%).
$\sum_{x} p_X(x) = 1$ . (The sum of probabilities for all possible outcomes must be 1).
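The two-coin PMF and both properties can be checked with a short sketch. It assumes fair coins, so the four outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two coin flips.
omega = ["".join(flips) for flips in product("HT", repeat=2)]

# Assumption: fair coins, so every outcome has probability 1/4.
prob_each = Fraction(1, len(omega))

# Count how many outcomes map to each value of X (number of heads).
counts = Counter(outcome.count("H") for outcome in omega)
pmf = {x: n * prob_each for x, n in sorted(counts.items())}

# Property 1: every probability lies in [0, 1].
assert all(0 <= p <= 1 for p in pmf.values())
# Property 2: the probabilities sum to exactly 1.
assert sum(pmf.values()) == 1

print(pmf)
```

Using `Fraction` rather than floats keeps the sum exactly equal to 1, so the check does not depend on rounding.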

Part 3: The Cumulative Distribution Function (CDF)

The "Less Than or Equal To" Function

The CDF, denoted $F_X(x)$ , answers the question: "What is the probability that our random variable $X$ takes on a value less than or equal to $x$ ?"

CDF Definition

$F_X(x) = P(X \le x) = \sum_{t \le x} p_X(t)$

The CDF is the sum of all the probabilities from the PMF up to the value $x$ . It's a running total.
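
As a worked check, applying this running total to the two-coin PMF from Part 2 (0.25, 0.50, 0.25) gives:

$F_X(0) = p_X(0) = 0.25$
$F_X(1) = p_X(0) + p_X(1) = 0.25 + 0.50 = 0.75$
$F_X(2) = p_X(0) + p_X(1) + p_X(2) = 0.75 + 0.25 = 1$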

Example: CDF for a Fair Die Roll

Let $X$ be the face shown by a fair six-sided die, so $p_X(k) = 1/6$ for $k = 1, \dots, 6$ .

$F_X(1) = P(X \le 1) = P(X=1) = 1/6$
$F_X(2) = P(X \le 2) = P(X=1) + P(X=2) = 2/6$
$F_X(3) = P(X \le 3) = 3/6$
...and so on, up to $F_X(6) = 1$ .

For a discrete variable, the CDF is a step function, jumping up at each possible outcome.
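
The running-total view of the die CDF can be sketched in Python as follows. The helper `cdf_at` is a hypothetical name introduced here; it evaluates the CDF at any real number, which makes the step shape visible between the faces.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
pmf = {k: Fraction(1, 6) for k in faces}  # fair six-sided die

# Running total of the PMF gives the CDF at each face.
cdf = dict(zip(faces, accumulate(pmf[k] for k in faces)))

def cdf_at(x: float) -> Fraction:
    """F_X(x) for any real x: sum the PMF over all faces t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

print(cdf)
print(cdf_at(2.5))  # same value as at x = 2: the CDF is flat between faces
```

Between two faces the function stays flat, and at each face it jumps by that face's PMF value of 1/6.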

3.2 Universal Properties of the CDF

The CDF is mathematically required to satisfy four key properties, regardless of whether the variable is discrete or continuous:

Range: $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$ . (It starts at 0 and ends at 1).
Monotonically Non-Decreasing: $F_X(x)$ can never go down. If $a < b$ , then $P(X \le a) \le P(X \le b)$ .
Calculating Intervals: The probability of $X$ falling between $a$ and $b$ is found by subtraction:
$P(a < X \le b) = F_X(b) - F_X(a)$
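Right-Continuous: $\lim_{t \to x^+} F_X(t) = F_X(x)$ . At a jump, the CDF takes the upper value, because $P(X \le x)$ includes the probability mass at $x$ itself. This condition matters for discrete variables; the CDF of a continuous variable has no jumps.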
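
These properties can be spot-checked numerically. The sketch below reuses the fair-die example; the function `cdf`, the interval endpoints, and the evaluation grid are illustrative assumptions, and a finite grid is only a sanity check, not a proof.

```python
from fractions import Fraction

pmf = {k: Fraction(1, 6) for k in range(1, 7)}  # fair six-sided die

def cdf(x):
    """F_X(x) = P(X <= x), summing the PMF over all faces t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

# Interval rule: P(a < X <= b) = F_X(b) - F_X(a).
a, b = 2, 5
via_cdf = cdf(b) - cdf(a)
via_pmf = sum(p for t, p in pmf.items() if a < t <= b)  # faces 3, 4, 5
assert via_cdf == via_pmf

# Limits: 0 below the smallest face, 1 at or above the largest.
assert cdf(-1) == 0 and cdf(7.5) == 1

# Non-decreasing on a grid of points from -1.0 to 7.5.
grid = [x / 2 for x in range(-2, 16)]
values = [cdf(x) for x in grid]
assert all(u <= v for u, v in zip(values, values[1:]))

# Right-continuity at the jump x = 3: the value at 3 matches values
# just to the right, while values just to the left are strictly lower.
assert cdf(3) == cdf(3.5)
assert cdf(2.999) < cdf(3)

print(via_cdf)
```

The interval rule uses a strict inequality on the left: subtracting $F_X(a)$ removes the mass at $a$, so face 2 is excluded and face 5 is included.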

What's Next?

We have now defined the two functions that describe a discrete random variable: the PMF and the CDF. In the next lesson, we formally define and derive the formulas for its most important summary measures: Expected Value (the mean), Variance, and Standard Deviation.
