Lesson 1.4: From Outcomes to Numbers: PMF & CDF
We've mastered events, but quants and ML models need numbers. This lesson introduces the Random Variable, our translator from real-world outcomes to numerical values. We then learn the two essential tools for describing the probabilities of those values: the PMF and the CDF.
Part 1: Random Variables and Probability Functions
1.1 Definition of a Random Variable
A Random Variable ($X$) is a function that maps every possible outcome in the sample space ($\Omega$) to a real number.
- Experiment: Flipping two coins.
- Random Variable ($X$): Let $X$ be the "Number of Heads."
- Mapping: HH → 2, HT → 1, TH → 1, TT → 0.
From now on, we focus on the numerical values (0, 1, 2) rather than the underlying outcomes (HH, HT, TH, TT).
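The mapping above can be sketched in a few lines of Python. This is only an illustration: the names `sample_space` and `number_of_heads` are made up for this example and do not come from any library.

```python
from itertools import product

# Sample space for two coin flips: every ordered pair of H/T.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

def number_of_heads(outcome: str) -> int:
    """Random variable X: maps an outcome such as 'HT' to its count of heads."""
    return outcome.count("H")

# Apply X to every outcome in the sample space.
mapping = {outcome: number_of_heads(outcome) for outcome in sample_space}
print(mapping)
```

Note that the random variable is an ordinary deterministic function; the randomness lives entirely in which outcome the experiment produces.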
1.2 Discrete vs. Continuous
The type of random variable determines which tools we use.
- Discrete Random Variable: Can take only a finite or countably infinite number of values (e.g., counts).
- Example: Number of defaults in a bond portfolio ($X \in \{0, 1, 2, \dots\}$).
- Tool: Probability Mass Function (PMF).
- Continuous Random Variable: Can take any value within a range (e.g., measurements).
- Example: Daily return of the S&P 500 (in principle, any value from $-100\%$ to $+\infty$).
- Tool: Probability Density Function (PDF) (covered in a later lesson).
Part 2: The Probability Mass Function (PMF)
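PMF Definition
The PMF of a discrete random variable $X$, written $p_X(x)$, gives the probability that $X$ takes exactly the value $x$: $p_X(x) = P(X = x)$.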
Example: Two Coin Flips (Revisited)
Let $X$ be the Number of Heads.
Value ($x$): 0 (TT) | 1 (HT, TH) | 2 (HH)
PMF ($p_X(x)$): 0.25 | 0.50 | 0.25
Properties of a PMF:
- $0 \le p_X(x) \le 1$ for all $x$. (Probabilities are between 0 and 100%.)
- $\sum_x p_X(x) = 1$. (The sum of probabilities for all possible outcomes must be 1.)
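As a quick check, the two-coin PMF can be built by counting outcomes and then tested against both properties. This sketch assumes fair, independent coins, so that all four outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips (assumption: fair coins).
outcomes = list(product("HT", repeat=2))

# Count how many outcomes map to each value of X = number of heads.
counts = Counter(flips.count("H") for flips in outcomes)

# PMF: p_X(x) = (number of outcomes with X = x) / (total number of outcomes).
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

# Property 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in pmf.values())
# Property 2: the probabilities sum to exactly 1.
assert sum(pmf.values()) == 1

for x, p in pmf.items():
    print(f"p_X({x}) = {p}")
```

Using `Fraction` rather than floats keeps the sum exact, so the second check does not need a tolerance.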
Part 3: The Cumulative Distribution Function (CDF)
CDF Definition
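The CDF, $F_X(x) = P(X \le x)$, gives the probability that $X$ is at most $x$. For a discrete variable it is the sum of all the probabilities from the PMF up to and including the value $x$: $F_X(x) = \sum_{t \le x} p_X(t)$. It's a running total.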
Example: CDF for a Fair Die Roll
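- Let $X$ be the result of one roll, so $p_X(x) = 1/6$ for $x = 1, \dots, 6$.
- $F_X(1) = P(X \le 1) = 1/6$
- $F_X(2) = P(X \le 2) = 2/6$
- ...and so on, up to $F_X(6) = P(X \le 6) = 1$.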
For a discrete variable, the CDF is a step function, jumping up at each possible outcome.
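The running-total idea can be sketched for a fair six-sided die as follows. The helper `cdf` is a hypothetical name for this example; it evaluates $P(X \le x)$ at any real $x$, which makes the step behavior visible between the faces.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
# Fair die (assumption): each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in faces}

# Running total of the PMF, evaluated at each face.
cdf_at_faces = dict(zip(faces, accumulate(pmf[x] for x in faces)))

def cdf(x: float) -> Fraction:
    """F_X(x) = P(X <= x) for any real x, not just the die faces."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in (0, 1, 2.5, 3, 6, 10):
    print(f"F_X({x}) = {cdf(x)}")
```

Between faces the function stays flat (for example, `cdf(2.5)` equals `cdf(2)`), and it jumps by 1/6 at each face.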
3.2 Universal Properties of the CDF
The CDF is mathematically required to satisfy four key properties, regardless of whether the variable is discrete or continuous:
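- Range: $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to +\infty} F_X(x) = 1$. (It starts at 0 and ends at 1.)
- Monotonically Non-Decreasing: $F_X(x)$ can never go down. If $a < b$, then $F_X(a) \le F_X(b)$.
- Calculating Intervals: The probability of falling between $a$ and $b$ is found by subtraction: $P(a < X \le b) = F_X(b) - F_X(a)$.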
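- Right-Continuous: The function is always continuous from the right. This technical condition matters for discrete variables: at each jump, $F_X$ takes the upper value, because $F_X(x) = P(X \le x)$ includes the probability at $x$ itself. For continuous variables the CDF has no jumps, so the condition holds automatically.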
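These properties can be spot-checked numerically on the fair-die example. The sketch below is a sanity check on a finite grid of points, not a proof, and the helper names `cdf` and `interval_prob` are made up for illustration. It uses the interval rule $P(a < X \le b) = F_X(b) - F_X(a)$.

```python
from fractions import Fraction

# Fair six-sided die (assumption): each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x: float) -> Fraction:
    """F_X(x) = P(X <= x)."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

def interval_prob(a: float, b: float) -> Fraction:
    """P(a < X <= b), computed by subtracting CDF values."""
    return cdf(b) - cdf(a)

# Range: 0 far to the left of the support, 1 far to the right.
assert cdf(-100) == 0 and cdf(100) == 1

# Monotonically non-decreasing on a grid of half-integer points.
grid = [k / 2 for k in range(-2, 16)]
values = [cdf(x) for x in grid]
assert all(lo <= hi for lo, hi in zip(values, values[1:]))

# Interval rule agrees with summing the PMF directly over 2 < x <= 5.
direct = sum(p for value, p in pmf.items() if 2 < value <= 5)
assert interval_prob(2, 5) == direct

# Right-continuity at the jump x = 3: the CDF already has the upper value there.
assert cdf(3) == Fraction(1, 2)
assert cdf(2.999) == Fraction(1, 3)

print(interval_prob(2, 5))
```

The half-open interval matters for discrete variables: `interval_prob(2, 5)` excludes the face 2 but includes the face 5.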
What's Next?
We have defined the two critical functions (PMF and CDF) for describing discrete random variables. We are now ready to summarize a distribution with a few key numbers.
In the next lesson, we will formally define and derive the formulas for Expected Value (the Mean), Variance, and Standard Deviation.