The Central Limit Theorem (CLT)
The surprising and powerful idea that the average of many random things, however chaotic each one is individually, follows a predictable pattern.
The Central Limit Theorem is one of the most magical ideas in all of statistics. In essence, it states that if you take a sufficiently large number of random samples from *any* population (no matter how weirdly shaped its distribution is) and calculate the average (or mean) of each sample, the distribution of those averages will be approximately a Normal (bell-shaped) distribution.
Imagine you have a barrel full of tickets with numbers written on them. The numbers could be completely random (Uniform distribution), mostly small numbers with a few huge ones (a skewed distribution), or anything else. The CLT says that if you repeatedly:
- Pull out a handful of tickets (a sample, e.g., n=30).
- Calculate the average of that handful.
- Write down the average and put the tickets back.
...and you do this thousands of times, the histogram of the averages you wrote down will form a beautiful, clean bell curve. This is true even if the original numbers in the barrel had a completely different-looking histogram!
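The barrel experiment above can be sketched in a few lines of Python. This is a minimal simulation, not a canonical implementation: the choice of an exponential "barrel" (mostly small numbers, a few huge ones), the sample size of 30, and the 10,000 repetitions are all illustrative assumptions.

```python
import random
import statistics

# A skewed "barrel": exponential tickets (mostly small, occasionally huge).
# Assumed parameters: handful size n=30, repeated 10,000 times.
random.seed(42)
n = 30
reps = 10_000

sample_means = []
for _ in range(reps):
    handful = [random.expovariate(1.0) for _ in range(n)]  # pull n tickets
    sample_means.append(statistics.mean(handful))          # record the average

# Even though the exponential distribution is heavily right-skewed, the
# recorded averages cluster symmetrically around the population mean
# (which is 1.0 for rate lambda = 1).
print(statistics.mean(sample_means))
print(statistics.stdev(sample_means))
```

Plotting a histogram of `sample_means` (e.g., with matplotlib) shows the bell curve emerge, while a histogram of the raw exponential draws stays sharply skewed.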
Let $X_1, X_2, \dots, X_n$ be a sequence of independent and identically distributed (i.i.d.) random variables with population mean $\mu$ and finite variance $\sigma^2$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the sample mean. The CLT states that as $n \to \infty$, the distribution of the standardized sample mean approaches a standard normal distribution:

$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1)$$
This tells us two amazing things about the distribution of the sample means:
- The mean of the sample means will be the same as the original population mean ($\mu_{\bar{X}} = \mu$).
- The standard deviation of the sample means (called the "standard error") will be the original population's standard deviation divided by the square root of the sample size ($\sigma_{\bar{X}} = \sigma / \sqrt{n}$).
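Both facts can be checked numerically. The sketch below uses a uniform "barrel" of digit tickets 0 through 9 (an illustrative assumption, as are the sample size of 25 and the 20,000 repetitions) and compares the observed spread of the sample means against the predicted standard error $\sigma/\sqrt{n}$.

```python
import random
import statistics

random.seed(0)

# Uniform tickets 0..9: population mean mu = 4.5, and sigma computed exactly.
mu = 4.5
sigma = (sum((k - mu) ** 2 for k in range(10)) / 10) ** 0.5

n = 25  # tickets per handful (illustrative choice)
means = [statistics.mean(random.choices(range(10), k=n))
         for _ in range(20_000)]

predicted_se = sigma / n ** 0.5        # the CLT's standard error formula
observed_se = statistics.stdev(means)  # empirical spread of the averages

print(statistics.mean(means))  # close to mu = 4.5
print(predicted_se, observed_se)  # the two should nearly agree
```

Quadrupling `n` to 100 halves the standard error, matching the $1/\sqrt{n}$ scaling: averaging more tickets makes the average more stable, but with diminishing returns.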