Lesson 2.11: Advanced Asymptotics: Slutsky's Theorem & The Delta Method

This final lesson of Module 2 equips us with the advanced tools for manipulating and extending our large-sample results. Slutsky's Theorem provides the rules for combining different types of convergence, while the Delta Method allows us to find the distribution of non-linear functions of our estimators—a critical step for analyzing financial ratios like the Sharpe Ratio.

Part 1: Slutsky's Theorem - The Algebra of Limits

We now have two powerful types of convergence:

  • Convergence in Probability ($\xrightarrow{p}$): An estimator collapses onto a single constant value (from the WLLN).
  • Convergence in Distribution ($\xrightarrow{d}$): An estimator's distribution converges to a limiting distribution (from the CLT).

But what happens when we mix them? For example, how do we find the limit of a statistic where the numerator converges in distribution and the denominator converges in probability? Slutsky's Theorem provides the simple, elegant rulebook.

Theorem: Slutsky's Theorem

Let $X_n$ and $Y_n$ be two sequences of random variables. If:

  1. $X_n$ converges in distribution to $X$ ($X_n \xrightarrow{d} X$), and
  2. $Y_n$ converges in probability to a constant $c$ ($Y_n \xrightarrow{p} c$).

Then the following are true:

  • Addition: $X_n + Y_n \xrightarrow{d} X + c$
  • Multiplication: $X_n \cdot Y_n \xrightarrow{d} X \cdot c$
  • Ratio Rule: $X_n / Y_n \xrightarrow{d} X / c$, provided $c \ne 0$

Key Application: Justifying the Asymptotic t-test

Slutsky's Theorem is the formal reason why a t-statistic behaves like a Z-statistic for large samples.

$$t_n = \frac{\bar{X}_n - \mu}{s_n / \sqrt{n}} = \underbrace{\left( \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \right)}_{X_n} \cdot \underbrace{\left( \frac{\sigma}{s_n} \right)}_{Y_n}$$

  • By the CLT, the first term $X_n \xrightarrow{d} Z \sim \mathcal{N}(0,1)$.
  • By the WLLN (plus the Continuous Mapping Theorem), the sample std. dev. $s_n \xrightarrow{p} \sigma$, so the second term $Y_n \xrightarrow{p} \sigma/\sigma = 1$.

Using Slutsky's product rule, $t_n = X_n \cdot Y_n \xrightarrow{d} Z \cdot 1 = Z$. This proves that for large $n$, the t-distribution converges to the Standard Normal.
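To see this convergence in action, here is a minimal Monte Carlo sketch in Python with NumPy (the exponential population, sample sizes, and replication count are illustrative choices, not part of the lesson). It draws many t-statistics from a skewed population and checks the two-sided tail frequency against the standard normal's 5% benchmark.

```python
import numpy as np

rng = np.random.default_rng(42)

def t_stat(sample, mu):
    """t-statistic for H0: the population mean equals mu."""
    n = len(sample)
    return (sample.mean() - mu) / (sample.std(ddof=1) / np.sqrt(n))

# Draw from a skewed, non-normal population (exponential with mean 1)
# and watch P(|t| > 1.96) approach the standard normal's 5% as n grows.
for n in (5, 30, 500):
    t_vals = np.array([t_stat(rng.exponential(1.0, n), mu=1.0)
                       for _ in range(20_000)])
    print(f"n = {n:3d}: P(|t| > 1.96) = {np.mean(np.abs(t_vals) > 1.96):.3f}")
```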

Part 2: The Delta Method - A Statistical 'Chain Rule'

Slutsky's Theorem handles ratios and sums. But what if we want to find the variance of a non-linear transformation of an estimator? This is solved by the Delta Method.

2.1 The Problem of Non-Linear Functions

Suppose we estimate the mean return ($\hat{\mu}$) and volatility ($\hat{\sigma}$). We want to find the mean and variance of a derived metric, $g(\hat{\mu})$, where $g$ is a non-linear function (e.g., $g(\hat{\mu}) = \ln(\hat{\mu})$ or $\hat{\mu}^2$).

The probability limit is easy: $\text{plim}(\ln(\hat{\mu})) = \ln(\mu)$, by the WLLN combined with the Continuous Mapping Theorem.

But what is the variance, $\text{Var}(g(\hat{\mu}))$? We need a way to approximate it using calculus.

The Core Idea: The Delta Method uses a first-order Taylor approximation (a tangent line) to estimate how the variance of an estimator is transformed when you pass it through a non-linear function. It's like a "chain rule" for propagating variance.

The Delta Method (Univariate)

If we have an estimator $\hat{\theta}$ such that $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$, and $g$ is a differentiable function with $g'(\theta) \ne 0$, then:

$$\sqrt{n}(g(\hat{\theta}) - g(\theta)) \xrightarrow{d} \mathcal{N}(0, [g'(\theta)]^2 \sigma^2)$$

In simpler terms, the asymptotic variance of our new estimator is:

$$\text{Avar}(g(\hat{\theta})) = [g'(\theta)]^2 \cdot \text{Avar}(\hat{\theta})$$
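As a quick worked example, take the log transform mentioned earlier: if $g(\theta) = \ln(\theta)$, then $g'(\theta) = 1/\theta$, so

$$\text{Avar}(\ln(\hat{\theta})) = \frac{1}{\theta^2} \cdot \sigma^2$$

The flatter the log curve is at $\theta$ (i.e., the larger $\theta$ is), the more the transformation shrinks the variance.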

2.2 Derivation (Univariate Case)

We assume $\hat{\theta}$ converges to $\theta$ and is asymptotically normal. We use the first-order Taylor approximation around $\theta$:

$$g(\hat{\theta}) \approx g(\theta) + g'(\theta) (\hat{\theta} - \theta)$$

Subtract $g(\theta)$ from both sides:

$$g(\hat{\theta}) - g(\theta) \approx g'(\theta) (\hat{\theta} - \theta)$$

Square both sides (to find the variance):

$$[g(\hat{\theta}) - g(\theta)]^2 \approx [g'(\theta)]^2 (\hat{\theta} - \theta)^2$$

Now, take the expected value of both sides:

$$\mathbb{E}\left[[g(\hat{\theta}) - g(\theta)]^2\right] \approx \mathbb{E}\left[[g'(\theta)]^2 (\hat{\theta} - \theta)^2\right]$$

Since $g'(\theta)$ is a constant when evaluated at $\theta$:

$$\text{Var}(g(\hat{\theta})) \approx [g'(\theta)]^2 \, \text{Var}(\hat{\theta})$$
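To check this numerically, the sketch below (parameter choices are illustrative) simulates the sampling distribution of $\ln(\hat{\theta})$, where $\hat{\theta}$ is the mean of exponential data. Since $\text{Var}(X) = \theta^2$ for an exponential with mean $\theta$, the delta method predicts $\text{Var}(\ln(\hat{\theta})) \approx (1/\theta)^2 \cdot \theta^2/n = 1/n$, whatever the value of $\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, reps = 500, 20_000
theta = 2.0  # mean of the exponential population, so Var(X) = theta**2

# Sampling distribution of ln(theta_hat), where theta_hat is the sample mean
theta_hat = rng.exponential(theta, size=(reps, n)).mean(axis=1)
log_theta_hat = np.log(theta_hat)

# Delta method prediction: Var(ln(theta_hat)) ≈ (1/theta)**2 * theta**2/n = 1/n
print(f"simulated variance   : {log_theta_hat.var(ddof=1):.6f}")
print(f"delta-method variance: {1 / n:.6f}")
```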

2.3 Multivariate Delta Method (General Case)

For multiple parameters $\bm{\beta} = [\beta_1, \beta_2, \dots]^T$, we use the vector form.

The Multivariate Delta Method

$$\text{Avar}(g(\hat{\bm{\beta}})) \approx \nabla g(\bm{\beta})^T \, \mathbf{\Sigma} \, \nabla g(\bm{\beta})$$

Where:

  • $\nabla g(\bm{\beta})$ is the gradient of $g$ (a $k \times 1$ vector of partial derivatives); $\nabla g(\bm{\beta})^T$ is its $1 \times k$ transpose.
  • $\mathbf{\Sigma}$ is the asymptotic covariance matrix of $\hat{\bm{\beta}}$ (the full matrix $\sigma^2(\mathbf{X}^T\mathbf{X})^{-1}$ from Lesson 2.5).
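In code, the multivariate formula is a single quadratic form. Here is a minimal sketch (the function name, gradient, and covariance values are hypothetical, for illustration only):

```python
import numpy as np

def delta_method_avar(grad: np.ndarray, Sigma: np.ndarray) -> float:
    """Asymptotic variance of g(beta_hat): grad(g)^T @ Sigma @ grad(g)."""
    return float(grad @ Sigma @ grad)

# Hypothetical example: g(b1, b2) = b1 * b2 evaluated at beta = (2, 3),
# so the gradient [dg/db1, dg/db2] = [b2, b1] = [3, 2].
grad = np.array([3.0, 2.0])
Sigma = np.array([[0.04, 0.01],   # made-up asymptotic covariance matrix
                  [0.01, 0.09]])
print(delta_method_avar(grad, Sigma))  # 0.84
```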

Part 3: The Payoff - Standard Error of the Sharpe Ratio

The Problem: Finding the Risk of a Ratio

The Sharpe Ratio is a cornerstone of performance measurement, but it's a non-linear function of two estimated parameters: the mean excess return ($\hat{\mu}$) and the volatility ($\hat{\sigma}$).

$$\widehat{SR} = g(\hat{\mu}, \hat{\sigma}) = \frac{\hat{\mu}}{\hat{\sigma}}$$

How can we possibly find the standard error of this ratio to test if it's statistically different from zero?

Answer: The **Multivariate Delta Method**. We simply upgrade the derivative $g'(\theta)$ to the gradient vector $\nabla g$ (the vector of partial derivatives).

Solving the Sharpe Ratio Variance

For the Sharpe Ratio $g(\mu, \sigma) = \mu / \sigma$:

  1. Find the gradient vector: $\nabla g = [\partial g / \partial \mu, \ \partial g / \partial \sigma]^T = [1/\sigma, \ -\mu/\sigma^2]^T$.
  2. Find the covariance matrix $\mathbf{\Sigma}$ of $(\hat{\mu}, \hat{\sigma})$. For i.i.d. normal returns, $\text{Var}(\hat{\mu}) = \sigma^2/n$, $\text{Var}(\hat{\sigma}) \approx \sigma^2/(2n)$, and the two estimators are uncorrelated.
  3. Plug these into the Delta Method formula to get the variance of the Sharpe Ratio, as in the sketch below.
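Putting the three steps together, here is a minimal sketch of a Sharpe Ratio standard error. It assumes i.i.d. normal excess returns, so $\mathbf{\Sigma}$ is diagonal as in step 2; serially correlated real-world returns would require a different $\mathbf{\Sigma}$. The function name and simulated data are illustrative, not from the lesson.

```python
import numpy as np

def sharpe_se(excess_returns: np.ndarray) -> tuple[float, float]:
    """Sharpe ratio and its delta-method standard error,
    assuming i.i.d. normal excess returns."""
    n = len(excess_returns)
    mu, sigma = excess_returns.mean(), excess_returns.std(ddof=1)
    sr = mu / sigma

    grad = np.array([1.0 / sigma, -mu / sigma**2])  # [dg/dmu, dg/dsigma]
    Sigma = np.array([[sigma**2 / n, 0.0],          # Var(mu_hat), Cov = 0
                      [0.0, sigma**2 / (2 * n)]])   # Var(sigma_hat)
    var_sr = grad @ Sigma @ grad                    # delta-method quadratic form
    return sr, np.sqrt(var_sr)

# Example on simulated monthly excess returns (10 years of data)
rng = np.random.default_rng(7)
r = rng.normal(0.01, 0.05, size=120)
sr, se = sharpe_se(r)
print(f"SR = {sr:.3f}, SE = {se:.3f}, t = {sr / se:.2f}")
```

Under these assumptions, the quadratic form collapses to the familiar closed form $\text{Var}(\widehat{SR}) \approx (1 + SR^2/2)/n$.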

This result is what allows hedge funds and researchers to publish t-statistics for their Sharpe Ratios, providing a rigorous way to assess if their performance is statistically significant.

Congratulations! You Have Completed Module 2

You have now completed a deep dive into the theoretical heart of statistics. This was a challenging but essential module.

You have met the key players—the **Normal, χ², t, and F distributions**—and you have learned the universal laws they obey in large samples—the **WLLN and the CLT**. Finally, you've acquired the advanced tools to manipulate these results with **Slutsky's Theorem and the Delta Method**.

What's Next in Your Journey?

You have all the theoretical tools. It's time to put them to work. **Module 3: Statistical Inference & Estimation Theory** is where we move from theory to the art and science of estimation. We will learn how to derive estimators (Method of Moments, MLE), how to judge them (bias, efficiency), and how to use them to test hypotheses and build confidence intervals.