Lesson 3.12: The Generalized LRT and Wilks' Theorem

The Neyman-Pearson Lemma gave us the 'Most Powerful' test, but only for simple hypotheses. In this final theoretical lesson, we generalize this idea to handle real-world composite hypotheses (e.g., H₀: β₁=β₂=0). We'll build the Generalized Likelihood Ratio Test (LRT) and introduce Wilks' Theorem, which provides a universal test statistic.

Part 1: The Real-World Challenge

The Neyman-Pearson Lemma is elegant, but it requires a simple alternative like $H_1: \mu = 12$. Real-world alternatives are composite, like $H_1: \mu \ne 10$, which contains an infinite number of possible values. Which one should we use in our likelihood ratio?

The 'Handcuffs' Analogy

Imagine comparing two models:

  • The Restricted Model (H₀): This is our model with "handcuffs" on. We force the null hypothesis to be true (e.g., we force $\beta_1 = 0$).
  • The Unrestricted Model (H₁): This is our model with the handcuffs off. The parameters are free to be whatever the data demands.

The Core Question: Does removing the handcuffs *significantly* improve the model's fit to the data?

Part 2: The Generalized Likelihood Ratio (GLR) Test

The solution is to compare the best possible likelihood of the restricted model to the best possible likelihood of the unrestricted model.

Definition: The GLR Statistic (Λ)

The GLR statistic is the ratio of the maximized likelihood under the null (Restricted) to the maximized likelihood over the full parameter space (Unrestricted).

$$\Lambda = \frac{\max_{\theta \in H_0} L(\theta \mid \mathbf{x})}{\max_{\theta \in H_0 \cup H_1} L(\theta \mid \mathbf{x})} = \frac{L(\hat{\theta}_R)}{L(\hat{\theta}_{UR})}$$

Interpreting the Ratio:

Since the unrestricted model has more freedom, $L(\hat{\theta}_{UR}) \ge L(\hat{\theta}_R)$, which means $0 \le \Lambda \le 1$.

  • If $\Lambda \approx 1$: The handcuffs didn't matter. The restricted model fits almost as well as the unrestricted one. This supports H₀.
  • If $\Lambda \approx 0$: The handcuffs were a major problem. The unrestricted model fits the data vastly better. This provides strong evidence against H₀.

Our decision rule is: Reject H₀ if $\Lambda$ is "too small."
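To make the ratio concrete, here is a minimal Python sketch for testing H₀: μ = 10 against a free mean, assuming a Normal likelihood with a known σ. The data, seed, and parameter values are illustrative assumptions, not part of the lesson.

```python
import numpy as np
from scipy.stats import norm

# Illustrative setup: Normal data, known sigma = 2, testing H0: mu = 10.
rng = np.random.default_rng(42)
x = rng.normal(loc=10.5, scale=2.0, size=50)  # true mean is 10.5

mu_0, sigma = 10.0, 2.0

# Restricted model: mu is handcuffed to mu_0.
loglik_R = norm.logpdf(x, loc=mu_0, scale=sigma).sum()

# Unrestricted model: the MLE of mu is the sample mean.
loglik_UR = norm.logpdf(x, loc=x.mean(), scale=sigma).sum()

# GLR statistic, computed on the log scale for numerical stability.
Lambda = np.exp(loglik_R - loglik_UR)
print(f"Lambda = {Lambda:.4f}")  # near 1 supports H0; near 0 argues against it
```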

Part 3: The Magic Bullet: Wilks' Theorem

We have a test statistic ($\Lambda$), but deriving its exact distribution is intractable in all but a few special cases. This is where a beautiful asymptotic result, **Wilks' Theorem**, comes to the rescue.

Theorem: Wilks' Theorem (1938)

For large sample sizes ($n \to \infty$), under the null hypothesis, the statistic $-2 \ln(\Lambda)$ converges in distribution to a **Chi-squared ($\chi^2$) distribution**.

$$-2 \ln(\Lambda) = -2 \left[ \ell(\hat{\theta}_{R}) - \ell(\hat{\theta}_{UR}) \right] \xrightarrow{d} \chi^2_q$$

The degrees of freedom, $q$, is the **number of independent restrictions** imposed by H₀ (e.g., for $H_0: \beta_1 = \beta_2 = 0$, $q = 2$).
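Continuing the hypothetical Normal-mean sketch from Part 2, H₀ there imposes a single restriction ($q = 1$), so $-2 \ln(\Lambda)$ is compared against a $\chi^2_1$ distribution. Again, all numbers are illustrative.

```python
import numpy as np
from scipy.stats import norm, chi2

# Same illustrative setup as the sketch in Part 2.
rng = np.random.default_rng(42)
x = rng.normal(loc=10.5, scale=2.0, size=50)
mu_0, sigma = 10.0, 2.0

loglik_R = norm.logpdf(x, loc=mu_0, scale=sigma).sum()       # handcuffs on
loglik_UR = norm.logpdf(x, loc=x.mean(), scale=sigma).sum()  # handcuffs off

# Wilks: -2 ln(Lambda) is asymptotically chi-squared with q = 1 restriction.
W = -2 * (loglik_R - loglik_UR)
p_value = chi2.sf(W, df=1)  # survival function: P(chi2_1 > W)
print(f"-2 ln(Lambda) = {W:.3f}, p-value = {p_value:.4f}")
# Reject H0 at the 5% level when W exceeds chi2.ppf(0.95, df=1), about 3.84.
```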

Why this specific transformation?

The $-2 \ln(\Lambda)$ form is clever for two reasons:

  1. It turns the difficult ratio of likelihoods into an easy subtraction of log-likelihoods.
  2. It conveniently flips our decision rule. A small $\Lambda$ (evidence against H₀) corresponds to a **large** $-2 \ln(\Lambda)$. This means our decision rule is now the standard "Reject H₀ if the test statistic is large," which is much more intuitive.

The Grand Unification: The F-test is a Disguised LRT

This reveals a profound connection between the tools we've learned.

  • The Connection: For OLS models with the assumption of Normal errors, the F-statistic is just a simple mathematical transformation of the likelihood ratio statistic, $\Lambda$. They are two different ways of measuring the exact same thing: the loss of fit from imposing a restriction.

  • Why It Matters: This shows that the F-test, which we motivated by comparing sums of squares, is exactly the likelihood ratio test in disguise for linear restrictions under Normal errors. More importantly, Wilks' Theorem shows that the LRT is **more general**. While the F-test is specific to linear models, the LRT can compare *any* nested models (linear, logistic, GARCH, etc.) for which a likelihood can be written, making it the workhorse of modern statistical model comparison. The sketch below checks the equivalence numerically.
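Below is a minimal sketch on simulated data (all coefficients and sizes are illustrative) that fits a restricted and an unrestricted OLS model and computes both statistics. With Normal errors and the error variance profiled out, $-2 \ln(\Lambda) = n \ln(\mathrm{RSS}_R / \mathrm{RSS}_{UR})$, which is a monotone transformation of the F-statistic.

```python
import numpy as np
from scipy.stats import f as f_dist, chi2

# Illustrative simulation: do two extra slopes improve on an intercept-only fit?
rng = np.random.default_rng(0)
n, q = 200, 2                              # q = 2 restrictions: beta1 = beta2 = 0
X = rng.normal(size=(n, 2))
y = 1.0 + 0.3 * X[:, 0] + rng.normal(size=n)

def rss(design, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return resid @ resid

X_R = np.ones((n, 1))                      # restricted: intercept only
X_UR = np.column_stack([np.ones(n), X])    # unrestricted: intercept + 2 slopes
rss_R, rss_UR = rss(X_R, y), rss(X_UR, y)
k = X_UR.shape[1]                          # parameters in the unrestricted model

# Classical F-statistic for the joint restriction.
F = ((rss_R - rss_UR) / q) / (rss_UR / (n - k))
p_F = f_dist.sf(F, q, n - k)

# LRT with the variance profiled out: -2 ln(Lambda) = n * ln(RSS_R / RSS_UR).
W = n * np.log(rss_R / rss_UR)
p_W = chi2.sf(W, df=q)

print(f"F = {F:.3f} (p = {p_F:.4f}); -2 ln(Lambda) = {W:.3f} (p = {p_W:.4f})")
```

At a sample size like this the two p-values should nearly coincide; in small samples the exact F-test is preferable, since Wilks' Theorem is only an asymptotic guarantee.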

What's Next? Module 3 Complete!

Congratulations! You have now completed the entire theoretical arc of **Statistical Inference & Estimation Theory**.

You've learned the properties of good estimators, the methods to build them (MoM, MLE), and the complete framework for using them to make decisions (CIs, Hypothesis Testing, p-values, and the theory of optimal tests).

You now possess the foundational knowledge to understand every statistical test and model that follows. In **Module 4**, we will put this entire framework into practice as we build and rigorously test the most important model in all of quantitative analysis: the **Linear Regression Model**.

Up Next: Let's Start Module 4: Simple Linear Regression