Wald test
The Wald test is a statistical hypothesis test used to assess whether one or more parameters of a statistical model differ significantly from hypothesized values. It is frequently used with generalized linear models (GLMs), such as logistic regression and linear regression, to test hypotheses about model parameters.
The general form of the hypothesis being tested is:
H0: Rβ = r
Where:
- β is the vector of model parameters (coefficients).
- R is a matrix specifying linear combinations of the parameters to be tested.
- r is a vector of values to which the linear combinations are being compared.
In simpler terms, the null hypothesis states that specific coefficients, or linear combinations of coefficients, equal given values (often zero, indicating no effect).
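As an illustration of how R and r encode a hypothesis, the sketch below builds them for three common cases. The three-parameter model (an intercept b0 and two slopes b1 and b2) and all variable names are assumptions made for the example, not part of the definition above.

```python
import numpy as np

# Hypothetical model with an intercept and two slopes: beta = (b0, b1, b2).
# Each row of R is one linear restriction; r holds the hypothesized values.

# H0: b1 = 0 (a single coefficient has no effect)
R_single = np.array([[0.0, 1.0, 0.0]])
r_single = np.array([0.0])

# H0: b1 = 0 and b2 = 0 (both slopes are jointly zero)
R_joint = np.array([[0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
r_joint = np.array([0.0, 0.0])

# H0: b1 = b2, written as b1 - b2 = 0 (the two slopes are equal)
R_equal = np.array([[0.0, 1.0, -1.0]])
r_equal = np.array([0.0])

# The number of independent restrictions is the rank of R.
ranks = {name: int(np.linalg.matrix_rank(R))
         for name, R in [("single", R_single),
                         ("joint", R_joint),
                         ("equal", R_equal)]}
```

Each row of R adds one restriction, so the joint hypothesis has rank 2 while the other two have rank 1.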
The Wald test statistic is calculated using the following formula:
W = (Rβ̂ - r)ᵀ [R Cov(β̂) Rᵀ]⁻¹ (Rβ̂ - r)
Where:
- β̂ is the vector of estimated model parameters.
- Cov(β̂) is the estimated covariance matrix of the estimated parameters.
Under the null hypothesis, the Wald statistic asymptotically follows a chi-squared distribution with degrees of freedom equal to the rank of the matrix R, that is, the number of independent restrictions being tested. A large Wald statistic (and thus a small p-value) indicates evidence against the null hypothesis.
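The calculation can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy and SciPy are available. The function name wald_test and the numerical estimates are made up for the example and do not come from any particular fitted model.

```python
import numpy as np
from scipy.stats import chi2


def wald_test(beta_hat, cov_beta, R, r):
    """Return (W, df, p_value) for H0: R beta = r."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    cov_beta = np.asarray(cov_beta, dtype=float)
    R = np.atleast_2d(np.asarray(R, dtype=float))
    r = np.atleast_1d(np.asarray(r, dtype=float))

    diff = R @ beta_hat - r          # R beta_hat - r
    middle = R @ cov_beta @ R.T      # R Cov(beta_hat) R^T
    # Solve the linear system rather than forming the inverse explicitly.
    W = float(diff @ np.linalg.solve(middle, diff))
    df = int(np.linalg.matrix_rank(R))
    p_value = float(chi2.sf(W, df))  # upper-tail chi-squared probability
    return W, df, p_value


# Made-up estimates and covariance matrix, for illustration only.
beta_hat = np.array([0.5, 1.2, -0.3])
cov_beta = np.diag([0.04, 0.09, 0.01])

# H0: the second and third coefficients are both zero.
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)

W, df, p_value = wald_test(beta_hat, cov_beta, R, r)
```

With these illustrative numbers the covariance matrix is diagonal, so W reduces to 1.2²/0.09 + (-0.3)²/0.01 = 16 + 9 = 25 on 2 degrees of freedom, which gives a very small p-value.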
Key Features and Considerations:
- Asymptotic Test: The Wald test relies on asymptotic theory, meaning its validity depends on having a sufficiently large sample size. In small samples, the test may be unreliable.
- Parameterization Dependence: The results of the Wald test can be sensitive to the parameterization of the model. Different, but mathematically equivalent, formulations of the same hypothesis can lead to different p-values. This can be problematic, especially when dealing with non-linear models.
- Alternative Tests: Likelihood ratio tests and score tests are alternative hypothesis testing methods that are often considered more reliable than the Wald test, especially in situations with small sample sizes or complex models. They are generally less sensitive to parameterization; the likelihood ratio test in particular is invariant to it.
- Applications: The Wald test is commonly used to:
  - Test the significance of individual predictor variables in a regression model.
  - Test whether a set of variables jointly contribute significantly to the model.
  - Test more complex hypotheses about the relationships between variables.
- Limitations: Due to its sensitivity to parameterization and dependence on large sample sizes, the Wald test should be used with caution. It is important to consider alternative testing methods and carefully evaluate the assumptions of the test before drawing conclusions.
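The parameterization dependence noted above can be seen in a small numerical sketch. It assumes a single positive parameter theta with a made-up estimate and standard error, and it uses the delta method to approximate the standard error of log(theta). None of these values come from a real dataset. The hypotheses theta = 1 and log(theta) = 0 are mathematically equivalent, yet they produce different Wald statistics.

```python
import math


def chi2_sf_1df(w):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(w / 2.0))


# Made-up estimate and standard error, for illustration only.
theta_hat = 2.0
se_theta = 0.5

# Formulation A: H0: theta = 1
W_a = ((theta_hat - 1.0) / se_theta) ** 2

# Formulation B: H0: log(theta) = 0, the same hypothesis for theta > 0.
# Delta method: se(log(theta_hat)) is approximately se_theta / theta_hat.
se_log = se_theta / theta_hat
W_b = (math.log(theta_hat) / se_log) ** 2

p_a = chi2_sf_1df(W_a)
p_b = chi2_sf_1df(W_b)

print(f"theta = 1:      W = {W_a:.3f}, p = {p_a:.4f}")
print(f"log(theta) = 0: W = {W_b:.3f}, p = {p_b:.4f}")
```

Here formulation A gives W = 4, while formulation B gives a W of roughly 7.7, so the second formulation yields a noticeably smaller p-value for the same underlying hypothesis.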