Test for Heteroscedastic Heterogeneity

Function to test if the heterogeneity in the true effects is heteroscedastic.

hettest(x, vi, sei, subset, data, method="REML",
        test="score", boot=TRUE, progbar=TRUE, digits, control, ...)

# S3 method for class 'hettest'
print(x, digits=x$digits, ...)

Arguments

x: a vector with the effect size estimates or an object of class "rma".
vi: vector with the corresponding sampling variances. Ignored if x is an object of class "rma".
sei: vector with the corresponding standard errors (note: only one of the two, vi or sei, needs to be specified).
subset: optional (logical or numeric) vector to specify the subset of studies that should be used for the calculation. Ignored if x is an object of class "rma".
data: optional data frame containing the variables given to the arguments above.
method: character string to specify whether the test should be based on maximum likelihood ("ML") or restricted maximum likelihood ("REML") estimation. The default is "REML". Ignored if x is an object of class "rma".
test: character string (case insensitive) to specify which test to run. See ‘Details’.
boot: logical to specify whether bootstrapping should be used (the default is TRUE). Can also be a number to specify the number of bootstrap samples to take.
progbar: logical to specify whether a progress bar should be shown when using bootstrapping (the default is TRUE).
digits: optional integer to specify the number of decimal places to which the printed results should be rounded.
control: optional list of control values. See ‘Note’.
...: other arguments.

Details

The standard random-effects model (that can be fitted with the rma.uni function) assumes that the heterogeneity in the true effects is homoscedastic. This function can be used to test this assumption.

Via argument x, one can either pass a vector with the effect size estimates to the function (and then the corresponding sampling variances via vi or the standard errors via sei) or an object of class "rma". When passing a model object to argument x, the model must be a standard random-effects model that was fitted with either maximum likelihood (ML) or restricted maximum likelihood (REML) estimation.

When x is a vector with the effect size estimates, then argument method can be used to select whether the test should be based on ML or REML estimation. The latter is the default. On the other hand, if x is a model object, then the estimation method that was used for fitting the model determines on which estimation method the test is based.

The test to run is determined by the test argument and can either be:

test="lrt" for a likelihood ratio test,
test="wald" for a Wald-type test,
test="score" for a score test.

The likelihood ratio test compares the fit of the standard random-effects model with a saturated random-effects model where each study has a study-specific \(\tau^2_i\) value. The Wald-type test directly tests all \(k-1\) contrasts between the estimated \(\tau^2_i\) values. Finally, the score test evaluates whether allowing study-specific \(\tau^2_i\) values would improve model fit based on the score vector and the information matrix evaluated under the standard random-effects model, without requiring estimation of the saturated model. This tends to make this test more stable, yielding better control of the Type I error rate, and therefore is the default option.

However, the assumed null distribution for all of these tests (a chi-square distribution with \(k-1\) degrees of freedom) is not correct. Therefore, when boot=TRUE (the default), the function uses parametric bootstrapping to obtain a p-value for a given test. By default, 1000 bootstrap samples are taken, but one can also specify the number of samples to take via the boot argument. When progbar=TRUE (the default), a progress bar is shown to indicate the progress of the bootstrapping.

One can also use the function to carry out several goodness-of-fit tests based on the Pearson residuals obtained from the standard random-effects model (see also the residuals functions, which can compute such residuals). If the assumptions underlying the model are correct (including the assumption that the heterogeneity is homoscedastic), then the Pearson residuals are asymptotically standard normal. Accordingly, one can use the Kolmogorov-Smirnov (KS) test or the Anderson-Darling (AD) test to test if this assumption holds. Alternatively, one can use these test to test if the squared Pearson residuals are chi-square distributed with one degree of freedom. Therefore, additional options for the test argument are:

test="ksn" to test if the Pearson residuals are standard normal using the KS test,
test="ksx2" to test if the squared Pearson residuals are chi-squared distributed with one degree of freedom using the KS test,
test="adn" to test if the Pearson residuals are standard normal using the AD test,
test="adx2" to test if the squared Pearson residuals are chi-squared distributed with one degree of freedom using the AD test.

In the present context, we cannot rely on known properties of these tests to obtain the corresponding p-values. Therefore, bootstrapping is always used for these tests to examine how extreme the observed statistics of the KS and AD tests are under their corresponding bootstrap distributions.

Value

An object of class "hettest". The object is a list containing the following components:

statistic: the test statistic.
df: the degrees of freedom.
pval: the p-value.
tau2i: the estimated \(\tau^2_i\) values from the saturated random-effects model.
se.tau2i: the corresponding standard errors (only when running a Wald-type test).
statistic.boot: the bootstrap values of the test statistic (only when boot=TRUE).
...: some additional elements/values.

The results are formatted and printed with the print function.

Note

The saturated random-effects model is fitted with an iterative algorithm, alternating between the computation of the \(\tau^2_i\) estimates and the estimate of \(\mu\). By default, the initial \(\tau^2_i\) values are set equal to the \(\tau^2\) estimate from the standard random-effects model. This process continues for a maximum of 1000 iterations until the change in all of the estimated \(\tau^2_i\) values is smaller than \(10^{-8}\) from one iteration to the next. The starting \(\tau^2_i\) values, threshold, and maximum number of iterations can be adjusted via the control argument by setting control=list(tau2i.init=values, threshold=value, maxiter=value).

When bootstrapping, the iterative algorithm must be run for each bootstrap sample. In rare cases, the algorithm may not converge. The output shows for how many bootstrap samples the calculations could be completed (this should usually be ‘1000 / 1000’).

Author

Wolfgang Viechtbauer (wvb@metafor-project.org, https://www.metafor-project.org).

Examples

### copy data from Bangert-Drowns et al. (2004) into 'dat'
dat <- dat.bangertdrowns2004

### fit a standard random-effects model
res <- rma(yi, vi, data=dat)
res
#> 
#> Random-Effects Model (k = 48; tau^2 estimator: REML)
#> 
#> tau^2 (estimated amount of total heterogeneity): 0.0499 (SE = 0.0197)
#> tau (square root of estimated tau^2 value):      0.2235
#> I^2 (total heterogeneity / total variability):   58.37%
#> H^2 (total variability / sampling variability):  2.40
#> 
#> Test for Heterogeneity:
#> Q(df = 47) = 107.1061, p-val < .0001
#> 
#> Model Results:
#> 
#> estimate      se    zval    pval   ci.lb   ci.ub      
#>   0.2219  0.0460  4.8209  <.0001  0.1317  0.3122  *** 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

### test for heteroscedastic heterogeneity
set.seed(1234)
hettest(res)
#> 
#> Test for Heteroscedastic Heterogeneity
#> 
#> Estimation method: Restricted maximum likelihood
#> Test type:         Score test
#> Bootstrapping:     Yes (1000/1000 iterations)
#> 
#> X^2(df = 47) = 73.2355, p = 0.0720
#>