For two (nested) models of class "rma.uni" or "rma.mv", the function provides a full versus reduced model comparison in terms of model fit statistics and a likelihood ratio test. When a single model is specified, a Wald-type test of one or more model coefficients or linear combinations thereof is carried out.

# S3 method for rma
anova(object, object2, btt, X, att, Z, rhs, digits, refit=FALSE, ...)

Arguments

object

an object of class "rma.uni" or "rma.mv".

object2

an (optional) object of class "rma.uni" or "rma.mv". Only relevant when conducting a model comparison and likelihood ratio test. See ‘Details’.

btt

optional vector of indices (or list thereof) to specify which coefficients should be included in the Wald-type test. Can also be a string to grep for. See ‘Details’.

X

optional numeric vector or matrix to specify one or more linear combinations of the coefficients in the model that should be tested. See ‘Details’.

att

optional vector of indices (or list thereof) to specify which scale coefficients should be included in the Wald-type test. Can also be a string to grep for. See ‘Details’. Only relevant for location-scale models (see rma.uni).

Z

optional numeric vector or matrix to specify one or more linear combinations of the scale coefficients in the model that should be tested. See ‘Details’. Only relevant for location-scale models (see rma.uni).

rhs

optional scalar or vector of values for the right-hand side of the null hypothesis when testing a set of coefficients (via btt or att) or linear combinations thereof (via X or Z). If unspecified, this defaults to a vector of zeros of the appropriate length. See ‘Details’.

digits

optional integer to specify the number of decimal places to which the printed results should be rounded. If unspecified, the default is to take the value from the object.

refit

logical to indicate whether models fitted with REML estimation and differing in their fixed effects should be refitted with ML estimation when conducting a likelihood ratio test (the default is FALSE).

...

other arguments.

Details

The function can be used in three different ways:

  1. When a single model is specified (via argument object), the function provides a Wald-type test of one or more model coefficients, that is, \[\mbox{H}_0{:}\; \beta_{j \in \texttt{btt}} = 0,\] where \(\beta_{j \in \texttt{btt}}\) is the set of coefficients to be tested (by default whether the set of coefficients is significantly different from zero, but one can specify a different value under the null hypothesis via argument rhs).

    In particular, for equal- or random-effects models (i.e., models without moderators), this is just the test of the single coefficient of the model (i.e., \(\mbox{H}_0{:}\; \theta = 0\) or \(\mbox{H}_0{:}\; \mu = 0\)). For models including moderators, an omnibus test of all model coefficients is conducted that excludes the intercept (the first coefficient) if it is included in the model. If no intercept is included in the model, then the omnibus test includes all coefficients in the model including the first.

    Alternatively, one can manually specify the indices of the coefficients to test via the btt (‘betas to test’) argument. For example, with btt=c(3,4), only the third and fourth coefficients from the model are included in the test (if an intercept is included in the model, then it corresponds to the first coefficient in the model). Instead of specifying the coefficient numbers, one can specify a string for btt. In that case, grep will be used to search for all coefficient names that match the string (and hence, one can use regular expressions to fine-tune the search for matching strings). Using the btt argument, one can for example select all coefficients corresponding to a particular factor to test if the factor as a whole is significant. One can also specify a list of indices/strings, in which case tests of all list elements will be conducted. See ‘Examples’.

    For location-scale models fitted with the rma.uni function, one can use the att argument in an analogous manner to specify the indices of the scale coefficients to test (i.e., \(\mbox{H}_0{:}\; \alpha_{j \in \texttt{att}} = 0\), where \(\alpha_{j \in \texttt{att}}\) is the set of coefficients to be tested).

  2. When a single model is specified (via argument object), one can use the X argument\(^1\) to specify a linear combination of the coefficients in the model that should be tested using a Wald-type test, that is, \[\mbox{H}_0{:}\; X \beta = 0,\] where X is a (row) vector of the same length as there are coefficients in the model (by default whether the linear combination is significantly different from zero, but one can specify a different value under the null hypothesis via argument rhs). If a matrix of linear combinations is specified, each row defines a particular linear combination to be tested (if rhs is used, then it should either be a scalar or of the same length as the number of combinations to be tested). If the matrix is of full rank, an omnibus Wald-type test of all linear combinations is also provided. Linear combinations can also be obtained with the predict function, which provides corresponding confidence intervals.

    For location-scale models fitted with the rma.uni function, one can use the Z argument in an analogous manner to specify one or multiple linear combinations of the scale coefficients in the model that should be tested (i.e., \(\mbox{H}_0{:}\; Z \alpha = 0\)).

  3. When specifying two models for comparison (via arguments object and object2), the function provides a likelihood ratio test (LRT) comparing the two models. The two models must be based on the same set of data, must be of the same class, and should be nested for the LRT to make sense. Also, LRTs are not meaningful when using REML estimation and the two models differ in terms of their fixed effects (setting refit=TRUE automatically refits the two models using ML estimation). Also, the theory underlying LRTs is only really applicable when comparing models that were fitted with ML/REML estimation, so if some other estimation was used to fit the two models, the results should be treated with caution.


\(^1\) This argument used to be called L, but was renamed to X (but using L in place of X still works).

Value

An object of class "anova.rma". When a single model is specified (without any further arguments or together with the btt or att argument), the object is a list containing the following components:

QM

test statistic of the Wald-type test of the model coefficients.

QMdf

corresponding degrees of freedom.

QMp

corresponding p-value.

btt

indices of the coefficients tested by the Wald-type test.

k

number of outcomes included in the analysis.

p

number of coefficients in the model (including the intercept).

m

number of coefficients included in the Wald-type test.

...

some additional elements/values.

When btt or att was a list, then the object is a list of class "list.anova.rma", where each element is an "anova.rma" object as described above.

When argument X is used, the object is a list containing the following components:

QM

test statistic of the omnibus Wald-type test of all linear combinations.

QMdf

corresponding degrees of freedom.

QMp

corresponding p-value.

hyp

description of the linear combinations tested.

Xb

values of the linear combinations.

se

standard errors of the linear combinations.

zval

test statistics of the linear combinations.

pval

corresponding p-values.

When two models are specified, the object is a list containing the following components:

fit.stats.f

log-likelihood, deviance, AIC, BIC, and AICc for the full model.

fit.stats.r

log-likelihood, deviance, AIC, BIC, and AICc for the reduced model.

parms.f

number of parameters in the full model.

parms.r

number of parameters in the reduced model.

LRT

likelihood ratio test statistic.

pval

corresponding p-value.

QE.f

test statistic of the test for (residual) heterogeneity from the full model.

QE.r

test statistic of the test for (residual) heterogeneity from the reduced model.

tau2.f

estimated \(\tau^2\) value from the full model. NA for "rma.mv" objects.

tau2.r

estimated \(\tau^2\) value from the reduced model. NA for "rma.mv" objects.

R2

amount (in percent) of the heterogeneity in the reduced model that is accounted for in the full model (NA for "rma.mv" objects). This can be regarded as a pseudo \(R^2\) statistic (Raudenbush, 2009). Note that the value may not be very accurate unless \(k\) is large (Lopez-Lopez et al., 2014).

...

some additional elements/values.

The results are formatted and printed with the print function. To format the results as a data frame, one can use the as.data.frame function.

Note

The function can also be used to conduct a likelihood ratio test (LRT) for the amount of (residual) heterogeneity in random- and mixed-effects models. The full model should then be fitted with either method="ML" or method="REML" and the reduced model with method="EE" (or with tau2=0). The p-value for the test is based on a chi-square distribution with 1 degree of freedom, but actually needs to be adjusted for the fact that the parameter (i.e., \(\tau^2\)) falls on the boundary of the parameter space under the null hypothesis (see Viechtbauer, 2007, for more details).

LRTs for variance components in more complex models (as fitted with the rma.mv function) can also be conducted in this manner (see ‘Examples’).

References

Hardy, R. J., & Thompson, S. G. (1996). A likelihood approach to meta-analysis with random effects. Statistics in Medicine, 15(6), 619–629. https://doi.org/10.1002/(sici)1097-0258(19960330)15:6%3C619::aid-sim188%3E3.0.co;2-a

Huizenga, H. M., Visser, I., & Dolan, C. V. (2011). Testing overall and moderator effects in random effects meta-regression. British Journal of Mathematical and Statistical Psychology, 64(1), 1–19. https://doi.org/10.1348/000711010X522687

López-López, J. A., Marín-Martínez, F., Sánchez-Meca, J., Van den Noortgate, W., & Viechtbauer, W. (2014). Estimation of the predictive power of the model in mixed-effects meta-regression: A simulation study. British Journal of Mathematical and Statistical Psychology, 67(1), 30–48. https://doi.org/10.1111/bmsp.12002

Raudenbush, S. W. (2009). Analyzing effect sizes: Random effects models. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 295–315). New York: Russell Sage Foundation.

Viechtbauer, W. (2007). Hypothesis tests for population heterogeneity in meta-analysis. British Journal of Mathematical and Statistical Psychology, 60(1), 29–60. https://doi.org/10.1348/000711005X64042

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03

Viechtbauer, W., & López-López, J. A. (2022). Location-scale models for meta-analysis. Research Synthesis Methods. 13(6), 697–715. https://doi.org/10.1002/jrsm.1562

See also

rma.uni and rma.mv for functions to fit models for which likelihood ratio and Wald-type tests can be conducted.

print for the print method and as.data.frame for the method to format the results as a data frame.

Examples

### calculate log risk ratios and corresponding sampling variances
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)

### fit random-effects model
res1 <- rma(yi, vi, data=dat, method="ML")
res1
#> 
#> Random-Effects Model (k = 13; tau^2 estimator: ML)
#> 
#> tau^2 (estimated amount of total heterogeneity): 0.2800 (SE = 0.1443)
#> tau (square root of estimated tau^2 value):      0.5292
#> I^2 (total heterogeneity / total variability):   91.38%
#> H^2 (total variability / sampling variability):  11.60
#> 
#> Test for Heterogeneity:
#> Q(df = 12) = 152.2330, p-val < .0001
#> 
#> Model Results:
#> 
#> estimate      se     zval    pval    ci.lb    ci.ub      
#>  -0.7112  0.1719  -4.1374  <.0001  -1.0481  -0.3743  *** 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

### fit mixed-effects model with two moderators (absolute latitude and publication year)
res2 <- rma(yi, vi, mods = ~ ablat + year, data=dat, method="ML")
res2
#> 
#> Mixed-Effects Model (k = 13; tau^2 estimator: ML)
#> 
#> tau^2 (estimated amount of residual heterogeneity):     0.0269 (SE = 0.0236)
#> tau (square root of estimated tau^2 value):             0.1640
#> I^2 (residual heterogeneity / unaccounted variability): 38.41%
#> H^2 (unaccounted variability / sampling variability):   1.62
#> R^2 (amount of heterogeneity accounted for):            90.39%
#> 
#> Test for Residual Heterogeneity:
#> QE(df = 10) = 28.3251, p-val = 0.0016
#> 
#> Test of Moderators (coefficients 2:3):
#> QM(df = 2) = 33.9192, p-val < .0001
#> 
#> Model Results:
#> 
#>          estimate       se     zval    pval     ci.lb    ci.ub      
#> intrcpt    6.6048  18.1960   0.3630  0.7166  -29.0587  42.2682      
#> ablat     -0.0308   0.0063  -4.9057  <.0001   -0.0432  -0.0185  *** 
#> year      -0.0032   0.0092  -0.3471  0.7285   -0.0212   0.0148      
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

### Wald-type test of the two moderators
anova(res2)
#> 
#> Test of Moderators (coefficients 2:3):
#> QM(df = 2) = 33.9192, p-val < .0001
#> 

### alternative way of specifying the same test
anova(res2, X=rbind(c(0,1,0), c(0,0,1)))
#> 
#> Hypotheses:             
#> 1: ablat = 0 
#> 2:  year = 0 
#> 
#> Results:
#>    estimate     se    zval   pval 
#> 1:  -0.0308 0.0063 -4.9057 <.0001 
#> 2:  -0.0032 0.0092 -0.3471 0.7285 
#> 
#> Omnibus Test of Hypotheses:
#> QM(df = 2) = 33.9192, p-val < .0001
#> 

### corresponding likelihood ratio test
anova(res1, res2)
#> 
#>         df     AIC     BIC    AICc   logLik     LRT   pval       QE  tau^2      R^2 
#> Full     4 23.2922 25.5520 28.2922  -7.6461                 28.3251 0.0269          
#> Reduced  2 29.3302 30.4601 30.5302 -12.6651 10.0379 0.0066 152.2330 0.2800 90.3948% 
#> 

### Wald-type test of a linear combination
anova(res2, X=c(1,35,1970))
#> 
#> Hypothesis:                                      
#> 1: intrcpt + 35*ablat + 1970*year = 0 
#> 
#> Results:
#>    estimate     se    zval   pval 
#> 1:  -0.7533 0.0837 -9.0039 <.0001 
#> 
#> Test of Hypothesis:
#> QM(df = 1) = 81.0700, p-val < .0001
#> 

### use predict() to obtain the same linear combination (with its CI)
predict(res2, newmods=c(35,1970))
#> 
#>     pred     se   ci.lb   ci.ub   pi.lb   pi.ub 
#>  -0.7533 0.0837 -0.9173 -0.5893 -1.1142 -0.3925 
#> 

### mixed-effects model with three moderators
res3 <- rma(yi, vi, mods = ~ ablat + year + alloc, data=dat, method="ML")
res3
#> 
#> Mixed-Effects Model (k = 13; tau^2 estimator: ML)
#> 
#> tau^2 (estimated amount of residual heterogeneity):     0.0329 (SE = 0.0272)
#> tau (square root of estimated tau^2 value):             0.1814
#> I^2 (residual heterogeneity / unaccounted variability): 33.24%
#> H^2 (unaccounted variability / sampling variability):   1.50
#> R^2 (amount of heterogeneity accounted for):            88.24%
#> 
#> Test for Residual Heterogeneity:
#> QE(df = 8) = 26.2030, p-val = 0.0010
#> 
#> Test of Moderators (coefficients 2:5):
#> QM(df = 4) = 31.2609, p-val < .0001
#> 
#> Model Results:
#> 
#>                  estimate       se     zval    pval     ci.lb    ci.ub      
#> intrcpt          -10.0824  23.9973  -0.4201  0.6744  -57.1162  36.9514      
#> ablat             -0.0277   0.0071  -3.9298  <.0001   -0.0416  -0.0139  *** 
#> year               0.0053   0.0122   0.4371  0.6620   -0.0185   0.0292      
#> allocrandom       -0.2826   0.2540  -1.1127  0.2658   -0.7804   0.2152      
#> allocsystematic   -0.1104   0.2551  -0.4330  0.6650   -0.6104   0.3895      
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

### Wald-type test of the 'alloc' factor
anova(res3, btt=4:5)
#> 
#> Test of Moderators (coefficients 4:5):
#> QM(df = 2) = 1.4674, p-val = 0.4801
#> 

### instead of specifying the coefficient numbers, grep for "alloc"
anova(res3, btt="alloc")
#> 
#> Test of Moderators (coefficients 4:5):
#> QM(df = 2) = 1.4674, p-val = 0.4801
#> 

### specify a list for the 'btt' argument
anova(res3, btt=list(2,3,4:5))
#> 
#>   coefs      QM df   pval 
#> 1     2 15.4434  1 <.0001 
#> 2     3  0.1911  1 0.6620 
#> 3   4:5  1.4674  2 0.4801 
#> 

############################################################################

### an example of doing LRTs of variance components in more complex models
dat <- dat.konstantopoulos2011
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

### likelihood ratio test of the district-level variance component
res0 <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat, sigma2=c(0,NA))
anova(res, res0)
#> 
#>         df     AIC     BIC    AICc   logLik     LRT   pval       QE 
#> Full     3 21.9174 27.9394 22.3880  -7.9587                578.8640 
#> Reduced  2 37.6910 41.7057 37.9218 -16.8455 17.7736 <.0001 578.8640 
#> 

### likelihood ratio test of the school-level variance component
res0 <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat, sigma2=c(NA,0))
anova(res, res0)
#> 
#>         df     AIC     BIC    AICc   logLik     LRT   pval       QE 
#> Full     3 21.9174 27.9394 22.3880  -7.9587                578.8640 
#> Reduced  2 68.4330 72.4476 68.6637 -32.2165 48.5155 <.0001 578.8640 
#> 

### likelihood ratio test of both variance components simultaneously
res0 <- rma.mv(yi, vi, data=dat)
anova(res, res0)
#> 
#>         df      AIC      BIC     AICc    logLik      LRT   pval       QE 
#> Full     3  21.9174  27.9394  22.3880   -7.9587                 578.8640 
#> Reduced  1 439.7236 441.7310 439.7991 -218.8618 421.8062 <.0001 578.8640 
#> 

############################################################################

### an example illustrating a workflow involving cluster-robust inference
dat <- dat.assink2016

### assume that the effect sizes within studies are correlated with rho=0.6
V <- vcalc(vi, cluster=study, obs=esid, data=dat, rho=0.6)

### fit multilevel model using this approximate V matrix
res <- rma.mv(yi, V, random = ~ 1 | study/esid, data=dat)
res
#> 
#> Multivariate Meta-Analysis Model (k = 100; method: REML)
#> 
#> Variance Components:
#> 
#>             estim    sqrt  nlvls  fixed      factor 
#> sigma^2.1  0.0807  0.2841     17     no       study 
#> sigma^2.2  0.1545  0.3931    100     no  study/esid 
#> 
#> Test for Heterogeneity:
#> Q(df = 99) = 745.4385, p-val < .0001
#> 
#> Model Results:
#> 
#> estimate      se    zval    pval   ci.lb   ci.ub      
#>   0.3678  0.0965  3.8097  0.0001  0.1786  0.5570  *** 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

### likelihood ratio tests of the two variance components
res0 <- rma.mv(yi, V, random = ~ 1 | study/esid, data=dat, sigma2=c(0,NA))
anova(res, res0)
#> 
#>         df      AIC      BIC     AICc   logLik    LRT   pval       QE 
#> Full     3 151.5334 159.3188 151.7861 -72.7667               745.4385 
#> Reduced  2 157.3625 162.5528 157.4875 -76.6813 7.8291 0.0051 745.4385 
#> 
res0 <- rma.mv(yi, V, random = ~ 1 | study/esid, data=dat, sigma2=c(NA,0))
anova(res, res0)
#> 
#>         df      AIC      BIC     AICc    logLik      LRT   pval       QE 
#> Full     3 151.5334 159.3188 151.7861  -72.7667                 745.4385 
#> Reduced  2 528.5630 533.7532 528.6880 -262.2815 379.0295 <.0001 745.4385 
#> 

### use cluster-robust methods for inferences about the fixed effects
sav <- robust(res, cluster=study, clubSandwich=TRUE)
sav
#> 
#> Multivariate Meta-Analysis Model (k = 100; method: REML)
#> 
#> Variance Components:
#> 
#>             estim    sqrt  nlvls  fixed      factor 
#> sigma^2.1  0.0807  0.2841     17     no       study 
#> sigma^2.2  0.1545  0.3931    100     no  study/esid 
#> 
#> Test for Heterogeneity:
#> Q(df = 99) = 745.4385, p-val < .0001
#> 
#> Number of estimates:   100
#> Number of clusters:    17
#> Estimates per cluster: 1-22 (mean: 5.88, median: 5)
#> 
#> Model Results:
#> 
#> estimate      se¹    tval¹     df¹    pval¹   ci.lb¹   ci.ub¹     
#>   0.3678  0.0970   3.7924   14.53   0.0019   0.1605   0.5750   ** 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> 1) results based on cluster-robust inference (var-cov estimator: CR2,
#>    approx t-test and confidence interval, df: Satterthwaite approx)
#> 

### examine if 'deltype' is a potential moderator
res <- rma.mv(yi, V, mods = ~ deltype, random = ~ 1 | study/esid, data=dat)
sav <- robust(res, cluster=study, clubSandwich=TRUE)
sav
#> 
#> Multivariate Meta-Analysis Model (k = 100; method: REML)
#> 
#> Variance Components:
#> 
#>             estim    sqrt  nlvls  fixed      factor 
#> sigma^2.1  0.0858  0.2929     17     no       study 
#> sigma^2.2  0.1292  0.3595    100     no  study/esid 
#> 
#> Test for Residual Heterogeneity:
#> QE(df = 97) = 639.0911, p-val < .0001
#> 
#> Number of estimates:   100
#> Number of clusters:    17
#> Estimates per cluster: 1-22 (mean: 5.88, median: 5)
#> 
#> Test of Moderators (coefficients 2:3):¹
#> F(df1 = 2, df2 = 0.88) = 66.6394, p-val = 0.1100
#> 
#> Model Results:
#> 
#>                 estimate      se¹     tval¹    df¹    pval¹    ci.lb¹    ci.ub¹     
#> intrcpt          -0.2902  0.0823   -3.5240    4.2   0.0225   -0.5146   -0.0658    * 
#> deltypegeneral    0.7062  0.0418   16.8854   2.32   0.0018    0.5479    0.8645   ** 
#> deltypeovert      0.4500  0.0665    6.7665    2.1   0.0186    0.1766    0.7235    * 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> 1) results based on cluster-robust inference (var-cov estimator: CR2,
#>    approx t/F-tests and confidence intervals, df: Satterthwaite approx)
#> 

### note: the (denominator) dfs for the omnibus F-test are very low, so the results
### of this test may not be trustworthy; consider using cluster wild bootstrapping
library(wildmeta)
Wald_test_cwb(res, constraints=constrain_zero(2:3), R=1000, seed=1234)
#>   Test Adjustment CR_type Statistic    R p_val
#> 1  CWB        CR0     CR0   Naive-F 1000 0.347