misc-recs.Rd
This page documents some recommended practices when working with the metafor package (and more generally when conducting meta-analyses).
When fitting models with the rma.uni
and rma.mv
functions, use of restricted maximum likelihood (REML) estimation is generally recommended. This is also the default setting (i.e., method="REML"
). Various simulation studies have indicated that REML estimation tends to provide approximately unbiased estimates of the amount of heterogeneity (e.g., Langan et al., 2019; Veroniki et al., 2016; Viechtbauer, 2005), or more generally, of the variance components in more complex mixed-effects models (Harville, 1977).
For models fitted with the rma.uni
function, the empirical Bayes / Paule-Mandel estimators (i.e., method="EB"
/ method="PM"
), which can actually be shown to be identical to each other despite their different derivations (Viechtbauer et al., 2015), also have some favorable properties. However, these estimators do not generalize in a straightforward manner to more complex models, such as those that can be fitted with the rma.mv
function.
When fitting models with the rma.uni
function, tests of individual model coefficients and the corresponding confidence intervals are by default (i.e., when test="z"
) based on a standard normal distribution, while the omnibus test is based on a chi-square distribution. These inference methods may not perform nominally (i.e., the Type I error rate of tests and the coverage rate of confidence intervals may deviate from the chosen level), especially when the number of studies, \(k\), is low. Therefore, it is highly recommended to use the method by Hartung (1999), Sidik and Jonkman (2002), and Knapp and Hartung (2003) (the Knapp-Hartung method; also referred to as the Hartung-Knapp-Sidik-Jonkman method) by setting test="knha"
(or equivalently, test="hksj"
). Then tests of individual coefficients and confidence intervals are based on a t-distribution with \(k-p\) degrees of freedom, while the omnibus test then uses an F-distribution with \(m\) and \(k-p\) degrees of freedom (with \(m\) denoting the number of coefficients tested and \(p\) the total number of model coefficients). Various simulation studies have shown that this method works very well in providing tests and confidence intervals with close to nominal performance (e.g., Sánchez-Meca & Marín-Martínez, 2008; Viechtbauer et al., 2015).
Alternatively, one can also conduct permutation tests using the permutest
function. These also perform very well (and are, in a certain sense, ‘exact’ tests), but are computationally expensive.
For models fitted with the rma.mv
and rma.glmm
functions, the Knapp-Hartung method and permutation tests are not available. Instead, one can set test="t"
to also use t- and F-distributions for making inferences (although this does not involve the adjustment to the standard errors of the estimated model coefficients that is made as part of the Knapp-Hartung method). For rma.mv
, one should also set dfs="contain"
, which uses an improved method for approximating the degrees of freedom of the t- and F-distributions.
Note that test="z"
is the default for the rma.uni
, rma.mv
, and the rma.glmm
functions. While the improved inference methods described above should ideally be the default, changing this now would break backwards compatibility.
Many meta-analyses involve observed outcomes / effect size estimates that cannot be assumed to be independent, because some estimates were computed based on the same sample of subjects (or at least a partially overlapping set). In this case, one should compute the covariances for any pair of estimates that involve (fully or partially) overlapping subjects. Doing so is difficult, but we can often construct an approximate variance-covariance matrix (say \(V\)) of such dependent estimates. This can be done with the vcalc
function (and/or see the rcalc
function when dealing specifically with dependent correlation coefficients). We can then fit a multivariate/multilevel model to the estimates with the rma.mv
function, using \(V\) as the approximate var-cov matrix of the estimates and adding fixed and random effects to the model as deemed necessary. However, since \(V\) is often only a rough approximation (and since the random effects structure may not fully capture all dependencies in the underlying true outcomes/effects), we can then apply cluster-robust inference methods (also known as robust variance estimation) to the model. This can be done with the robust
function, which also interfaces with the improved inference methods implemented in the clubSandwich package to obtain the cluster-robust tests and confidence intervals.\(^1\) Finally, we can compute predicted outcomes (with corresponding confidence intervals) and test sets of coefficients or linear combinations thereof using the predict
and anova
functions. See Pustejovsky and Tipton (2022) for a paper describing such a workflow for various cases.
To summarize, the general workflow therefore will often consist of these steps:
# construct/approximate the var-cov matrix of dependent estimates
V <- vcalc(...)
# fit multivariate/multilevel model with appropriate fixed/random effects
res <- rma.mv(yi, V, mods = ~ ..., random = ~ ...)
# apply cluster-robust inference methods (robust variance estimation)
# note: use the improved methods from the clubSandwich package
sav <- robust(res, cluster = ..., clubSandwich = TRUE)
sav
# compute predicted outcomes (with corresponding CIs) as needed
predict(sav, ...)
# test sets of coefficients / linear combinations as needed
anova(sav, ...)
How vcalc
and rma.mv
should be used (and the clustering variable specified for robust
) will depend on the specifics of the application.
See dat.assink2016
, dat.knapp2017
, and dat.tannersmith2016
for some examples illustrating this workflow.
When fitting complex models, it is not guaranteed that all parameters of the model are identifiable (i.e., that there is a unique set of values for the parameters that maximizes the (restricted) likelihood function). For models fitted with the rma.mv
function, this pertains especially to the variance/correlation components of the model (i.e., what is specified via the random
argument). Therefore, it is strongly advised in general to do post model fitting checks to make sure that the likelihood surface around the ML/REML estimates is not flat for some combination of the parameter estimates (which would imply that the estimates are essentially arbitrary). For example, one can plot the (restricted) log-likelihood as a function of each variance/correlation component in the model to make sure that each profile plot shows a clear peak at the corresponding ML/REML estimate. The profile
function can be used for this purpose. See also Raue et al. (2009) for some further discussion of parameter identifiability and the use of profile likelihoods to check for this.
The profile
function should also be used after fitting location-scale models (Viechtbauer & López-López, 2022) with the rma.uni
function and after fitting selection models with the selmodel
function.
\(^1\) In small meta-analyses, the (denominator) degrees of freedom for the approximate t- and F-tests provided by the cluster-robust inference methods might be very low, in which case the tests may not be trustworthy and overly conservative (Joshi et al., 2022). Under these circumstances, one can consider the use of cluster wild bootstrapping (as implemented in the wildmeta package) as an alternative method for making inferences.
Hartung, J. (1999). An alternative method for meta-analysis. Biometrical Journal, 41(8), 901–916. https://doi.org/10.1002/(SICI)1521-4036(199912)41:8<901::AID-BIMJ901>3.0.CO;2-W
Harville, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistical Association, 72(358), 320–338. https://doi.org/10.2307/2286796
Joshi, M., Pustejovsky, J. E., & Beretvas, S. N. (2022). Cluster wild bootstrapping to handle dependent effect sizes in meta-analysis with a small number of studies. Research Synthesis Methods, 13(4), 457–477. https://doi.org/10.1002/jrsm.1554
Knapp, G., & Hartung, J. (2003). Improved tests for a random effects meta-regression with a single covariate. Statistics in Medicine, 22(17), 2693–2710. https://doi.org/10.1002/sim.1482
Langan, D., Higgins, J. P. T., Jackson, D., Bowden, J., Veroniki, A. A., Kontopantelis, E., Viechtbauer, W. & Simmonds, M. (2019). A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods, 10(1), 83–98. https://doi.org/10.1002/jrsm.1316
Pustejovsky, J. E. & Tipton, E. (2022). Meta-analysis with robust variance estimation: Expanding the range of working models. Prevention Science, 23, 425–438. https://doi.org/10.1007/s11121-021-01246-3
Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M., Klingmuller, U., & Timmer, J. (2009). Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 25(15), 1923–1929. https://doi.org/10.1093/bioinformatics/btp358
Sánchez-Meca, J. & Marín-Martínez, F. (2008). Confidence intervals for the overall effect size in random-effects meta-analysis. Psychological Methods, 13(1), 31–48. https://doi.org/10.1037/1082-989x.13.1.31
Sidik, K. & Jonkman, J. N. (2002). A simple confidence interval for meta-analysis. Statistics in Medicine, 21(21), 3153–3159. https://doi.org/10.1002/sim.1262
Veroniki, A. A., Jackson, D., Viechtbauer, W., Bender, R., Bowden, J., Knapp, G., Kuss, O., Higgins, J. P., Langan, D., & Salanti, G. (2016). Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods, 7(1), 55–79. https://doi.org/10.1002/jrsm.1164
Viechtbauer, W. (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30(3), 261–293. https://doi.org/10.3102/10769986030003261
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
Viechtbauer, W., López-López, J. A., Sánchez-Meca, J., & Marín-Martínez, F. (2015). A comparison of procedures to test for moderators in mixed-effects meta-regression models. Psychological Methods, 20(3), 360–374. https://doi.org/10.1037/met0000023
Viechtbauer, W., & López-López, J. A. (2022). Location-scale models for meta-analysis. Research Synthesis Methods. 13(6), 697–715. https://doi.org/10.1002/jrsm.1562