The metafor package provides a comprehensive collection of functions for conducting meta-analyses in R. The package can be used to calculate various effect size or outcome measures and then allows the user to fit fixed- and random-effects models to these data. By including study-level variables (‘moderators’) as predictors in these models, mixed-effects meta-regression models can also be fitted. For meta-analyses of \(2 \times 2\) tables, proportions, incidence rates, and incidence rate ratios, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear mixed-effects models (i.e., mixed-effects logistic and Poisson regression models). For non-independent effects/outcomes (e.g., due to correlated sampling errors, correlated true effects or outcomes, or other forms of clustering), the package also provides a function for fitting multilevel/multivariate meta-analytic models.

Various methods are available to assess model fit, to identify outliers and/or influential studies, and for conducting sensitivity analyses (e.g., standardized residuals, Cook's distances, leave-one-out analyses). Advanced techniques for hypothesis testing and obtaining confidence intervals (e.g., for the average effect or outcome or for the model coefficients in a meta-regression model) have also been implemented (e.g., the Knapp and Hartung method, permutation tests, cluster robust inference methods / robust variance estimation).

The package also provides functions for creating forest, funnel, radial (Galbraith), normal quantile-quantile, L'Abbé, Baujat, bubble, and GOSH plots. The presence of funnel plot asymmetry (which may be indicative of publication bias) and its potential impact on the results can be examined via the rank correlation and Egger's regression test, the trim and fill method, and by applying a variety of selection models.

The escalc Function

[escalc] Before a meta-analysis can be conducted, the relevant results from each study must be quantified in such a way that the resulting values can be further aggregated and compared. The escalc function can be used to compute a wide variety of effect size or outcome measures (and the corresponding sampling variances) that are often used in meta-analyses (e.g., risk ratios, odds ratios, risk differences, mean differences, standardized mean differences, response ratios / ratios of means, raw or r-to-z transformed correlation coefficients). Measures for quantifying some outcome for individual groups (e.g., proportions and incidence rates and transformations thereof), measures of change (e.g., raw and standardized mean changes), and measures of variability (e.g., variability ratios and coefficient of variation ratios) are also available.

The rma.uni Function

[rma.uni] The various meta-analytic models that are typically used in practice are special cases of the general linear (mixed-effects) model. The rma.uni function (with alias rma) provides a general framework for fitting such models. The function can be used in conjunction with any of the usual effect size or outcome measures used in meta-analyses (e.g., as computed using the escalc function). The notation and models underlying the rma.uni function are explained below.

For a set of \(i = 1, \ldots, k\) independent studies, let \(y_i\) denote the observed value of the effect size or outcome measure in the \(i\)th study. Let \(\theta_i\) denote the corresponding (unknown) true effect/outcome, such that \[y_i | \theta_i \sim N(\theta_i, v_i).\] In other words, the observed effect sizes or outcomes are assumed to be unbiased and normally distributed estimates of the corresponding true effects/outcomes with sampling variances equal to \(v_i\). The \(v_i\) values are assumed to be known. Depending on the outcome measure used, a bias correction, normalizing, and/or variance stabilizing transformation may be necessary to ensure that these assumptions are (approximately) true (e.g., the log transformation for odds/risk ratios, the bias correction for standardized mean differences, Fisher's r-to-z transformation for correlations; see escalc for more details).

The fixed-effects model conditions on the true effects/outcomes and therefore provides a conditional inference about the \(k\) studies included in the meta-analysis. When using weighted estimation, this implies that the fitted model provides an estimate of \[\bar{\theta}_w = \sum_{i=1}^k w_i \theta_i / \sum_{i=1}^k w_i,\] that is, the weighted average of the true effects/outcomes in the \(k\) studies, with weights equal to \(w_i = 1/v_i\) (this is what is often described as the ‘inverse-variance’ method in the meta-analytic literature). One can also employ an unweighted estimation method, which provides an estimate of the unweighted average of the true effects/outcomes in \(k\) studies, that is, an estimate of \[\bar{\theta}_u = \sum_{i=1}^k \theta_i / k.\]

For weighted estimation, one could also choose to estimate \(\bar{\theta}_w\), where the \(w_i\) values are user-defined weights (inverse-variance weights or unit weights as in unweighted estimation are just special cases). It is up to the user to decide to what extent \(\bar{\theta}_w\) is a meaningful parameter to estimate (regardless of the weights used).

The random-effects model does not condition on the true effects/outcomes. Instead, the \(k\) studies included in the meta-analysis are assumed to be a random sample from a larger population of studies. In rare cases, the studies included in a meta-analysis are actually sampled from a larger collection of studies. More typically, the population of studies is a hypothetical population of an essentially infinite set of studies comprising all of the studies that have been conducted, that could have been conducted, or that may be conducted in the future. We assume that \(\theta_i \sim N(\mu, \tau^2)\), that is, the true effects/outcomes in the population of studies are normally distributed with \(\mu\) denoting the average true effect/outcome and \(\tau^2\) the variance of the true effects/outcomes in the population (\(\tau^2\) is therefore often referred to as the amount of ‘heterogeneity’ in the true effects/outcomes). The random-effects model can also be written as \[y_i = \mu + u_i + \epsilon_i,\] where \(u_i \sim N(0, \tau^2)\) and \(\epsilon_i \sim N(0, v_i)\). The fitted model provides an estimate of \(\mu\) and \(\tau^2\). Consequently, the random-effects model provides an unconditional inference about the average true effect/outcome in the population of studies (from which the \(k\) studies included in the meta-analysis are assumed to be a random sample).

When using weighted estimation in the context of a random-effects model, the model is fitted with weights equal to \(w_i = 1/(\tau^2 + v_i)\), with \(\tau^2\) replaced by its estimate (this is the standard ‘inverse-variance’ method for random-effects models). One can also choose unweighted estimation in the context of the random-effects model or specify user-defined weights, although the parameter that is estimated (i.e., \(\mu\)) remains the same regardless of the estimation method and weights used (as opposed to the fixed-effect model, where the parameter estimated is different for weighted versus unweighted estimation or when using different weights than the standard inverse-variance weights). Since weighted estimation with inverse-variance weights is most efficient, it is usually to be preferred for random-effects models (while in the fixed-effect model case, we must carefully consider whether \(\bar{\theta}_w\) or \(\bar{\theta}_u\) is the more meaningful parameter to estimate).

Contrary to what is often stated in the literature, it is important to realize that the fixed-effects model does not assume that the true effects/outcomes are homogeneous (i.e., that \(\theta_i\) is equal to some common value \(\theta\) in all \(k\) studies). In other words, fixed-effects models provide perfectly valid inferences under heterogeneity, as long as one is restricting these inferences to the set of studies included in the meta-analysis and one realizes that the model does not provide an estimate of \(\theta\), but of \(\bar{\theta}_w\) or \(\bar{\theta}_u\) (depending on the estimation method).

In the special case that the true effects/outcomes are actually homogeneous (the equal-effects case), the distinction between fixed- and random-effects models disappears, since homogeneity implies that \(\mu = \bar{\theta}_w = \bar{\theta}_u \equiv \theta\). However, since there is no infallible method to test whether the true effects/outcomes are really homogeneous or not, a researcher should decide on the type of inference desired before examining the data and choose the model accordingly. In fact, there is nothing wrong with fitting both fixed- and random-effects models to the same data, since these models address different questions (i.e., what was the average effect/outcome in the studies that have been conducted versus what is the average effect/outcome in the larger population of studies?). For further details on the distinction between equal-, fixed-, and random-effects models, see Laird and Mosteller (1990) and Hedges and Vevea (1998).

Study-level variables (often referred to as ‘moderators’) can also be included as predictors in such models, leading to so-called ‘meta-regression’ analyses (to examine whether the effects/outcomes tend to be larger/smaller under certain conditions or circumstances). When including moderator variables in a random-effects model, we obtain a mixed-effects meta-regression model. This model can be written as \[y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_{p'} x_{ip'} + u_i + \epsilon_i,\] where \(u_i \sim N(0, \tau^2)\) and \(\epsilon_i \sim N(0, v_i)\) as before and \(x_{ij}\) denotes the value of the \(j\)th moderator variable for the \(i\)th study (letting \(p = p' + 1\) denote the total number of coefficients in the model including the model intercept). Therefore, \(\beta_j\) denotes how the average true effect/outcome changes for a one-unit increase in \(x_{ij}\) and the model intercept \(\beta_0\) denotes the average true effect/outcome when the values of all moderator variables are equal to zero. The value of \(\tau^2\) in the mixed-effects model denotes the amount of ‘residual heterogeneity’ in the true effects/outcomes (i.e., the amount of variability in the true effects/outcomes that is not accounted for by the moderators included in the model).

The rma.mh Function

[rma.mh] The Mantel-Haenszel method provides an alternative approach for fitting fixed-effects models when dealing with studies providing data in the form of \(2 \times 2\) tables or in the form of event counts (i.e., person-time data) for two groups (Mantel & Haenszel, 1959). The method is particularly advantageous when aggregating a large number of studies with small sample sizes (the so-called sparse data or increasing strata case). The Mantel-Haenszel method is implemented in the rma.mh function. It can be used in combination with risk ratios, odds ratios, risk differences, incidence rate ratios, and incidence rate differences.

The rma.peto Function

[rma.peto] Yet another method that can be used in the context of a meta-analysis of \(2 \times 2\) table data is Peto's method (see Yusuf et al., 1985), implemented in the rma.peto function. The method provides an estimate of the (log) odds ratio under a fixed-effects model. The method is particularly advantageous when the event of interest is rare, but see the documentation of the function for some caveats.

The rma.glmm Function

[rma.glmm] Dichotomous outcomes and event counts (based on which one can calculate outcome measures such as odds ratios, incidence rate ratios, proportions, and incidence rates) are often assumed to arise from binomial and Poisson distributed data. Meta-analytic models that are directly based on such distributions are implemented in the rma.glmm function. These models are essentially special cases of generalized linear mixed-effects models (i.e., mixed-effects logistic and Poisson regression models). For \(2 \times 2\) table data, a mixed-effects conditional logistic model (based on the non-central hypergeometric distribution) is also available. Random/mixed-effects models with dichotomous data are often referred to as ‘binomial-normal’ models in the meta-analytic literature. Analogously, for event count data, such models could be referred to as ‘Poisson-normal’ models.

The rma.mv Function

[rma.mv] Standard meta-analytic models assume independence between the observed effect sizes or outcomes obtained from a set of studies. This assumption is often violated in practice. Dependencies can arise for a variety of reasons. For example, the sampling errors and/or true effects/outcomes may be correlated in multiple treatment studies (e.g., when multiple treatment groups are compared with a common control/reference group, such that the data from the control/reference group is used multiple times to compute the observed effect sizes or outcomes) or in multiple endpoint studies (e.g., when more than one effect size estimate or outcome is calculated based on the same sample of subjects due to the use of multiple endpoints or response variables) (Gleser & Olkin, 2009). Correlations in the true effects/outcomes can also arise due to other forms of clustering (e.g., effects/outcomes derived from the same paper, lab, research group, or species may be more similar to each other than effects/outcomes derived from different papers, labs, research groups, or species). In ecology and related fields, shared phylogenetic history among the organisms studied (e.g., plants, fungi, animals) can also induce correlations among the effects/outcomes. The rma.mv function can be used to fit suitable meta-analytic multivariate/multilevel models to such data, so that the non-independence in the observed/true effects or outcomes is accounted for. Network meta-analyses (also called multiple/mixed treatment comparisons) can also be carried out with this function.

Future Plans and Updates

The metafor package is a work in progress and is updated on a regular basis with new functions and options. With metafor.news(), you can read the NEWS file of the package after installation. Comments, feedback, and suggestions for improvements are always welcome.

Citing the Package

To cite the package, please use the following reference:

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1--48. https://doi.org/10.18637/jss.v036.i03

Getting Started with the Package

The paper mentioned above is a good starting place for those interested in using the package. The purpose of the article is to provide a general overview of the package and its capabilities (as of version 1.4-0). Not all of the functions and options are described in the paper, but it should provide a useful introduction to the package. The paper can be freely downloaded from the URL given above or can be directly loaded with the command vignette("metafor").

In addition to reading the paper, carefully read this page and then the help pages for the escalc and the rma.uni functions (or the rma.mh, rma.peto, rma.glmm, rma.mv functions if you intend to use these methods). The help pages for these functions provide links to many additional functions, which can be used after fitting a model. You can also read the entire documentation online at https://wviechtb.github.io/metafor/reference/index.html (where it is nicely formatted, equations are shown correctly, and the output from all examples is provided).

A (pdf) diagram showing the various functions in the metafor package (and how they are related to each other) can be opened with the command vignette("diagram").

Finally, additional information about the package, several detailed analysis examples, examples of plots and figures provided by the package (with the corresponding code), some additional tips and notes, and a FAQ can be found on the package website at https://www.metafor-project.org.

Author

Wolfgang Viechtbauer wvb@metafor-project.org
package website: https://www.metafor-project.org
author homepage: https://www.wvbauer.com

Suggestions on how to obtain help with using the package can found on the package website at: https://www.metafor-project.org/doku.php/help

References

Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.) (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.

Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357--376). New York: Russell Sage Foundation.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.

Hedges, L. V., & Vevea, J. L. (1998). Fixed- and random-effects models in meta-analysis. Psychological Methods, 3(4), 486--504. https://doi.org/10.1037/1082-989X.3.4.486

Laird, N. M., & Mosteller, F. (1990). Some statistical methods for combining experimental results. International Journal of Technology Assessment in Health Care, 6(1), 5--30. https://doi.org/10.1017/S0266462300008916

Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22(4), 719--748. https://doi.org/10.1093/jnci/22.4.719

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1--48. https://doi.org/10.18637/jss.v036.i03

Yusuf, S., Peto, R., Lewis, J., Collins, R., & Sleight, P. (1985). Beta blockade during and after myocardial infarction: An overview of the randomized trials. Progress in Cardiovascular Disease, 27(5), 335--371. https://doi.org/10.1016/s0033-0620(85)80003-7