The metafor package provides a comprehensive collection of functions for conducting meta-analyses in R. The package includes functions for calculating various effect size or outcome measures frequently used in meta-analyses (e.g., risk differences, risk ratios, odds ratios, standardized mean differences, Fisher's r-to-z-transformed correlation coefficients) and then allows the user to fit fixed-, random-, and mixed-effects models to these data. By including study-level covariates (‘moderators’) in these models, so-called ‘meta-regression’ analyses can be carried out. For meta-analyses of \(2 \times 2\) tables, proportions, incidence rates, and incidence rate ratios, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects (conditional) logistic and Poisson regression models). For non-independent effect sizes or outcomes (e.g., due to correlated sampling errors, correlated true effects or outcomes, or other forms of clustering), the package also provides a function for fitting multilevel/multivariate meta-analytic models.

Various methods are available to assess model fit, to identify outliers and/or influential studies, and to conduct sensitivity analyses (e.g., standardized residuals, Cook's distances, leave-one-out analyses). Advanced techniques for conducting hypothesis tests and obtaining confidence intervals (e.g., for the average effect size or for the model coefficients in a meta-regression model) have also been implemented (e.g., the Knapp and Hartung method, permutation tests).
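As a rough illustration of what a leave-one-out analysis does, the following base R sketch recomputes an inverse-variance weighted average with each study removed in turn. The effect sizes and sampling variances are made-up values, and the helper iv_average is a hypothetical name introduced only for this example; it is not a function of the package.

```r
# Made-up effect sizes and sampling variances for six hypothetical studies.
yi <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)

# Hypothetical helper: inverse-variance weighted average of a set of estimates.
iv_average <- function(y, v) sum(y / v) / sum(1 / v)

# Estimate based on all studies.
full <- iv_average(yi, vi)

# Re-estimate the average with each study left out in turn.
loo <- sapply(seq_along(yi), function(i) iv_average(yi[-i], vi[-i]))

# Show how much the estimate changes when each study is removed.
print(round(data.frame(left_out = seq_along(yi),
                       estimate = loo,
                       change   = loo - full), 3))
```

A study whose removal shifts the estimate noticeably is a candidate for closer inspection.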

The package also provides functions for creating forest, funnel, radial (Galbraith), normal quantile-quantile, L'Abbé, Baujat, and GOSH plots. The presence of funnel plot asymmetry (which may be indicative of publication bias) and its potential impact on the results can be examined via the rank correlation test and Egger's regression test and by applying the trim and fill method.
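The idea behind a regression test for funnel plot asymmetry can be sketched in base R. The example below uses made-up data and the classical form of Egger's regression (standardized effects regressed on precision); it is an illustration of the principle and not necessarily the exact variant implemented in the package.

```r
# Made-up effect sizes and sampling variances for six hypothetical studies.
yi  <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi  <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)
sei <- sqrt(vi)

# Classical form: regress the standardized effects on precision.
# An intercept that differs clearly from zero suggests asymmetry.
egger <- lm(I(yi / sei) ~ I(1 / sei))
print(summary(egger)$coefficients)

# Equivalent weighted form: regress the effects on their standard errors
# using inverse-variance weights. The slope for 'sei' equals the intercept
# of the classical form.
egger_w <- lm(yi ~ sei, weights = 1 / vi)
print(summary(egger_w)$coefficients)
```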

The rma.uni Function

The various meta-analytic models that are typically used in practice are special cases of the general linear (mixed-effects) model. The rma.uni function (with alias rma) provides a general framework for fitting such models. The function can be used in conjunction with any of the usual effect size or outcome measures used in meta-analyses (e.g., log risk ratios, log odds ratios, risk differences, mean differences, standardized mean differences, raw correlation coefficients, correlation coefficients transformed with Fisher's r-to-z transformation, and so on). For details on these effect size or outcome measures, see the documentation of the escalc function. The notation and models underlying the rma.uni function are explained below.

For a set of \(i = 1, \ldots, k\) independent studies, let \(y_i\) denote the observed value of the effect size or outcome measure in the \(i\)th study. Let \(\theta_i\) denote the corresponding (unknown) true effect or outcome, such that $$y_i | \theta_i \sim N(\theta_i, v_i).$$ In other words, the observed effects or outcomes are assumed to be unbiased and normally distributed estimates of the corresponding true effects or outcomes with sampling variances equal to \(v_i\). The \(v_i\) values are assumed to be known. Depending on the outcome measure used, a bias correction, normalizing, and/or variance stabilizing transformation may be necessary to ensure that these assumptions are (approximately) true (e.g., the log transformation for odds ratios, the bias correction for standardized mean differences, Fisher's r-to-z transformation for correlations; see escalc for more details).
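To make the roles of \(y_i\) and \(v_i\) concrete, the following base R sketch computes log odds ratios and their large-sample sampling variances from \(2 \times 2\) table counts. The counts are made-up values for three hypothetical studies; in practice, the escalc function handles this step.

```r
# Made-up 2x2 table counts for three hypothetical studies.
ai <- c(12, 30, 8)    # events, treatment group
bi <- c(88, 170, 42)  # non-events, treatment group
ci <- c(20, 45, 15)   # events, control group
di <- c(80, 155, 35)  # non-events, control group

# Observed outcomes: log odds ratios. The log transformation brings the
# sampling distribution closer to normality than the raw odds ratio.
yi <- log((ai * di) / (bi * ci))

# Large-sample estimates of the corresponding sampling variances.
vi <- 1 / ai + 1 / bi + 1 / ci + 1 / di

print(round(data.frame(yi = yi, vi = vi), 4))
```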

The fixed-effects model conditions on the true effects/outcomes and therefore provides a conditional inference about the set of \(k\) studies included in the meta-analysis. When using weighted estimation, this implies that the fitted model provides an estimate of $$\bar{\theta}_w = \sum w_i \theta_i / \sum w_i,$$ that is, the weighted average of the true effects/outcomes in the set of \(k\) studies, with weights equal to \(w_i = 1/v_i\) (this is what is often described as the ‘inverse-variance’ method in the meta-analytic literature). One can also employ an unweighted estimation method, which provides an estimate of the unweighted average of the true effects/outcomes in the set of \(k\) studies, that is, an estimate of $$\bar{\theta}_u = \sum \theta_i / k.$$

For weighted estimation, one could also choose to estimate \(\bar{\theta}_w\), where the \(w_i\) values are user-defined weights (inverse-variance weights or unity weights as in unweighted estimation are just special cases). It is up to the user to decide to what extent \(\bar{\theta}_w\) is a meaningful parameter to estimate (regardless of the weights used).
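The weighted, unweighted, and user-weighted estimates described above can be sketched in a few lines of base R. The data are made-up values, and weighted_estimate is a hypothetical helper written for this illustration (it assumes independent estimates with known sampling variances); it is not part of the package.

```r
# Made-up effect sizes and sampling variances for six hypothetical studies.
yi <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)

# Hypothetical helper: weighted average of the observed outcomes and its
# standard error for arbitrary (fixed) weights wi.
weighted_estimate <- function(yi, vi, wi) {
  est <- sum(wi * yi) / sum(wi)
  se  <- sqrt(sum(wi^2 * vi)) / sum(wi)
  c(estimate = est, se = se)
}

# Inverse-variance weights: estimates the weighted average of the true effects.
iv <- weighted_estimate(yi, vi, wi = 1 / vi)

# Unity weights: estimates the unweighted average of the true effects.
uw <- weighted_estimate(yi, vi, wi = rep(1, length(yi)))

# Arbitrary user-defined weights (made-up values).
ud <- weighted_estimate(yi, vi, wi = c(1, 2, 1, 3, 1, 2))

print(round(rbind(inverse_variance = iv, unweighted = uw, user_defined = ud), 4))
```

With inverse-variance weights, the standard error simplifies to \(\sqrt{1 / \sum w_i}\).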

Moderators can be included in the fixed-effects model, yielding a fixed-effects with moderators model. Again, since the model conditions on the set of \(k\) studies included in the meta-analysis, the regression coefficients from the fitted model estimate the weighted relationship (in the least squares sense) between the true effects/outcomes and the moderator variables within the set of \(k\) studies included in the meta-analysis (again using weights equal to \(w_i = 1/v_i\)). The unweighted relationship between the true effects/outcomes and the moderator variables can be estimated when using the unweighted estimation method. Again, user-defined weights could also be used.
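A fixed-effects with moderators model amounts to weighted least squares with known sampling variances. The base R sketch below shows the matrix computations for a single made-up moderator xi; all data values are hypothetical, and the rma.uni function should be used for actual analyses.

```r
# Made-up effect sizes, sampling variances, and one hypothetical moderator.
yi <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)
xi <- c(5, 20, 25, 50, 35, 10)

# Model matrix with an intercept and the moderator.
X <- cbind(intercept = 1, x = xi)

# Inverse-variance weight matrix.
W <- diag(1 / vi)

# Weighted least squares estimates of the coefficients.
vb <- solve(t(X) %*% W %*% X)   # variance-covariance matrix of the estimates
b  <- vb %*% t(X) %*% W %*% yi  # coefficient estimates
se <- sqrt(diag(vb))            # standard errors (sampling variances treated as known)

print(round(cbind(estimate = as.vector(b), se = se), 4))
```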

The random-effects model does not condition on the true effects/outcomes. Instead, the \(k\) studies included in the meta-analysis are assumed to be a random selection from a larger population of studies. In rare cases, the studies included in a meta-analysis are actually sampled from a larger collection of studies. More typically, the population of studies is a hypothetical population of an essentially infinite set of studies comprising all of the studies that have been conducted, that could have been conducted, or that may be conducted in the future. We assume that \(\theta_i \sim N(\mu, \tau^2),\) that is, the true effects/outcomes in the population of studies are normally distributed with \(\mu\) denoting the average true effect/outcome and \(\tau^2\) denoting the variance of the true effects/outcomes in the population (\(\tau^2\) is therefore often referred to as the ‘amount of heterogeneity’ in the true effects/outcomes). The random-effects model can therefore also be written as $$y_i = \mu + u_i + e_i,$$ where \(u_i \sim N(0, \tau^2)\) and \(e_i \sim N(0, v_i)\). The fitted model provides an estimate of \(\mu\) and \(\tau^2\). Consequently, the random-effects model provides an unconditional inference about the average true effect/outcome in the population of studies (from which the \(k\) studies included in the meta-analysis are assumed to be a random selection).
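The following base R sketch fits a random-effects model to made-up data. It uses the DerSimonian-Laird (method-of-moments) estimator of \(\tau^2\), chosen here only because it has a simple closed form, and then estimates \(\mu\) with weights \(1/(\hat{\tau}^2 + v_i)\). Other estimators of \(\tau^2\) exist, and this sketch is not meant to reproduce the default settings of the rma.uni function.

```r
# Made-up effect sizes and sampling variances for six hypothetical studies.
yi <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)
k  <- length(yi)

# Inverse-variance weighted average and the Q statistic.
wi      <- 1 / vi
theta_w <- sum(wi * yi) / sum(wi)
Q       <- sum(wi * (yi - theta_w)^2)

# DerSimonian-Laird estimate of tau^2, truncated at zero.
tau2 <- max(0, (Q - (k - 1)) / (sum(wi) - sum(wi^2) / sum(wi)))

# Estimate of mu using weights 1 / (tau^2 + v_i).
wi_star <- 1 / (tau2 + vi)
mu_hat  <- sum(wi_star * yi) / sum(wi_star)
se_mu   <- sqrt(1 / sum(wi_star))

print(round(c(Q = Q, tau2 = tau2, mu = mu_hat, se = se_mu), 4))
```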

When including moderator variables in the random-effects model, we obtain what is typically called a mixed-effects model in the meta-analytic literature. Such a meta-regression model can also be written as $$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + u_i + e_i,$$ where \(u_i \sim N(0, \tau^2)\) and \(e_i \sim N(0, v_i)\) as before, \(x_{i1}\) denotes the value of the first moderator variable for the \(i\)th study, \(x_{i2}\) denotes the value of the second moderator variable for the \(i\)th study, and so on (letting \(p\) denote the total number of coefficients in the model, including the model intercept if it is included). Therefore, \(\beta_1\) denotes how the average true effect/outcome changes for a one unit increase in \(x_{i1}\), \(\beta_2\) denotes how the average true effect/outcome changes for a one unit increase in \(x_{i2}\), and so on, and the model intercept \(\beta_0\) denotes the average true effect/outcome when the values of all moderator variables are equal to zero. The coefficients from the fitted model therefore estimate the relationship between the average true effect/outcome in the population of studies and the moderator variables included in the model. The value of \(\tau^2\) in the mixed-effects model denotes the ‘amount of residual heterogeneity’ in the true effects/outcomes (i.e., the amount of variability in the true effects/outcomes that is not accounted for by the moderators included in the model).
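A mixed-effects meta-regression model can be sketched in base R along the same lines. The example below uses made-up data with one hypothetical moderator, estimates the residual heterogeneity with a method-of-moments estimator (a generalization of the DerSimonian-Laird estimator, used here as an assumption for simplicity), and then obtains the coefficients by weighted least squares.

```r
# Made-up effect sizes, sampling variances, and one hypothetical moderator.
yi <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
vi <- c(0.03, 0.02, 0.05, 0.01, 0.06, 0.04)
xi <- c(5, 20, 25, 50, 35, 10)
k  <- length(yi)

X <- cbind(intercept = 1, x = xi)
p <- ncol(X)

# Residual heterogeneity statistic QE under inverse-variance weights.
W  <- diag(1 / vi)
P  <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
QE <- as.numeric(t(yi) %*% P %*% yi)

# Method-of-moments estimate of the residual tau^2, truncated at zero.
tau2 <- max(0, (QE - (k - p)) / sum(diag(P)))

# Coefficients estimated with weights 1 / (tau^2 + v_i).
Wstar <- diag(1 / (vi + tau2))
vb    <- solve(t(X) %*% Wstar %*% X)
b     <- vb %*% t(X) %*% Wstar %*% yi

print(round(c(QE = QE, tau2 = tau2), 4))
print(round(cbind(estimate = as.vector(b), se = sqrt(diag(vb))), 4))
```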

When using weighted estimation in the context of a random-effects model, the model is fitted with weights equal to \(w_i = 1/(\tau^2 + v_i)\), with \(\tau^2\) replaced by its estimate (again, this is the standard ‘inverse-variance’ method for random-effects models). One can also choose unweighted estimation in the context of the random-effects model and even any user-defined weights, although the parameter that is estimated (i.e., \(\mu\)) remains the same regardless of the estimation method and weights used (as opposed to the fixed-effects model case, where the parameter estimated is different for weighted versus unweighted estimation or when using different weights than the standard inverse-variance weights). Since weighted estimation with inverse-variance weights is most efficient, it is usually to be preferred for random-effects models (while in the fixed-effects model case, we must carefully consider whether \(\bar{\theta}_w\) or \(\bar{\theta}_u\) is the more meaningful parameter to estimate). The same principle applies to mixed-effects (versus fixed-effects with moderators) models.

Contrary to what is often stated in the literature, it is important to realize that the fixed-effects model does not assume that the true effects/outcomes are homogeneous (i.e., that \(\theta_i\) is equal to some common value \(\theta\) in all \(k\) studies). In other words, fixed-effects models provide perfectly valid inferences under heterogeneity, as long as one is restricting these inferences to the set of studies included in the meta-analysis and one realizes that the model does not provide an estimate of \(\theta\), but of \(\bar{\theta}_w\) or \(\bar{\theta}_u\).

In the special case that the true effects/outcomes are actually homogeneous (the equal-effects case), the distinction between fixed- and random-effects models disappears, since homogeneity implies that \(\mu = \bar{\theta}_w = \bar{\theta}_u \equiv \theta\). However, since there is no infallible method to test whether the true effects/outcomes are really homogeneous or not, a researcher should decide on the type of inference desired before examining the data and choose the model accordingly. In fact, there is nothing wrong with fitting both fixed- and random/mixed-effects models to the same data, since these models address different questions. For more details on the distinction between equal-, fixed-, and random-effects models, see Laird and Mosteller (1990) and Hedges and Vevea (1998).

The rma.mh Function

The Mantel-Haenszel method provides an alternative approach for fitting fixed-effects models when dealing with studies providing data in the form of \(2 \times 2\) tables or in the form of event counts (i.e., person-time data) for two groups (Mantel & Haenszel, 1959). The method is particularly advantageous when aggregating a large number of studies with small sample sizes (the so-called sparse data or increasing strata case). The Mantel-Haenszel method is implemented in the rma.mh function. It can be used in combination with risk ratios, odds ratios, risk differences, incidence rate ratios, and incidence rate differences. The Mantel-Haenszel method is always based on a weighted estimation approach.
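For odds ratios, the Mantel-Haenszel estimate has a simple closed form, which the following base R sketch computes for made-up \(2 \times 2\) table counts from three hypothetical studies. It covers only the point estimate; the rma.mh function additionally provides standard errors, confidence intervals, and tests.

```r
# Made-up 2x2 table counts for three hypothetical studies.
ai <- c(12, 30, 8)    # events, treatment group
bi <- c(88, 170, 42)  # non-events, treatment group
ci <- c(20, 45, 15)   # events, control group
di <- c(80, 155, 35)  # non-events, control group
ni <- ai + bi + ci + di

# Mantel-Haenszel estimate of the common odds ratio.
or_mh <- sum(ai * di / ni) / sum(bi * ci / ni)

# The same estimate expressed as a weighted average of the study-specific
# odds ratios with weights bi * ci / ni.
or_i <- (ai * di) / (bi * ci)

print(round(c(or_mh = or_mh, log_or_mh = log(or_mh)), 4))
```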

The rma.peto Function

Yet another method that can be used in the context of a meta-analysis of \(2 \times 2\) table data is Peto's method (see Yusuf et al., 1985), implemented in the rma.peto function. The method provides a weighted estimate of the (log) odds ratio under a fixed-effects model. The method is particularly advantageous when the event of interest is rare, but see the documentation of the function for some caveats.
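Peto's one-step estimate of the log odds ratio compares the observed number of events in the treatment group with the number expected under the null hypothesis. A base R sketch with made-up counts (the same hypothetical tables as could be passed to rma.peto) might look as follows.

```r
# Made-up 2x2 table counts for three hypothetical studies.
ai <- c(12, 30, 8)    # events, treatment group
bi <- c(88, 170, 42)  # non-events, treatment group
ci <- c(20, 45, 15)   # events, control group
di <- c(80, 155, 35)  # non-events, control group

n1i <- ai + bi   # treatment group sizes
n2i <- ci + di   # control group sizes
m1i <- ai + ci   # total number of events
m2i <- bi + di   # total number of non-events
ni  <- n1i + n2i

# Expected events in the treatment group and hypergeometric variances.
Ei <- n1i * m1i / ni
Vi <- n1i * n2i * m1i * m2i / (ni^2 * (ni - 1))

# Peto's estimate of the log odds ratio and its standard error.
log_or_peto <- sum(ai - Ei) / sum(Vi)
se_peto     <- sqrt(1 / sum(Vi))

print(round(c(log_or = log_or_peto, se = se_peto, or = exp(log_or_peto)), 4))
```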

The rma.glmm Function

Dichotomous outcomes and event counts (based on which one can calculate effect size or outcome measures such as odds ratios, incidence rate ratios, proportions, and incidence rates) are often assumed to arise from binomial and Poisson distributed data. Meta-analytic models that are directly based on such distributions are implemented in the rma.glmm function. These models are essentially special cases of generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). For \(2 \times 2\) table data, a mixed-effects conditional logistic model (based on the non-central hypergeometric distribution) is also available. Random/mixed-effects models with dichotomous data are often referred to as ‘binomial-normal’ models in the meta-analytic literature. Analogously, for event count data, such models could be referred to as ‘Poisson-normal’ models.
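To illustrate what modeling the binomial data directly means, the base R sketch below fits an ordinary logistic regression with a fixed effect for each study and a common treatment effect. This is a simplified stand-in without a random effect for the treatment effect (base R provides no mixed-effects fitter), the counts are made up, and the variable names are hypothetical.

```r
# Made-up arm-level data for three hypothetical two-arm studies.
dat <- data.frame(
  study  = factor(rep(1:3, each = 2)),
  group  = rep(c(1, 0), times = 3),      # 1 = treatment, 0 = control
  events = c(12, 20, 30, 45, 8, 15),     # number of events per arm
  n      = c(100, 100, 200, 200, 50, 50) # arm sizes
)

# Logistic regression with study-specific intercepts and a common
# treatment effect (the log odds ratio).
fit <- glm(cbind(events, n - events) ~ study + group,
           family = binomial, data = dat)

log_or <- unname(coef(fit)["group"])
se     <- unname(sqrt(diag(vcov(fit)))["group"])

print(round(c(log_or = log_or, se = se, or = exp(log_or)), 4))
```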

The rma.mv Function

Standard meta-analytic models assume independence between the observed effects or outcomes obtained from a set of studies. This assumption is often violated in practice. Dependencies can arise for a variety of reasons. For example, the sampling errors and/or true effects/outcomes may be correlated in multiple treatment studies (e.g., when multiple treatment groups are compared with a common control/reference group, such that the data from the control/reference group is used multiple times to compute the effect sizes or outcomes) or in multiple endpoint studies (e.g., when more than one effect size estimate or outcome is calculated based on the same sample of subjects due to the use of multiple endpoints or response variables) (Gleser & Olkin, 2009). Correlations in the true effects/outcomes can also arise due to other forms of clustering (e.g., effects/outcomes derived from the same paper, lab, research group, or species may be more similar to each other than effects/outcomes derived from different papers, labs, research groups, or species). In ecology and related fields, shared phylogenetic history among the organisms studied (e.g., plants, fungi, animals) can also induce correlations among the effects/outcomes. The rma.mv function can be used to fit suitable meta-analytic multivariate/multilevel models to such data, so that the non-independence in the observed/true effects or outcomes is accounted for. Network meta-analyses (also called multiple/mixed treatment comparison meta-analyses) can also be carried out with this function.
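The shared control group case can be illustrated with a small base R sketch. For log odds ratios, two comparisons that share a control group have sampling covariance equal to the sum of the reciprocal control group cell counts. The example builds the resulting variance-covariance matrix for made-up data and computes generalized least squares estimates of two treatment effects; it contains no random effects and is only a simplified sketch of the kind of model the rma.mv function fits.

```r
# Made-up data: outcomes 1 and 2 come from one hypothetical three-arm study
# (treatments A and B versus a shared control); outcomes 3 and 4 come from
# two separate two-arm studies (A versus control, B versus control).
trt_events     <- c(12, 15, 30, 8)
trt_nonevents  <- c(88, 85, 170, 42)
ctrl_events    <- c(20, 20, 45, 15)
ctrl_nonevents <- c(80, 80, 155, 35)

# Log odds ratios and their sampling variances.
yi <- log((trt_events * ctrl_nonevents) / (trt_nonevents * ctrl_events))
vi <- 1 / trt_events + 1 / trt_nonevents + 1 / ctrl_events + 1 / ctrl_nonevents

# Variance-covariance matrix of the sampling errors: outcomes 1 and 2 share
# the control group, which induces a positive covariance.
V <- diag(vi)
V[1, 2] <- V[2, 1] <- 1 / ctrl_events[1] + 1 / ctrl_nonevents[1]

# Model matrix with one indicator column per treatment.
X <- cbind(A = c(1, 0, 1, 0), B = c(0, 1, 0, 1))

# Generalized least squares estimates of the two treatment effects.
Vinv <- solve(V)
vb   <- solve(t(X) %*% Vinv %*% X)
b    <- vb %*% t(X) %*% Vinv %*% yi

print(round(V, 4))
print(round(cbind(estimate = as.vector(b), se = sqrt(diag(vb))), 4))
```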

Future Plans and Updates

The metafor package is a work in progress and is updated on a regular basis with new functions and options. With metafor.news(), you can read the NEWS file of the package after installation. Comments, feedback, and suggestions for improvements are very welcome.

Citing the Package

To cite the package, please use the following reference:

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. http://www.jstatsoft.org/v36/i03/.

Getting Started with the Package

The paper mentioned above is a good starting place for those interested in using the metafor package. The purpose of the article is to provide a general overview of the package and its capabilities (as of version 1.4-0). Not all of the functions and options are described in the paper, but it should provide a useful introduction to the package. The paper can be freely downloaded from the URL given above or can be directly loaded with the command vignette("metafor").

In addition to reading the paper, carefully read this page and then the help pages for the escalc and the rma.uni functions (or the rma.mh, rma.peto, rma.glmm, rma.mv functions if you intend to use these methods). The help pages for these functions provide links to many additional functions, which can be used after fitting a model. You can also read the entire documentation online at https://wviechtb.github.io/metafor/reference/index.html (where it is nicely formatted, equations are shown correctly, and the output from all examples is provided).
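A first session with the package might look like the sketch below. It assumes that the metafor package is installed and that the example dataset dat.bcg (with variables tpos, tneg, cpos, cneg, and ablat) is available after loading it; the code is skipped otherwise.

```r
# Run the example only if the metafor package is installed.
if (requireNamespace("metafor", quietly = TRUE)) {
  library(metafor)

  # Compute log risk ratios (yi) and sampling variances (vi)
  # from the 2x2 table counts in the example dataset.
  dat <- escalc(measure = "RR", ai = tpos, bi = tneg,
                ci = cpos, di = cneg, data = dat.bcg)

  # Fit a random-effects model.
  res <- rma(yi, vi, data = dat)
  print(res)

  # Fit a mixed-effects model with absolute latitude as a moderator.
  res_mod <- rma(yi, vi, mods = ~ ablat, data = dat)
  print(res_mod)
} else {
  message("Package 'metafor' is not installed; skipping the example.")
}
```

From there, help(escalc) and help(rma.uni) describe the available outcome measures and model options.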

A (pdf) diagram showing the various functions in the metafor package (and how they are related to each other) can be opened with the command vignette("metafor_diagram").

Finally, additional information about the package, several detailed analysis examples, examples of plots and figures provided by the package (with the corresponding code), some additional tips and notes, and a FAQ can be found on the package website at http://www.metafor-project.org/.

References

Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.) (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.

Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage Foundation.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.

Hedges, L. V., & Vevea, J. L. (1998). Fixed- and random-effects models in meta-analysis. Psychological Methods, 3, 486–504.

Laird, N. M., & Mosteller, F. (1990). Some statistical methods for combining experimental results. International Journal of Technology Assessment in Health Care, 6, 5–30.

Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. http://www.jstatsoft.org/v36/i03/.

Yusuf, S., Peto, R., Lewis, J., Collins, R., & Sleight, P. (1985). Beta blockade during and after myocardial infarction: An overview of the randomized trials. Progress in Cardiovascular Diseases, 27, 335–371.