Function to fit meta-analytic fixed- and random/mixed-effects models with or without moderators via generalized linear (mixed-effects) models. See below and the documentation of the metafor-package for more details on these models.

rma.glmm(ai, bi, ci, di, n1i, n2i, x1i, x2i, t1i, t2i, xi, mi, ti, ni,
         mods, measure, intercept=TRUE, data, slab, subset,
         add=1/2, to="only0", drop00=TRUE, vtype="LS",
         model="UM.FS", method="ML", test="z",
         level=95, digits, btt, nAGQ=7, verbose=FALSE, control, …)

Arguments

ai

see below and the documentation of the escalc function for more details.

bi

see below and the documentation of the escalc function for more details.

ci

see below and the documentation of the escalc function for more details.

di

see below and the documentation of the escalc function for more details.

n1i

see below and the documentation of the escalc function for more details.

n2i

see below and the documentation of the escalc function for more details.

x1i

see below and the documentation of the escalc function for more details.

x2i

see below and the documentation of the escalc function for more details.

t1i

see below and the documentation of the escalc function for more details.

t2i

see below and the documentation of the escalc function for more details.

xi

see below and the documentation of the escalc function for more details.

mi

see below and the documentation of the escalc function for more details.

ti

see below and the documentation of the escalc function for more details.

ni

see below and the documentation of the escalc function for more details.

mods

optional argument to include one or more moderators in the model. A single moderator can be given as a vector of length \(k\) specifying the values of the moderator. Multiple moderators are specified by giving a matrix with \(k\) rows and as many columns as there are moderator variables. Alternatively, a model formula can be used to specify the model. See ‘Details’.

measure

character string indicating the outcome measure to use for the meta-analysis. Possible options are the odds ratio ("OR"), the incidence rate ratio ("IRR"), the logit transformed proportion ("PLO"), or the log transformed incidence rate ("IRLN").

intercept

logical indicating whether an intercept should be added to the model (the default is TRUE).

data

optional data frame containing the data supplied to the function.

slab

optional vector with labels for the \(k\) studies.

subset

optional vector indicating the subset of studies that should be used for the analysis. This can be a logical vector of length \(k\) or a numeric vector indicating the indices of the observations to include.

add

non-negative number indicating the amount to add to zero cells, counts, or frequencies when calculating the observed outcomes of the individual studies. See below and the documentation of the escalc function for more details.

to

character string indicating when the values under add should be added (either "only0", "all", "if0all", or "none"). See below and the documentation of the escalc function for more details.

drop00

logical indicating whether studies with no cases/events (or only cases) in both groups should be dropped. See the documentation of the escalc function for more details.

vtype

character string indicating the type of sampling variances to calculate when calculating the observed outcomes. See the documentation of the escalc function for more details.

model

character string specifying the general model type to use for the analysis (either "UM.FS" (the default), "UM.RS", "CM.EL", or "CM.AL"). See ‘Details’.

method

character string specifying whether a fixed- or a random/mixed-effects model should be fitted. A fixed-effects model (with or without moderators) is fitted when using method="FE". Random/mixed-effects models are fitted by setting method="ML" (the default). See ‘Details’.

test

character string specifying how test statistics and confidence intervals for the fixed effects should be computed. By default (test="z"), Wald-type tests and CIs are obtained, which are based on a standard normal distribution. When test="t", a t-distribution is used instead. See ‘Details’.

level

numerical value between 0 and 100 specifying the confidence interval level (the default is 95).

digits

integer specifying the number of decimal places to which the printed results should be rounded (if unspecified, the default is 4).

btt

optional vector of indices specifying which coefficients to include in the omnibus test of moderators. See ‘Details’.

nAGQ

positive integer specifying the number of points per axis for evaluating the adaptive Gauss-Hermite approximation to the log-likelihood. The default is 7. Setting this to 1 corresponds to the Laplacian approximation. See ‘Note’.

verbose

logical indicating whether output should be generated on the progress of the model fitting (the default is FALSE). Can also be an integer. Values > 1 generate more verbose output. See ‘Note’.

control

optional list of control values for the estimation algorithms. If unspecified, default values are defined inside the function. See ‘Note’.

additional arguments.

Details

Specifying the Data

The function can be used in conjunction with the following effect size or outcome measures:

  • measure="OR" for odds ratios (analyzed in log units)

  • measure="IRR" for incidence rate ratios (analyzed in log units)

  • measure="PLO" for logit transformed proportions (i.e., log odds)

  • measure="IRLN" for log transformed incidence rates.

The escalc function describes the data/arguments that should be specified/used for these measures.

Specifying the Model

A variety of model types are available when analyzing \(2 \times 2\) table data (i.e., when measure="OR") or two-group event count data (i.e., when measure="IRR"):

  • model="UM.FS" for an unconditional generalized linear mixed-effects model with fixed study effects

  • model="UM.RS" for an unconditional generalized linear mixed-effects model with random study effects

  • model="CM.AL" for a conditional generalized linear mixed-effects model (approximate likelihood)

  • model="CM.EL" for a conditional generalized linear mixed-effects model (exact likelihood).

For measure="OR", models "UM.FS" and "UM.RS" are essentially (mixed-effects) logistic regression models, while for measure="IRR", these models are (mixed-effects) Poisson regression models. A choice must be made on how to model study level variability (i.e., differences in outcomes across studies irrespective of group membership). One can choose between using fixed study effects (which means that \(k\) dummy variables are added to the model) or random study effects (which means that random effects corresponding to the levels of the study factor are added to the model).

The conditional model (model="CM.EL") avoids having to model study level variability by conditioning on the total numbers of cases/events in each study. For measure="OR", this leads to a non-central hypergeometric distribution for the data within each study and the corresponding model is then a (mixed-effects) conditional logistic model. Fitting this model can be difficult and computationally expensive. When the number of cases in each study is small relative to the group sizes, one can approximate the exact likelihood by a binomial distribution, which leads to a regular (mixed-effects) logistic regression model (model="CM.AL"). For measure="IRR", the conditional model leads directly to a binomial distribution for the data within each study and the resulting model is again a (mixed-effects) logistic regression model (no approximate likelihood model is needed here).

When analyzing proportions (i.e., measure="PLO") or incidence rates (i.e., measure="IRLN") of individual groups, the model type is always a (mixed-effects) logistic or Poisson regression model, respectively (i.e., the model argument is not relevant here).

Aside from choosing the general model type, one has to decide whether to fit a fixed- or random-effects model to the data. A fixed-effects model is fitted by setting method="FE". A random-effects model is fitted by setting method="ML" (the default). Note that random-effects models with dichotomous data are often referred to as ‘binomial-normal’ models in the meta-analytic literature. Analogously, for event count data, such models could be referred to as ‘Poisson-normal’ models.

One or more moderators can be included in all of these models via the mods argument. A single moderator can be given as a (row or column) vector of length \(k\) specifying the values of the moderator. Multiple moderators are specified by giving an appropriate model matrix (i.e., \(X\)) with \(k\) rows and as many columns as there are moderator variables (e.g., mods = cbind(mod1, mod2, mod3), where mod1, mod2, and mod3 correspond to the names of the variables for three moderator variables). The intercept is added to the model matrix by default unless intercept=FALSE.

Alternatively, one can use the standard formula syntax to specify the model. In this case, the mods argument should be set equal to a one-sided formula of the form mods = ~ model (e.g., mods = ~ mod1 + mod2 + mod3). Interactions, polynomial terms, and factors can be easily added to the model in this manner. When specifying a model formula via the mods argument, the intercept argument is ignored. Instead, the inclusion/exclusion of the intercept is controlled by the specified formula (e.g., mods = ~ mod1 + mod2 + mod3 - 1 would lead to the removal of the intercept). With moderators, a fixed-effects with moderators model is then fitted by setting method="FE", while a mixed-effects model is fitted by setting method="ML".

Fixed-, Saturated-, and Random/Mixed-Effects Models

When fitting a particular model, actually up to three different models are fitted within the function:

  • the fixed-effects model (i.e., where \(\tau^2\) is set to 0),

  • the saturated model (i.e., the model with a deviance of 0), and

  • the random/mixed-effects model (i.e., where \(\tau^2\) is estimated) (only if method="ML").

The saturated model is obtained by adding as many dummy variables to the model as needed so that the model deviance is equal to zero. Even when method="ML", the fixed-effects and saturated models are fitted, as they are used to compute the test statistics for the Wald-type and likelihood ratio tests for (residual) heterogeneity (see below).

Omnibus Test of Parameters

For models including moderators, an omnibus test of all the model coefficients is conducted that excludes the intercept (the first coefficient) if it is included in the model. If no intercept is included in the model, then the omnibus test includes all of the coefficients in the model including the first. Alternatively, one can manually specify the indices of the coefficients to test via the btt argument. For example, with btt=c(3,4), only the third and fourth coefficient from the model would be included in the test (if an intercept is included in the model, then it corresponds to the first coefficient in the model).

Categorical Moderators

Categorical moderator variables can be included in the model via the mods argument in the same way that appropriately (dummy) coded categorical independent variables can be included in linear models. One can either do the dummy coding manually or use a model formula together with the factor function to let R handle the coding automatically.

Tests and Confidence Intervals

By default, the test statistics of the individual coefficients in the model (and the corresponding confidence intervals) are based on a standard normal distribution, while the omnibus test is based on a chi-square distribution with \(m\) degrees of freedom (\(m\) being the number of coefficients tested). As an alternative, one can set test="t", which slightly mimics the Knapp and Hartung (2003) method by using a t-distribution with \(k-p\) degrees of freedom for tests of individual coefficients and confidence intervals and an F-distribution with \(m\) and \(k-p\) degrees of freedom (\(p\) being the total number of model coefficients including the intercept if it is present) for the omnibus test statistic (but note that test="t" is not the same as test="knha" in rma.uni, as no adjustment to the standard errors of the estimated coefficients is made).

Tests for (Residual) Heterogeneity

Two different tests for (residual) heterogeneity are automatically carried out by the function. The first is a Wald-type test, which tests the coefficients corresponding to the dummy variables added in the saturated model for significance. The second is a likelihood ratio test, which tests the same set of coefficients, but does so by computing \(-2\) times the difference in the log-likelihood of the fixed-effects and the saturated model. These two tests are not identical for the types of models fitted by the rma.glmm function and may even lead to conflicting conclusions.

Observed Outcomes of the Individual Studies

The various models do not require the calculation of the observed outcomes of the individual studies (e.g., the observed odds ratios of the \(k\) studies) and directly make use of the table/event counts. Zero cells/events are not a problem (except in extreme cases, such as when one of the two outcomes never occurs or when there are no events in any of the studies). Therefore, it is unnecessary to add some constant to the cell/event counts when there are zero cells/events.

However, for plotting and various other functions, it is necessary to calculate the observed outcomes for the \(k\) studies. Here, zero cells/events can be problematic, so adding a constant value to the cell/event counts ensures that all \(k\) values can be calculated. The add and to arguments are used to specify what value should be added to the cell/event counts and under what circumstances when calculating the observed outcomes. The documentation of the escalc function explains how the add and to arguments work. Note that drop00 is set to TRUE by default, since studies where ai=ci=0 or bi=di=0 or studies where x1i=x2i=0 are uninformative about the size of the effect.

Value

An object of class c("rma.glmm","rma"). The object is a list containing the following components:

beta

estimated coefficients of the model.

se

standard errors of the coefficients.

zval

test statistics of the coefficients.

pval

p-values for the test statistics.

ci.lb

lower bound of the confidence intervals for the coefficients.

ci.ub

upper bound of the confidence intervals for the coefficients.

vb

variance-covariance matrix of the estimated coefficients.

tau2

estimated amount of (residual) heterogeneity. Always 0 when method="FE".

sigma2

estimated amount of study level variability (only for model="UM.RS").

k

number of studies included in the model.

p

number of coefficients in the model (including the intercept).

m

number of coefficients included in the omnibus test of coefficients.

QE.Wld

Wald-type test statistic for the test of (residual) heterogeneity.

QEp.Wld

p-value for the Wald-type test of (residual) heterogeneity.

QE.LRT

likelihood ratio test statistic for the test of (residual) heterogeneity.

QEp.LRT

p-value for the likelihood ratio test of (residual) heterogeneity.

QM

test statistic for the omnibus test of coefficients.

QMp

p-value for the omnibus test of coefficients.

I2

value of \(I^2\).

H2

value of \(H^2\).

int.only

logical that indicates whether the model is an intercept-only model.

yi, vi, X

the vector of outcomes, the corresponding sampling variances, and the model matrix.

fit.stats

a list with the log-likelihood, deviance, AIC, BIC, and AICc values.

some additional elements/values.

Methods

The results of the fitted model are formatted and printed with the print.rma.glmm function. If fit statistics should also be given, use summary.rma (or use the fitstats.rma function to extract them).

Note

Fitting the various types of models requires several different iterative algorithms:

  • For model="UM.FS" and model="CM.AL", iteratively reweighted least squares (IWLS) as implemented in the glm function is used for fitting the fixed-effects and the saturated models. For method="ML", adaptive Gauss-Hermite quadrature as implemented in the glmer function is used. The same applies when model="CM.EL" is used in combination with measure="IRR" or when measure="PLO" or measure="IRLN" (regardless of the model type).

  • For model="UM.RS", adaptive Gauss-Hermite quadrature as implemented in the glmer function is used to fit all of the models.

  • For model="CM.EL" and measure="OR", the quasi-Newton method ("BFGS") as implemented in the optim function is used by default for fitting the fixed-effects and the saturated models. For method="ML", the same algorithm is used, together with adaptive quadrature as implemented in the integrate function (for the integration over the density of the non-central hypergeometric distribution). Standard errors of the parameter estimates are obtained by inverting the Hessian, which is numerically approximated using the hessian function.

When model="CM.EL" and measure="OR", actually model="CM.AL" is used first to obtain starting values for optim, so either 4 (if method="FE") or 6 (if method="ML") models need to be fitted in total.

Various control parameters can be adjusted via the control argument:

  • optimizer is set by default to "optim", but can be set to "nlminb" or one of the optimizers from the minqa package (i.e., "bobyqa", "newuoa", or "uobyqa"),

  • optmethod is used to set the method argument for optim (the default is "BFGS"),

  • optCtrl is a list of named arguments to be passed on to the control argument of the chosen optimizer,

  • glmCtrl is a list of named arguments to be passed on to the control argument of the glm function,

  • glmerCtrl is a list of named arguments to be passed on to the control argument of the glmer function, and

  • intCtrl is a list of named arguments (i.e., rel.tol and subdivisions) to be passed on to the integrate function.

  • hessianCtrl is a list of named arguments to be passed on to the method.args argument of the hessian function. For some borderline cases, it may be necessary to bump up the r argument to a higher number to get sufficient accuracy when approximating the Hessian numerically (the default is control=list(hessianCtrl=list(r=16))).

Also, for glmer, the nAGQ argument is used to specify the number of quadrature points. The default value is 7, which should provide sufficient accuracy in the evaluation of the log-likelihood in most cases, but at the expense of speed. Setting this to 1 corresponds to the Laplacian approximation (which is faster, but less accurate).

Information on the progress of the various algorithms is obtained by setting verbose=TRUE or with control=list(verbose=TRUE). Since fitting the various models can be computationally expensive, this option is quite useful to determine how the model fitting is progressing.

For model="CM.EL" and measure="OR", optimization involves repeated calculation of the density of the non-central hypergeometric distribution. When method="ML", this also requires integration over the same density. This is currently implemented in a rather brute-force manner and may not be numerically stable, especially when models with moderators are fitted. Stability can be improved by scaling the moderators in a similar manner (i.e., don't use a moderator that is coded 0 and 1, while another uses values in the 1000s). For models with an intercept and moderators, the function actually rescales (non-dummy) variables to z-scores during the model fitting (results are given after back-scaling, so this should be transparent to the user). For models without an intercept, this is not done, so sensitivity analyses are highly recommended here (to ensure that the results do not depend on the scaling of the moderators).

References

Agresti, A. (2002). Categorical data analysis (2nd. ed). Hoboken, NJ: Wiley.

Bagos, P. G., & Nikolopoulos, G. K. (2009). Mixed-effects Poisson regression models for meta-analysis of follow-up studies with constant or varying durations. The International Journal of Biostatistics, 5(1), article 21.

van Houwelingen, H. C., Zwinderman, K. H., & Stijnen, T. (1993). A bivariate approach to meta-analysis. Statistics in Medicine, 12, 2273--2284.

Liao, J. G., & Rosen, O. (2001). Fast and stable algorithms for computing and sampling from the noncentral hypergeometric distribution. American Statistician, 55, 366--369.

Stijnen, T., Hamza, T. H., & Ozdemir, P. (2010). Random effects meta-analysis of event outcome in the framework of the generalized linear mixed model with applications in sparse data. Statistics in Medicine, 29, 3046--3067.

Turner, R. M., Omar, R. Z., Yang, M., Goldstein, H., & Thompson, S. G. (2000). A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine, 19, 3417--3432.

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1--48. http://www.jstatsoft.org/v36/i03/.

See also

rma.uni, rma.mh, rma.peto, and rma.mv for other model fitting functions.

dat.nielweise2007, dat.nielweise2008, dat.collins1985a, and dat.pritz1997 for further examples of the use of the rma.glmm function.

Examples

### random-effects model using rma.uni() (standard RE model analysis) rma(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, method="ML", verbose=TRUE)
#> #> Iteration 0 tau^2 = 0.3495 #> Iteration 1 tau^2 = 0.3034 #> Iteration 2 tau^2 = 0.3025 #> Iteration 3 tau^2 = 0.3025 #> Iteration 4 tau^2 = 0.3025 #> Fisher scoring algorithm converged after 4 iterations. #>
#> #> Random-Effects Model (k = 13; tau^2 estimator: ML) #> #> tau^2 (estimated amount of total heterogeneity): 0.3025 (SE = 0.1549) #> tau (square root of estimated tau^2 value): 0.5500 #> I^2 (total heterogeneity / total variability): 91.23% #> H^2 (total variability / sampling variability): 11.40 #> #> Test for Heterogeneity: #> Q(df = 12) = 163.1649, p-val < .0001 #> #> Model Results: #> #> estimate se zval pval ci.lb ci.ub #> -0.7420 0.1780 -4.1694 <.0001 -1.0907 -0.3932 *** #> #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #>
### random-effects models using rma.glmm() (require 'lme4' package) ### unconditional model with fixed study effects
# NOT RUN { rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, model="UM.FS", verbose=TRUE) # }
### unconditional model with random study effects
# NOT RUN { rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, model="UM.RS", verbose=TRUE) # }
### conditional model with approximate likelihood
# NOT RUN { rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, model="CM.AL", verbose=TRUE) # }
### conditional model with exact likelihood (be patient!) ### note: fitting this model is very slow, so be patient
# NOT RUN { rma.glmm(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, model="CM.EL", verbose=TRUE) # }