General Notes / Setup

As the title promises, the book Introduction to Meta-Analysis by Borenstein et al. (2009) is a comprehensive introduction to standard meta-analytic methodology, with a focus on the statistical methods. This document provides the R code to reproduce all examples and illustrations from the book using the metafor package. To read more about the package, see the package website and the package documentation. Note that the ‘devel’ version of metafor needs to be installed, as some of the datasets used below are not currently in the official release of the package on CRAN. The ‘devel’ version of the package can be installed with:

install.packages("remotes")
remotes::install_github("wviechtb/metafor")

This step will become obsolete once a new release of the metafor package is published on CRAN.

Once the package is installed, we can load it with:

library(metafor)

A few additional notes:

  1. Results are only reproduced for chapters containing worked examples.
  2. Occasionally, there are some minor discrepancies between the results shown in the book and those obtained below. Where such discrepancies arise, they are noted (and the reasons for them).
  3. I did not attempt to reproduce the figures exactly as they appear in the book. There are some fundamental differences between the aesthetics of the figures shown in the book and those produced by the corresponding functions in the metafor package. I made some adjustments to the defaults here and there so that the figures below look similar to the ones shown in the book and made sure to include all relevant elements.
  4. The results are generally given without discussion or context. The code below is not a substitute for reading the book, but is meant to be used together with it. In other words, readers of the book interested in replicating the results with R can see here how this is possible.

1) How a Meta-Analysis Works

# analysis of the Cannon et al. (2006) data using (log) risk ratios

dat <- dat.cannon2006[,1:6] # keep only variables we really need
dat <- escalc(measure="RR", ai=ep1t, n1i=nt, ci=ep1c, n2i=nc, data=dat, slab=trial)
dat
##      trial        pop   nt   nc ep1t ep1c      yi     vi 
## 1 PROVE IT   post-ACS 2099 2063  147  172 -0.1744 0.0117 
## 2   A-TO-Z   post-ACS 2265 2232  205  235 -0.1513 0.0082 
## 3      TNT stable CAD 4995 5006  334  418 -0.2221 0.0050 
## 4    IDEAL stable CAD 4439 4449  411  463 -0.1169 0.0041
res <- rma(yi, vi, data=dat, method="DL")
res
## Random-Effects Model (k = 4; tau^2 estimator: DL)
## 
## tau^2 (estimated amount of total heterogeneity): 0 (SE = 0.0053)
## tau (square root of estimated tau^2 value):      0
## I^2 (total heterogeneity / total variability):   0.00%
## H^2 (total variability / sampling variability):  1.00
## 
## Test for Heterogeneity:
## Q(df = 3) = 1.2425, p-val = 0.7428
## 
## Model Results:
## 
## estimate      se     zval    pval    ci.lb    ci.ub 
##  -0.1634  0.0393  -4.1635  <.0001  -0.2404  -0.0865  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
predict(res, transf=exp, digits=2) # summary effect size (risk ratio)
##  pred ci.lb ci.ub pi.lb pi.ub 
##  0.85  0.79  0.92  0.79  0.92
dat$weights <- paste0(round(weights(res)), "%")   # weights in % (rounded)
dat$pvals   <- round(summary(dat)$pval, digits=3) # p-values of the individual trials
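The inverse-variance pooling that produces this summary (and the weights above) can be cross-checked by hand in base R. This is a sketch using the rounded yi and vi values shown above; since tau^2 is estimated to be 0 here, the random-effects weights reduce to the fixed-effect weights 1/vi:

```r
# log risk ratios and sampling variances of the four trials (rounded, from above)
yi <- c(-0.1744, -0.1513, -0.2221, -0.1169)
vi <- c( 0.0117,  0.0082,  0.0050,  0.0041)

wi  <- 1/vi                   # inverse-variance weights (tau^2 = 0 here)
est <- sum(wi*yi) / sum(wi)   # pooled log risk ratio
se  <- sqrt(1 / sum(wi))      # standard error of the pooled estimate

round(est, 4)                 # close to -0.1634 (small differences due to rounding)
round(exp(est), 2)            # summary risk ratio of about 0.85
round(100 * wi / sum(wi))     # weights in %, as added to the dataset above
```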
# Figure 1.1

par(mar=c(4,4,2,2))

forest(res, xlim=c(-1,2), atransf=exp, at=log(c(2/3, 1, 3/2)),
       header=TRUE, top=2, mlab="Summary", efac=c(0,1,3),
       ilab=data.frame(dat$weights, dat$pvals), ilab.xpos=c(0.8,1.2), ilab.pos=2)
text(0.8, -1, "100%", pos=2)
text(1.2, -1, formatC(res$pval, format="f", digits=5), pos=2)
text(0.8,  6, "Weight",  pos=2, font=2)
text(1.2,  6, "P-Value", pos=2, font=2)


2) Why Perform a Meta-Analysis

# analysis of the Lau et al. (1992) data using (log) risk ratios

dat <- dat.lau1992
dat <- escalc(measure="RR", ai=ai, n1i=n1i, ci=ci, n2i=n2i, data=dat, slab=trial)
dat
##           trial year  ai  n1i   ci  n2i      yi     vi 
## 1      Fletcher 1959   1   12    4   11 -1.4733 1.0758 
## 2         Dewar 1963   4   21    7   21 -0.5596 0.2976 
## 3    European 1 1969  20   83   15   84  0.2997 0.0927 
## 4    European 2 1971  69  373   94  357 -0.3530 0.0196 
## 5   Heikinheimo 1971  22  219   17  207  0.2015 0.0949 
## 6       Italian 1971  19  164   18  157  0.0104 0.0957 
## 7  Australian 1 1973  26  264   32  253 -0.2502 0.0620 
## 8   Frankfurt 2 1973  13  102   29  104 -0.7829 0.0920 
## 9    NHLBI SMIT 1974   7   53    3   54  0.8660 0.4388 
## 10        Frank 1975   6   55    6   53 -0.0370 0.2963 
## 11       Valere 1975  11   49    9   42  0.0465 0.1578 
## 12        Klein 1976   4   14    1    9  0.9445 1.0675 
## 13    UK Collab 1976  38  302   40  293 -0.0815 0.0446 
## 14     Austrian 1977  37  352   65  376 -0.4975 0.0369 
## 15 Australian 2 1977  25  123   31  107 -0.3545 0.0548 
## 16     Lasierra 1977   1   13    3   11 -1.2657 1.1655 
## 17 N Ger Collab 1977  63  249   51  234  0.1492 0.0272 
## 18     Witchitz 1977   5   32    5   26 -0.2076 0.3303 
## 19   European 3 1979  18  156   30  159 -0.4918 0.0762 
## 20         ISAM 1986  54  859   63  882 -0.1277 0.0321 
## 21      GISSI-1 1986 628 5860  758 5852 -0.1895 0.0026 
## 22        Olson 1986   1   28    2   24 -0.8473 1.4226 
## 23     Baroffio 1986   0   29    6   30 -2.5322 2.0883 
## 24    Schreiber 1986   1   19    3   19 -1.0986 1.2281 
## 25      Cribier 1986   1   21    1   23  0.0910 1.9089 
## 26     Sainsous 1986   3   49    6   49 -0.6931 0.4592 
## 27       Durand 1987   3   35    4   29 -0.4757 0.5203 
## 28        White 1987   2  107   12  112 -1.7461 0.5651 
## 29      Bassand 1987   4   52    7   55 -0.5035 0.3554 
## 30         Vlay 1988   1   13    2   12 -0.7732 1.3397 
## 31      Kennedy 1988  12  191   17  177 -0.4244 0.1313 
## 32       ISIS-2 1988 791 8592 1029 8595 -0.2627 0.0020 
## 33    Wisenberg 1988   2   41    5   25 -1.4110 0.6356
res <- rma(yi, vi, data=dat, method="DL")
res
## Random-Effects Model (k = 33; tau^2 estimator: DL)
## 
## tau^2 (estimated amount of total heterogeneity): 0.0077 (SE = 0.0127)
## tau (square root of estimated tau^2 value):      0.0876
## I^2 (total heterogeneity / total variability):   16.87%
## H^2 (total variability / sampling variability):  1.20
## 
## Test for Heterogeneity:
## Q(df = 32) = 38.4942, p-val = 0.1991
## 
## Model Results:
## 
## estimate      se     zval    pval    ci.lb    ci.ub 
##  -0.2312  0.0468  -4.9345  <.0001  -0.3230  -0.1394  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
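Two of the heterogeneity statistics above follow directly from the Q statistic: with the DL estimator, I^2 reduces to max(0, (Q - df)/Q) * 100% and H^2 to Q/df. A quick check in base R using the values from the output:

```r
Q <- 38.4942; df <- 32    # from the test for heterogeneity above

I2 <- max(0, (Q - df) / Q) * 100  # percent of total variability due to heterogeneity
H2 <- Q / df                      # total variability relative to sampling variability
round(c(I2, H2), 2)               # 16.87 and 1.20, as in the model output
```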
predict(res, transf=exp, digits=2) # summary effect size (risk ratio)
##  pred ci.lb ci.ub pi.lb pi.ub 
##  0.79  0.72  0.87  0.65  0.96
formatC(res$pval, format="f", digits=7) # p-value of the summary estimate
## [1] "0.0000008"
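The p-value printed above is the usual two-sided Wald-type test of the pooled log risk ratio; it can be reproduced from the estimate and its standard error shown in the model output (a sketch using the rounded values):

```r
est <- -0.2312   # pooled log risk ratio (from the model output above)
se  <-  0.0468   # its standard error

z <- est / se              # Wald-type test statistic
p <- 2 * pnorm(-abs(z))    # two-sided p-value under the standard normal
formatC(p, format="f", digits=7)   # "0.0000008"
```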
# Figure 2.1

par(mar=c(4,4,2,2))

forest(res, xlim=c(-10,9), atransf=exp, at=log(c(.01, 0.1, 1, 10, 100)),
       header=TRUE, top=2, ilab=dat$year, ilab.xpos=-6, digits=3L, cex=0.8,
       efac=c(0,1), fonts=c("sans", "inconsolata"))
# note: using a monospaced font for the annotations on the right, so they are
# fully aligned; if the "Inconsolata" font is not available, can also use "mono"
text(-6, 35, "Year", font=2, cex=0.8)


4) Effect Sizes Based on Means

# mean difference assuming sigma^2_1 = sigma^2_2

dat <- escalc("MD", m1i=103, m2i=100, sd1i=5.5, sd2i=4.5, n1i=50, n2i=50, vtype="HO")
summary(dat) # note: summary() so we can also see the standard error (sei)
##       yi     vi    sei     zi   pval  ci.lb  ci.ub 
## 1 3.0000 1.0100 1.0050 2.9851 0.0028 1.0303 4.9697
# mean difference not assuming sigma^2_1 = sigma^2_2

dat <- escalc("MD", m1i=103, m2i=100, sd1i=5.5, sd2i=4.5, n1i=50, n2i=50)
summary(dat)
##       yi     vi    sei     zi   pval  ci.lb  ci.ub 
## 1 3.0000 1.0100 1.0050 2.9851 0.0028 1.0303 4.9697
# note: since n1i=n2i in this example, the results are exactly the same
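Both sampling variances can be verified by hand: under homoscedasticity, vi = s^2_pooled * (1/n1 + 1/n2) with the pooled variance; without that assumption, vi = sd1^2/n1 + sd2^2/n2. A base-R sketch:

```r
m1 <- 103; m2 <- 100; sd1 <- 5.5; sd2 <- 4.5; n1 <- 50; n2 <- 50

yi  <- m1 - m2                                         # mean difference
s2p <- ((n1-1)*sd1^2 + (n2-1)*sd2^2) / (n1 + n2 - 2)   # pooled variance
vi_ho <- s2p * (1/n1 + 1/n2)    # assuming equal population variances
vi    <- sd1^2/n1 + sd2^2/n2    # not assuming equal variances
c(yi, vi_ho, vi)                # 3.00 1.01 1.01 (identical since n1 = n2)
```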
# mean change

# note: if the pre- and post-test means and SDs are not available, but we know
# the mean and SD of the change scores, then set m2i=0, sd2i=0, and ri=0

dat <- escalc("MC", m1i=5, m2i=0, sd1i=10, sd2i=0, ni=50, ri=0)
summary(dat)
##       yi     vi    sei     zi   pval  ci.lb  ci.ub 
## 1 5.0000 2.0000 1.4142 3.5355 0.0004 2.2282 7.7718
dat <- escalc("MC", m1i=105, m2i=100, sd1i=10, sd2i=10, ni=50, ri=0.5)
summary(dat)
##       yi     vi    sei     zi   pval  ci.lb  ci.ub 
## 1 5.0000 2.0000 1.4142 3.5355 0.0004 2.2282 7.7718
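The sampling variance of the mean change follows vi = (sd1^2 + sd2^2 - 2*ri*sd1*sd2)/ni, which is easy to verify in base R for both examples above:

```r
# pre- and post-test SDs known (second example above)
(10^2 + 10^2 - 2*0.5*10*10) / 50   # (100 + 100 - 100) / 50 = 2

# only the SD of the change scores known (first example, sd2i=0 and ri=0)
(10^2 + 0^2 - 2*0*10*0) / 50       # 100 / 50 = 2
```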
# standardized mean difference (Hedges' g)

dat <- escalc("SMD", m1i=103, m2i=100, sd1i=5.5, sd2i=4.5, n1i=50, n2i=50, vtype="LS2")
summary(dat)
##       yi     vi    sei     zi   pval  ci.lb  ci.ub 
## 1 0.5924 0.0411 0.2028 2.9208 0.0035 0.1949 0.9900
# note: by default, the sampling variance is computed in a slightly different
# way in the book compared to the metafor package; by using vtype="LS2", the
# same equation as given in the book is used; but the difference is usually
# negligible (the results above differ slightly from those given in the book
# due to intermediate rounding in the book)
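The value of g and its variance can be reproduced by hand using the equations from the book: d is standardized by the pooled SD, multiplied by the (approximate) small-sample correction factor J, and the variance of d uses the large-sample equation selected above via vtype="LS2". A sketch:

```r
m1 <- 103; m2 <- 100; sd1 <- 5.5; sd2 <- 4.5; n1 <- 50; n2 <- 50

sp <- sqrt(((n1-1)*sd1^2 + (n2-1)*sd2^2) / (n1 + n2 - 2))  # pooled SD
d  <- (m1 - m2) / sp                         # standardized mean difference
J  <- 1 - 3 / (4*(n1 + n2 - 2) - 1)          # approximate correction factor
g  <- J * d                                  # Hedges' g
vd <- (n1 + n2)/(n1*n2) + d^2/(2*(n1 + n2))  # large-sample variance of d
vg <- J^2 * vd                               # variance of g
round(c(g, vg), 4)                           # 0.5924 0.0411, as above
```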
# standardized mean change (using raw score standardization)