Wolfgang Viechtbauer

Maastricht University

*The Handbook of Research Synthesis and Meta-Analysis* by Cooper et al. (2019), now in its third edition, is one of the quintessential texts on meta-analysis and the research synthesis process as a whole. In this document, I provide the R code to reproduce the worked examples and analyses from various chapters. The emphasis is on using the `metafor` package, but several other packages will also be used. To read more about the `metafor` package, see the package website and the package documentation.

Note that the 'devel' version of `metafor` needs to be installed, as some of the datasets used below are not currently in the official release of the package on CRAN. The 'devel' version of the package can be installed with:

```
install.packages("remotes")
remotes::install_github("wviechtb/metafor")
```

This step will become obsolete once a new release of the `metafor` package is published on CRAN.

Once the package is installed, we can load it with:

`library(metafor)`

A few additional notes:

- Results are only reproduced for chapters containing worked examples.
- Occasionally, there are minor discrepancies between the results shown in the book and those obtained below. These can arise from different software packages implementing methods in slightly different ways, from intermediate rounding or different rounding schemes, or from chance when an analysis involves a stochastic process (e.g., when using MCMC sampling in the chapter on Bayesian methods). Minor discrepancies will usually not be commented on. However, where discrepancies are more substantial, they will be noted, together with the reasons for them.
- The results are generally given without discussion or context. The code below is not a substitute for reading the book, but is meant to be used together with it. In other words, readers of the book interested in replicating the results with R can see here how this is possible.

Finally, let's create a little helper function for formatting some results later on (essentially like `round()`, but without dropping trailing zeros).

```
fc <- function(x, digits=4, ...)
   formatC(x, format="f", digits=digits, ...)
```
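To see the difference between this helper and `round()`, consider a quick (illustrative) example: `fc()` keeps trailing zeros, while `round()` drops them.

```
fc(2.1)        # "2.1000"
fc(2.1, 3)     # "2.100"
round(2.1, 3)  # 2.1 (trailing zeros dropped)
```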

To recreate Figure 1.1 (showing the number of citations to articles including the terms 'research synthesis', 'systematic review', or 'meta-analysis' in their titles), I redid the search in the Web of Science Core Collection, which yielded the following data:

```
dat <- read.table(header=TRUE, text = "
year hits
2020 15973
2019 32120
2018 26293
2017 23885
2016 21318
2015 18487
2014 14781
2013 12357
2012 9802
2011 7528
2010 6120
2009 5121
2008 4006
2007 3553
2006 2771
2005 2336
2004 1911
2003 1526
2002 1309
2001 1005
2000 891
1999 832
1998 729
1997 580
1996 466
1995 70
1994 25
1993 11
1992 5
1991 9
1990 20
1989 127
1988 104
")
dat <- dat[-1,] # remove current year (not complete)
dat <- dat[dat$year >= 1995,] # keep only 1995 or later
dat <- dat[nrow(dat):1,] # reverse order
```

We can then create a bar chart based on these data:

```
# Figure 1.1
par(mar=c(4,4,2,2))
barplot(dat$hits, names.arg=dat$year, las=2, space=0.4, col="#6c9ece", border=NA)
abline(h=seq(0, 30000, by=5000), col="gray")
barplot(dat$hits, space=0.4, col="#6c9ece", border=NA, add=TRUE, axes=FALSE) # redraw bars over the grid lines
```

Part of the code for this chapter (by Jack Vevea, Nicole Zelinsky, and Robert Orwin) is adapted from the chapter itself (see section 10.5). Below, we will make use of several additional packages, which need to be installed if they are not already. So let's do this first.

```
# install the 'irr' package (if it is not already installed) and load it
if (!require(irr)) {
   install.packages("irr")
   library(irr)
}
```

```
# install the 'psych' package (if it is not already installed) and load it
if (!require(psych)) {
   install.packages("psych")
   library(psych)
}
```

```
# install the 'vcd' package (if it is not already installed) and load it
if (!require(vcd)) {
   install.packages("vcd")
   library(vcd)
}
```

```
# install the 'lme4' package (if it is not already installed) and load it
if (!require(lme4)) {
   install.packages("lme4")
   library(lme4)
}
```
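As a more compact alternative (a sketch, not from the chapter), the four blocks above can be collapsed into a single loop; note that `character.only=TRUE` is needed when the package name is passed as a string:

```
# install (if needed) and load all four packages in one go
pkgs <- c("irr", "psych", "vcd", "lme4")
for (pkg in pkgs) {
   if (!require(pkg, character.only=TRUE)) {
      install.packages(pkg)
      library(pkg, character.only=TRUE)
   }
}
```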

```
# Table 10.1
dat <- read.table(header=TRUE, text = "
study c1 c2 c3
1 3 2 3
2 3 1 1
3 2 2 2
4 3 2 3
5 1 1 1
6 3 1 3
7 2 2 1
8 1 1 1
9 2 2 1
10 2 1 3
11 2 2 2
12 3 3 3
13 3 1 2
14 2 1 1
15 1 1 1
16 1 1 2
17 3 3 1
18 2 2 2
19 2 2 2
20 3 1 1
21 2 1 2
22 1 1 3
23 3 2 2
24 3 3 3
25 2 2 3")
```

```
# put ratings for coders 1, 2, and 3 into separate vectors
c1 <- dat$c1
c2 <- dat$c2
c3 <- dat$c3
# combine ratings for coders 1 and 2 into a matrix
c1c2 <- cbind(c1, c2)
# combine ratings from all three coders into a matrix
all3 <- cbind(c1, c2, c3)
```

```
# cell counts and marginal totals for coders 1 and 2
addmargins(table(c2, c1))
```

```
## c1
## c2 1 2 3 Sum
## 1 5 3 4 12
## 2 0 7 3 10
## 3 0 0 3 3
## Sum 5 10 10 25
```

```
# note: first variable is for rows, second is for columns, so to reproduce
# panel A of Table 10.2, we have to use table(c2, c1)
```

```
# agreement rate for coders 1 and 2
mean(c1 == c2)
```

`## [1] 0.6`

```
# agreement rate for all three coders
mean(c1 == c2 & c1 == c3)
```

`## [1] 0.36`

```
# agreement rate (in %) between coders 1 and 2
irr::agree(c1c2)
```

```
## Percentage agreement (Tolerance=0)
##
## Subjects = 25
## Raters = 2
## %-agree = 60
```

```
# note: agree(c1c2) would have been sufficient, but due to the large number of
# additional packages being used, I will make it clear by using the :: operator
# which package a function belongs to (unless this is clear from the context)
```

```
# agreement rate (in %) between all three coders
irr::agree(all3)
```

```
## Percentage agreement (Tolerance=0)
##
## Subjects = 25
## Raters = 3
## %-agree = 36
```

```
# unweighted Cohen's kappa for coders 1 and 2
irr::kappa2(c1c2)
```

```
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 25
## Raters = 2
## Kappa = 0.425
##
## z = 3.51
## p-value = 0.000455
```
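To see where this value comes from, we can also compute kappa by hand from the cell counts in panel A of Table 10.2 (shown above), using kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed and p_e the chance-expected agreement rate:

```
# cell counts for coders 1 and 2 (rows = coder 2, columns = coder 1)
tab <- matrix(c(5, 3, 4,
                0, 7, 3,
                0, 0, 3), nrow=3, byrow=TRUE)
n  <- sum(tab)
po <- sum(diag(tab)) / n                      # observed agreement: 15/25 = 0.6
pe <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement: 0.304
(po - pe) / (1 - pe)                          # 0.425, as obtained above
```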

```
# Fleiss' kappa (the multi-rater generalization of Cohen's kappa) for all three coders
irr::kappam.fleiss(all3)
```

```
## Fleiss' Kappa for m Raters
##
## Subjects = 25
## Raters = 3
## Kappa = 0.315
##
## z = 3.85
## p-value = 0.00012
```

```
# weighted Cohen's kappa for coders 1 and 2
irr::kappa2(c1c2, weight=0:2)
```

```
## Cohen's Kappa for 2 Raters (Weights: 0,1,2)
##
## Subjects = 25
## Raters = 2
## Kappa = 0.386
##
## z = 3.22
## p-value = 0.00126
```
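The weighted version can be verified the same way (again by hand, using the cell counts from panel A of Table 10.2): with disagreement weights w_ij = |i - j|, weighted kappa is 1 minus the ratio of the observed to the chance-expected weighted disagreement:

```
# cell counts for coders 1 and 2 (rows = coder 2, columns = coder 1)
tab <- matrix(c(5, 3, 4,
                0, 7, 3,
                0, 0, 3), nrow=3, byrow=TRUE)
W <- abs(outer(1:3, 1:3, "-"))           # disagreement weights 0, 1, 2
n <- sum(tab)
E <- outer(rowSums(tab), colSums(tab)) / n  # chance-expected cell counts
1 - sum(W * tab) / sum(W * E)               # 0.386, as obtained above
```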

We can also use the `psych` package to compute Cohen's kappa, which additionally provides corresponding confidence intervals.

```
# unweighted and weighted Cohen's kappa for coders 1 and 2
W <- outer(1:3, 1:3, FUN = function(x,y) abs(x-y)) # create weight matrix
W
```

```
## [,1] [,2] [,3]
## [1,] 0 1 2
## [2,] 1 0 1
## [3,] 2 1 0
```

```
res <- psych::cohen.kappa(c1c2, w=W)
print(res, digits=3)
```

```
## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels)
##
## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries
## lower estimate upper
## unweighted kappa 0.185 0.425 0.666
## weighted kappa 0.386 0.386 0.386
##
## Number of subjects = 25
```

Note that the CI for weighted kappa is not correct! Using the `vcd` package, we can also compute Cohen's kappa and obtain the correct CI for weighted kappa.

`print(vcd::Kappa(table(c1,c2)), digits=3, CI=TRUE)`

```
## value ASE z Pr(>|z|) lower upper
## Unweighted 0.425 0.123 3.46 0.000532 0.185 0.666
## Weighted 0.386 0.128 3.02 0.002558 0.135 0.637
```

```
# note: the (default) weighting scheme used for computing weighted kappa by
# the function is the one described in the chapter
```
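The interval reported for weighted kappa is a standard Wald-type CI; using the rounded estimate and asymptotic standard error (ASE) shown in the output above, we can reconstruct it (up to rounding):

```
# Wald-type 95% CI: estimate +/- z_{0.975} * ASE
est <- 0.386
ase <- 0.128
round(est + c(-1, 1) * qnorm(0.975) * ase, 3)  # 0.135 0.637
```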

```
# Krippendorff's alpha for coders 1 and 2 when treating the data as ratings on
# a nominal, on an ordinal, or on a ratio scale
irr::kripp.alpha(t(c1c2))
```

```
## Krippendorff's alpha
##
## Subjects = 25
## Raters = 2
## alpha = 0.403
```

`irr::kripp.alpha(t(c1c2), method="ordinal")`

```
## Krippendorff's alpha
##
## Subjects = 25
## Raters = 2
## alpha = 0.278
```

`irr::kripp.alpha(t(c1c2), method="ratio")`

```
## Krippendorff's alpha
##
## Subjects = 25
## Raters = 2
## alpha = 0.311
```

```
# correlation between coders 1 and 2
cor(c1, c2)
```

`## [1] 0.4520228`

`# note: the cor() function is part of the 'stats' package, which comes with R`

```
# mean correlation between all pairs of coders
irr::meancor(all3)
```

```
## Mean of bivariate correlations R
##
## Subjects = 25
## Raters = 3
## R = 0.331
##
## z = 1.55
## p-value = 0.121
```
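By default, `irr::meancor()` averages the pairwise correlations on the Fisher r-to-z scale and back-transforms the mean. Assuming the ratings from Table 10.1 are still available in `c1`, `c2`, and `c3` as above, this can be replicated directly:

```
# Fisher z-average of the three pairwise correlations
rs <- c(cor(c1, c2), cor(c1, c3), cor(c2, c3))
tanh(mean(atanh(rs)))  # 0.331, as obtained above
```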

```
# intraclass correlation coefficient for coders 1 and 2
psych::ICC(c1c2)
```

```
## Call: psych::ICC(x = c1c2)
##
## Intraclass correlation coefficients
## type ICC F df1 df2 p lower bound upper bound
## Single_raters_absolute ICC1 0.28 1.8 24 25 0.082 -0.052 0.55
## Single_random_raters ICC2 0.35 2.6 24 24 0.010 0.034 0.61
## Single_fixed_raters ICC3 0.45 2.6 24 24 0.010 0.142 0.68
## Average_raters_absolute ICC1k 0.43 1.8 24 25 0.082 -0.110 0.71
## Average_random_raters ICC2k 0.52 2.6 24 24 0.010 0.066 0.76
## Average_fixed_raters ICC3k 0.62 2.6 24 24 0.010 0.248 0.81
##
## Number of subjects = 25 Number of Judges = 2
```

```
# note: this function computes 6 different types of ICCs; the first three are
# discussed in the chapter and correspond to the three different designs
# described on page 187
```

Using the `lmer()` function from the `lme4` package, we can also do these calculations manually.

```
# restructure data into 'long' format
dat <- data.frame(study = 1:25,
                  rater = rep(1:2, each=25),
                  rating = c(c1,c2))
# absolute agreement based on one-way random-effects model
res <- lmer(rating ~ (1 | study), data = dat)
vcs <- data.frame(VarCorr(res))
vcs$vcov[1] / (vcs$vcov[1] + vcs$vcov[2])
```

`## [1] 0.2777018`

```
# absolute agreement based on two-way random-effects model
res <- lmer(rating ~ (1 | study) + (1 | rater), data = dat)
vcs <- data.frame(VarCorr(res))
vcs$vcov[1] / (vcs$vcov[1] + vcs$vcov[2] + vcs$vcov[3])
```

`## [1] 0.3545258`

```
# absolute agreement based on two-way mixed-effects model
res <- lmer(rating ~ rater + (1 | study), data = dat)
vcs <- data.frame(VarCorr(res))
vcs$vcov[1] / (vcs$vcov[1] + vcs$vcov[2])
```

`## [1] 0.4503106`

```
# example data from page 199
dat <- data.frame(
   study = 1:25,
   rater = rep(1:3, each=25),
   rating = c(3,3,2,3,NA,3,2,1,2,2,NA,3,3,2,1,1,3,2,2,3,2,1,3,NA,2,
              2,1,NA,2,1,1,2,1,2,1,2,3,1,1,NA,1,3,2,2,1,1,1,2,3,2,
              3,1,2,3,1,3,1,1,NA,3,2,3,2,1,1,2,1,2,2,1,2,3,2,3,3))
dat[c(1:4, 71:75),]
dat[c(1:4, 71:75),]
```

```
## study rater rating
## 1 1 1 3
## 2 2 1 3
## 3 3 1 2
## 4 4 1 3
## 71 21 3 2
## 72 22 3 3
## 73 23 3 2
## 74 24 3 3
## 75 25 3 3
```

```
# absolute agreement for all three raters (based on one-way random-effects model)
res <- lmer(rating ~ (1 | study), data = dat)
vcs <- data.frame(VarCorr(res))
vcs$vcov[1] / (vcs$vcov[1] + vcs$vcov[2])
```

`## [1] 0.1794355`

```
# data for Figure 11.1
dat <- read.table(header=TRUE, text = "
study md n var se pval
A 0.400 60 0.067 0.258 0.121
B 0.200 600 0.007 0.082 0.014
C 0.300 100 0.040 0.201 0.134
D 0.400 200 0.020 0.141 0.005
E 0.300 400 0.010 0.100 0.003
F -0.200 200 0.020 0.141 0.157")
dat
```

```
## study md n var se pval
## 1 A 0.4 60 0.067 0.258 0.121
## 2 B 0.2 600 0.007 0.082 0.014
## 3 C 0.3 100 0.040 0.201 0.134
## 4 D 0.4 200 0.020 0.141 0.005
## 5 E 0.3 400 0.010 0.100 0.003
## 6 F -0.2 200 0.020 0.141 0.157
```

```
# Figure 11.1
res <- rma(md, var, data=dat, method="FE", slab=study)
tmp <- dat[-1]
tmp$se <- fc(tmp$se, 3)
tmp$var <- fc(tmp$var, 3)
size <- sqrt(weights(res))
size <- 2.5 * size / max(size)
par(mar=c(4,4,2,2))
forest(res, xlim=c(-6.5,1.4), psize=size, header=TRUE, mlab="Combined",
       efac=c(0,1,2), annotate=FALSE, xlab="Standardized Mean Difference",
       ilab=tmp, ilab.xpos=c(-5.0, -4.1, -3.2, -2.3, -1.4))
text(-5.0, 8, "Mean\nDifference", font=2)
text(-4.1, 8, "Sample\nSize", font=2)
text(-3.2, 8, "Variance", font=2)
text(-2.3, 8, "Standard\nError", font=2)
text(-1.4, 8, "p-Value", font=2)
```