vcalc.Rd
Function to construct or approximate the variance-covariance matrix of dependent effect sizes or outcomes, or more precisely, of their sampling errors (i.e., the V
matrix in rma.mv
).
vcalc(vi, cluster, subgroup, obs, type, time1, time2, grp1, grp2, w1, w2,
data, rho, phi, rvars, checkpd=TRUE, nearpd=FALSE, sparse=FALSE, ...)
numeric vector to specify the sampling variances of the observed effect sizes or outcomes.
vector to specify the clustering variable (e.g., study).
optional vector to specify different (independent) subgroups within clusters.
optional vector to distinguish different observed effect sizes or outcomes corresponding to the same construct or response/dependent variable.
optional vector to distinguish different types of constructs or response/dependent variables underlying the observed effect sizes or outcomes.
optional numeric vector to specify the time points when the observed effect sizes or outcomes were obtained (in the first condition if the observed effect sizes or outcomes represent contrasts between two conditions).
optional numeric vector to specify the time points when the observed effect sizes or outcomes were obtained in the second condition (only relevant when the observed effect sizes or outcomes represent contrasts between two conditions).
optional vector to specify the group of the first condition when the observed effect sizes or outcomes represent contrasts between two conditions.
optional vector to specify the group of the second condition when the observed effect sizes or outcomes represent contrasts between two conditions.
optional numeric vector to specify the size of the group (or more generally, the inverse-sampling variance weight) of the first condition when the observed effect sizes or outcomes represent contrasts between two conditions.
optional numeric vector to specify the size of the group (or more generally, the inverse-sampling variance weight) of the second condition when the observed effect sizes or outcomes represent contrasts between two conditions.
optional data frame containing the variables given to the arguments above.
argument to specify the correlation(s) of observed effect sizes or outcomes measured concurrently. See ‘Details’.
argument to specify the autocorrelation of observed effect sizes or outcomes measured at different time points. See ‘Details’.
optional argument for specifying the variables that correspond to the correlation matrices of the studies (if this is specified, all arguments above except for cluster
and subgroup
are ignored). See ‘Details’.
logical to specify whether to check that the variance-covariance matrices within clusters are positive definite (the default is TRUE
). See ‘Note’.
logical to specify whether the nearPD
function from the Matrix package should be used on variance-covariance matrices that are not positive definite. See ‘Note’.
logical to specify whether the variance-covariance matrix should be returned as a sparse matrix (the default is FALSE
).
other arguments.
Standard meta-analytic models (such as those that can be fitted with the rma.uni
function) assume that the observed effect sizes or outcomes (or more precisely, their sampling errors) are independent. This assumption is typically violated whenever multiple observed effect sizes or outcomes are computed based on the same sample of subjects (or whatever the study units are) or if there is at least partial overlap of subjects that contribute information to the computation of multiple effect sizes or outcomes.
The present function can be used to construct or approximate the variance-covariance matrix of the sampling errors of dependent effect sizes or outcomes for a wide variety of circumstances (this variance-covariance matrix is the so-called V
matrix that may be needed as input for multilevel/multivariate meta-analytic models as can be fitted with the rma.mv
function; see also here for some recommendations on a general workflow for meta-analyses involving complex dependency structures).
Argument cluster
is used to specify the clustering variable. Rows with the same value of this variable are allowed to be dependent, while rows with different values are assumed to be independent. Typically, cluster
will be a study identifier.
Within the same cluster, there may be different subgroups with no overlap of subjects across subgroups. Argument subgroup
can be used to distinguish such subgroups. Rows with the same value of this variable within a cluster are allowed to be dependent, while rows with different values are assumed to be independent even if they come from the same cluster. Therefore, from hereon, ‘cluster’ really refers to the combination of cluster
and subgroup
.
Multiple effect sizes or outcomes belonging to the same cluster may be dependent due to a variety of reasons:
The same construct of interest (e.g., severity of depression) may have been measured using different scales or instruments within a study (e.g., using the Beck Depression Inventory (BDI) and the Hamilton Depression Rating Scale (HDRS)) based on which multiple effect sizes can be computed for the same group of subjects (e.g., contrasting a treatment versus a control group with respect to each scale). In this case, we have multiple effect sizes that are different ‘observations’ of the effect with respect to the same type of construct.
Argument obs
is then used to distinguish different effect sizes corresponding to the same construct. If obs
is specified, then argument rho
must also be used to specify the degree of correlation among the sampling errors of the different effect sizes. Since this correlation is typically not known, the correlation among the various scales (or a rough ‘guestimate’ thereof) can be used as a proxy (i.e., the (typical) correlation between BDI and HDRS measurements).
One can also pass an entire correlation matrix via rho
to specify, for each possible pair of obs
values, the corresponding correlation. The row/column names of the matrix must then correspond to the unique values of the obs
variable.
Multiple types of constructs (or more generally, types of response/dependent variables) may have been measured in the same group of subjects (e.g., severity of depression as measured with the Beck Depression Inventory (BDI) and severity of anxiety as measured with the State-Trait Anxiety Inventory (STAI)). If this is of interest for a meta-analysis, effect sizes can then be computed with respect to each ‘type’ of construct.
Argument type
is then used to distinguish effect sizes corresponding to these different types of constructs. If type
is specified, then argument rho
must also be used to specify the degree of correlation among the sampling errors of effect sizes belonging to these different types. As above, the correlation among the various scales is typically used here as a proxy (i.e., the (typical) correlation between BDI and STAI measurements).
One can also pass an entire correlation matrix via rho
to specify, for each possible pair of type
values, the corresponding correlation. The row/column names of the matrix must then correspond to the unique values of the type
variable.
If there are multiple types of constructs, multiple scales or instruments may also have been used (in at least some of the studies) to measure the same construct and hence there may again be multiple effect sizes that are ‘observations’ of the same type of construct. Arguments type
and obs
should then be used together to specify the various construct types and observations thereof. In this case, argument rho
must be a vector of two values, the first to specify the within-construct correlation and the second to specify the between-construct correlation.
One can also specify a list with two elements for rho
, the first element being either a scalar or an entire correlation matrix for the within-construct correlation(s) and the second element being a scalar or an entire correlation matrix for the between-construct correlation(s). As above, any matrices specified must have row/column names corresponding to the unique values of the obs
and/or type
variables.
The same construct and scale may have been assessed/used multiple times, allowing the computation of multiple effect sizes for the same group of subjects at different time points (e.g., right after the end of a treatment, at a short-term follow-up, and at a long-term follow-up). Argument time1
is then used to specify the time points when the observed effect sizes were obtained. Argument phi
must then also be used to specify the autocorrelation among the sampling errors of two effect sizes that differ by one unit on the time1
variable. As above, the autocorrelation of the measurements themselves can be used here as a proxy.
If multiple constructs and/or multiple scales have also been assessed at the various time points, then arguments type
and/or obs
(together with argument rho
) should be used as needed to differentiate effect sizes corresponding to the different constructs and/or scales.
Many effect sizes or outcome measures (e.g., raw or standardized mean differences, log-transformed ratios of means, log risk/odds ratios and risk differences) reflect the difference between two conditions (i.e., a contrast). Within a study, there may be more than two conditions, allowing the computation of multiple such contrasts (e.g., treatment A versus a control condition and treatment B versus the same control condition) and hence corresponding effect sizes. The reuse of information from the ‘shared’ condition (in this example, the control condition) then induces correlation among the effect sizes.
To account for this, arguments grp1
and grp2
should be used to specify (within each cluster) which two groups were compared in the computation of each effect size (e.g., in the example above, the coding could be grp1=c(2,3)
and grp2=c(1,1)
; whether numbers or strings are used as identifiers is irrelevant).
The degree of correlation between two contrast-type effect sizes that is induced by the use of a shared condition is a function of the size of the groups involved in the computation of the two effect sizes (or, more generally, the inverse-sampling variance weights of the condition-specific outcomes). By default, the group sizes (weights) are assumed to be identical across conditions, which implies a correlation of 0.5. If the group sizes (weights) are known, they can be specified via arguments w1
and w2
(in which case this information is used by the function to calculate a more accurate estimate of the correlation induced by the shared condition).
Moreover, a contrast-type effect size can be based on a between- or a within-subjects design. When at least one or more of the contrast-type effect sizes are based on a within-subjects design, then time1
and time2
should be used in combination with grp1
and grp2
to specify for each effect size the group(s) and time point(s) involved.
For example, grp1=c(2,3)
and grp2=c(1,1)
as above in combination with time1=c(1,1)
and time2=c(1,1)
would imply a between-subjects design involving three groups where two effect sizes were computed contrasting groups 2 and 3 versus group 1 at a single time point. On the other hand, grp1=c(1,1)
and grp2=c(1,1)
in combination with time1=c(2,3)
and time2=c(1,1)
would imply a within-subjects design where two effect sizes were computed contrasting time points 2 and 3 versus time point 1 in a single group. Argument phi
is then used as above to specify the autocorrelation of the measurements within groups (i.e., for the within-subjects design above, it would be the autocorrelation between time points 2 and 1 or equivalently, between time points 3 and 2).
All of the arguments above can be specified together to account for a fairly wide variety of dependency types.
rvars
ArgumentThe function also provides an alternative approach for constructing the variance-covariance matrix using the rvars
argument. Here, one must specify the names of the variables in the dataset that correspond to the correlation matrices of the studies (the variables should be specified as a vector; e.g., c(var1, var2, var3)
).
In particular, let \(k_i\) denote the number of rows corresponding to the \(i\textrm{th}\) cluster. Then the values of the first \(k_i\) variables from rvars
are used to construct the correlation matrix and, together with the sampling variances (specified via vi
), the variance-covariance matrix. Say there are three studies, the first with two correlated estimates, the second with a single estimate, and the third with four correlated estimates. Then the data structure should look like this:
study yi vi r1 r2 r3 r4
=============================
1 . . 1 NA NA NA
1 . . .6 1 NA NA
-----------------------------
2 . . 1 NA NA NA
-----------------------------
3 . . 1 NA NA NA
3 . . .8 1 NA NA
3 . . .5 .5 1 NA
3 . . .5 .5 .8 1
=============================
with rvars = c(r1, r2, r3, r4)
. If the rvars
variables are a consecutive set in the data frame (as above), then one can use the shorthand notation rvars = c(r1:r4)
, so r1
denotes the first and r4
the last variable in the set. Note that only the lower triangular part of the submatrices defined by the rvars
variables is used.
There must be as many variables specified via rvars
as the number of rows in the largest cluster (in smaller clusters, the non-relevant variables can just be set to NA
; see above).
A \(k \times k\) variance-covariance matrix (given as a sparse matrix when sparse=TRUE
), where \(k\) denotes the length of the vi
variable (i.e., the number of rows in the dataset).
Depending on the data structure, the specified variables, and the specified values for rho
and/or phi
, it is possible that the constructed variance-covariance matrix is not positive definite within one or more clusters (this is checked when checkpd=TRUE
, which is the default). If such non-positive definite submatrices are found, the reasons for this should be carefully checked since this might indicate misapplication of the function and/or the specification of implausible values for rho
and/or phi
.
When setting nearpd=TRUE
, the nearPD
function from the Matrix package is used on variance-covariance submatrices that are not positive definite. This should only be used cautiously and after understanding why these matrices are not positive definite.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
escalc
for a function to compute the observed effect sizes or outcomes (and corresponding sampling variances) for which a variance-covariance matrix could be constructed.
rcalc
for a function to construct the variance-covariance matrix of dependent correlation coefficients.
rma.mv
for a model fitting function that can be used to meta-analyze dependent effect sizes or outcomes.
############################################################################
### see help(dat.assink2016) for further details on this dataset
dat <- dat.assink2016
head(dat, 9)
#>
#> study esid id yi vi pubstatus year deltype
#> 1 1 1 1 0.9066 0.0740 1 4.5 general
#> 2 1 2 2 0.4295 0.0398 1 4.5 general
#> 3 1 3 3 0.2679 0.0481 1 4.5 general
#> 4 1 4 4 0.2078 0.0239 1 4.5 general
#> 5 1 5 5 0.0526 0.0331 1 4.5 general
#> 6 1 6 6 -0.0507 0.0886 1 4.5 general
#> 7 2 1 7 0.5117 0.0115 1 1.5 general
#> 8 2 2 8 0.4738 0.0076 1 1.5 general
#> 9 2 3 9 0.3544 0.0065 1 1.5 general
#>
### assume that the effect sizes within studies are correlated with rho=0.6
V <- vcalc(vi, cluster=study, obs=esid, data=dat, rho=0.6)
### show part of V matrix for studies 1 and 2
round(V[dat$study %in% c(1,2), dat$study %in% c(1,2)], 4)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> [1,] 0.0740 0.0326 0.0358 0.0252 0.0297 0.0486 0.0000 0.0000 0.0000
#> [2,] 0.0326 0.0398 0.0263 0.0185 0.0218 0.0356 0.0000 0.0000 0.0000
#> [3,] 0.0358 0.0263 0.0481 0.0203 0.0239 0.0392 0.0000 0.0000 0.0000
#> [4,] 0.0252 0.0185 0.0203 0.0239 0.0169 0.0276 0.0000 0.0000 0.0000
#> [5,] 0.0297 0.0218 0.0239 0.0169 0.0331 0.0325 0.0000 0.0000 0.0000
#> [6,] 0.0486 0.0356 0.0392 0.0276 0.0325 0.0886 0.0000 0.0000 0.0000
#> [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0115 0.0056 0.0052
#> [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0056 0.0076 0.0042
#> [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0052 0.0042 0.0065
### or show as list of matrices
blsplit(V, dat$study, round, 4)[1:2]
#> $`1`
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.0740 0.0326 0.0358 0.0252 0.0297 0.0486
#> [2,] 0.0326 0.0398 0.0263 0.0185 0.0218 0.0356
#> [3,] 0.0358 0.0263 0.0481 0.0203 0.0239 0.0392
#> [4,] 0.0252 0.0185 0.0203 0.0239 0.0169 0.0276
#> [5,] 0.0297 0.0218 0.0239 0.0169 0.0331 0.0325
#> [6,] 0.0486 0.0356 0.0392 0.0276 0.0325 0.0886
#>
#> $`2`
#> [,1] [,2] [,3]
#> [1,] 0.0115 0.0056 0.0052
#> [2,] 0.0056 0.0076 0.0042
#> [3,] 0.0052 0.0042 0.0065
#>
### use a correlation of 0.7 for effect sizes corresponding to the same type of
### delinquent behavior and a correlation of 0.5 for effect sizes corresponding
### to different types of delinquent behavior
V <- vcalc(vi, cluster=study, type=deltype, obs=esid, data=dat, rho=c(0.7, 0.5))
blsplit(V, dat$study, round, 3)[16]
#> $`16`
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#> [1,] 0.091 0.045 0.027 0.044 0.030 0.039 0.076 0.028 0.034 0.030 0.039 0.043 0.039 0.067 0.028 0.032
#> [2,] 0.045 0.087 0.027 0.061 0.030 0.039 0.053 0.027 0.047 0.041 0.053 0.059 0.053 0.046 0.038 0.043
#> [3,] 0.027 0.027 0.033 0.027 0.025 0.033 0.033 0.023 0.021 0.018 0.023 0.026 0.023 0.029 0.017 0.019
#> [4,] 0.044 0.061 0.027 0.086 0.029 0.038 0.053 0.027 0.047 0.041 0.053 0.058 0.053 0.046 0.038 0.043
#> [5,] 0.030 0.030 0.025 0.029 0.040 0.037 0.036 0.026 0.023 0.020 0.026 0.028 0.026 0.031 0.018 0.021
#> [6,] 0.039 0.039 0.033 0.038 0.037 0.068 0.047 0.033 0.030 0.026 0.034 0.037 0.034 0.041 0.024 0.027
#> [7,] 0.076 0.053 0.033 0.053 0.036 0.047 0.129 0.033 0.041 0.035 0.046 0.051 0.046 0.079 0.033 0.037
#> [8,] 0.028 0.027 0.023 0.027 0.026 0.033 0.033 0.033 0.021 0.018 0.023 0.026 0.024 0.029 0.017 0.019
#> [9,] 0.034 0.047 0.021 0.047 0.023 0.030 0.041 0.021 0.052 0.031 0.041 0.045 0.041 0.036 0.029 0.033
#> [10,] 0.030 0.041 0.018 0.041 0.020 0.026 0.035 0.018 0.031 0.039 0.036 0.039 0.036 0.031 0.025 0.029
#> [11,] 0.039 0.053 0.023 0.053 0.026 0.034 0.046 0.023 0.041 0.036 0.066 0.051 0.047 0.040 0.033 0.038
#> [12,] 0.043 0.059 0.026 0.058 0.028 0.037 0.051 0.026 0.045 0.039 0.051 0.081 0.051 0.045 0.037 0.042
#> [13,] 0.039 0.053 0.023 0.053 0.026 0.034 0.046 0.024 0.041 0.036 0.047 0.051 0.067 0.041 0.033 0.038
#> [14,] 0.067 0.046 0.029 0.046 0.031 0.041 0.079 0.029 0.036 0.031 0.040 0.045 0.041 0.099 0.029 0.033
#> [15,] 0.028 0.038 0.017 0.038 0.018 0.024 0.033 0.017 0.029 0.025 0.033 0.037 0.033 0.029 0.034 0.027
#> [16,] 0.032 0.043 0.019 0.043 0.021 0.027 0.037 0.019 0.033 0.029 0.038 0.042 0.038 0.033 0.027 0.044
#>
### examine the correlation matrix for study 16
blsplit(V, dat$study, cov2cor)[16]
#> $`16`
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#> [1,] 1.0 0.5 0.5 0.5 0.5 0.5 0.7 0.5 0.5 0.5 0.5 0.5 0.5 0.7 0.5 0.5
#> [2,] 0.5 1.0 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 0.7 0.7 0.7 0.5 0.7 0.7
#> [3,] 0.5 0.5 1.0 0.5 0.7 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
#> [4,] 0.5 0.7 0.5 1.0 0.5 0.5 0.5 0.5 0.7 0.7 0.7 0.7 0.7 0.5 0.7 0.7
#> [5,] 0.5 0.5 0.7 0.5 1.0 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
#> [6,] 0.5 0.5 0.7 0.5 0.7 1.0 0.5 0.7 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
#> [7,] 0.7 0.5 0.5 0.5 0.5 0.5 1.0 0.5 0.5 0.5 0.5 0.5 0.5 0.7 0.5 0.5
#> [8,] 0.5 0.5 0.7 0.5 0.7 0.7 0.5 1.0 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
#> [9,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 1.0 0.7 0.7 0.7 0.7 0.5 0.7 0.7
#> [10,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 1.0 0.7 0.7 0.7 0.5 0.7 0.7
#> [11,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 1.0 0.7 0.7 0.5 0.7 0.7
#> [12,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 0.7 1.0 0.7 0.5 0.7 0.7
#> [13,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 0.7 0.7 1.0 0.5 0.7 0.7
#> [14,] 0.7 0.5 0.5 0.5 0.5 0.5 0.7 0.5 0.5 0.5 0.5 0.5 0.5 1.0 0.5 0.5
#> [15,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 0.7 0.7 0.7 0.5 1.0 0.7
#> [16,] 0.5 0.7 0.5 0.7 0.5 0.5 0.5 0.5 0.7 0.7 0.7 0.7 0.7 0.5 0.7 1.0
#>
############################################################################
### see help(dat.ishak2007) for further details on this dataset
dat <- dat.ishak2007
head(dat, 5)
#>
#> study y1i v1i y2i v2i y3i v3i y4i v4i mdur mbase
#> 1 Alegret (2001) -33.4 14.3 NA NA NA NA NA NA 16.1 53.6
#> 2 Barichella (2003) -20.0 7.3 NA NA -30.0 5.7 NA NA 13.5 45.3
#> 3 Berney (2002) -21.1 7.3 NA NA NA NA NA NA 13.6 45.6
#> 4 Burchiel (1999) -20.0 8.0 -20.0 8.0 -18.0 5.0 NA NA 13.6 48.0
#> 5 Chen (2003) NA NA -32.9 125.0 NA NA NA NA 12.1 65.7
#>
### create long format dataset
dat <- reshape(dat, direction="long", idvar="study", v.names=c("yi","vi"),
varying=list(c(2,4,6,8), c(3,5,7,9)))
dat <- dat[order(study, time),]
### remove missing measurement occasions from dat
dat <- dat[!is.na(yi),]
rownames(dat) <- NULL
### show the data for the first 5 studies
head(dat, 8)
#>
#> study mdur mbase time yi vi
#> 1 Alegret (2001) 16.1 53.6 1 -33.4 14.3
#> 2 Barichella (2003) 13.5 45.3 1 -20.0 7.3
#> 3 Barichella (2003) 13.5 45.3 3 -30.0 5.7
#> 4 Berney (2002) 13.6 45.6 1 -21.1 7.3
#> 5 Burchiel (1999) 13.6 48.0 1 -20.0 8.0
#> 6 Burchiel (1999) 13.6 48.0 2 -20.0 8.0
#> 7 Burchiel (1999) 13.6 48.0 3 -18.0 5.0
#> 8 Chen (2003) 12.1 65.7 2 -32.9 125.0
#>
### construct the full (block diagonal) V matrix with an AR(1) structure
### assuming an autocorrelation of 0.97 as estimated by Ishak et al. (2007)
V <- vcalc(vi, cluster=study, time1=time, phi=0.97, data=dat)
V[1:8, 1:8]
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 14.3 0.000000 0.000000 0.0 0.000000 0.000000 0.000000 0
#> [2,] 0.0 7.300000 6.069352 0.0 0.000000 0.000000 0.000000 0
#> [3,] 0.0 6.069352 5.700000 0.0 0.000000 0.000000 0.000000 0
#> [4,] 0.0 0.000000 0.000000 7.3 0.000000 0.000000 0.000000 0
#> [5,] 0.0 0.000000 0.000000 0.0 8.000000 7.760000 5.950774 0
#> [6,] 0.0 0.000000 0.000000 0.0 7.760000 8.000000 6.134819 0
#> [7,] 0.0 0.000000 0.000000 0.0 5.950774 6.134819 5.000000 0
#> [8,] 0.0 0.000000 0.000000 0.0 0.000000 0.000000 0.000000 125
cov2cor(V[1:8, 1:8])
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1 0.0000 0.0000 0 0.0000 0.00 0.0000 0
#> [2,] 0 1.0000 0.9409 0 0.0000 0.00 0.0000 0
#> [3,] 0 0.9409 1.0000 0 0.0000 0.00 0.0000 0
#> [4,] 0 0.0000 0.0000 1 0.0000 0.00 0.0000 0
#> [5,] 0 0.0000 0.0000 0 1.0000 0.97 0.9409 0
#> [6,] 0 0.0000 0.0000 0 0.9700 1.00 0.9700 0
#> [7,] 0 0.0000 0.0000 0 0.9409 0.97 1.0000 0
#> [8,] 0 0.0000 0.0000 0 0.0000 0.00 0.0000 1
### or show as a list of matrices
blsplit(V, dat$study)[1:5]
#> $`Alegret (2001)`
#> [,1]
#> [1,] 14.3
#>
#> $`Barichella (2003)`
#> [,1] [,2]
#> [1,] 7.300000 6.069352
#> [2,] 6.069352 5.700000
#>
#> $`Berney (2002)`
#> [,1]
#> [1,] 7.3
#>
#> $`Burchiel (1999)`
#> [,1] [,2] [,3]
#> [1,] 8.000000 7.760000 5.950774
#> [2,] 7.760000 8.000000 6.134819
#> [3,] 5.950774 6.134819 5.000000
#>
#> $`Chen (2003)`
#> [,1]
#> [1,] 125
#>
blsplit(V, dat$study, cov2cor)[1:5]
#> $`Alegret (2001)`
#> [,1]
#> [1,] 1
#>
#> $`Barichella (2003)`
#> [,1] [,2]
#> [1,] 1.0000 0.9409
#> [2,] 0.9409 1.0000
#>
#> $`Berney (2002)`
#> [,1]
#> [1,] 1
#>
#> $`Burchiel (1999)`
#> [,1] [,2] [,3]
#> [1,] 1.0000 0.97 0.9409
#> [2,] 0.9700 1.00 0.9700
#> [3,] 0.9409 0.97 1.0000
#>
#> $`Chen (2003)`
#> [,1]
#> [1,] 1
#>
############################################################################
### see help(dat.kalaian1996) for further details on this dataset
dat <- dat.kalaian1996
head(dat, 12)
#> id study year n1i n2i outcome yi vi hrs ets homework type
#> 1 1 Alderman & Powers (A) 1980 28 22 verbal 0.22 0.0817 7.0 1 1 1
#> 2 2 Alderman & Powers (B) 1980 39 40 verbal 0.09 0.0507 10.0 1 1 1
#> 3 3 Alderman & Powers (C) 1980 22 17 verbal 0.14 0.1045 10.5 1 1 1
#> 4 4 Alderman & Powers (D) 1980 48 43 verbal 0.14 0.0442 10.0 1 1 1
#> 5 5 Alderman & Powers (E) 1980 25 74 verbal -0.01 0.0535 6.0 1 1 1
#> 6 6 Alderman & Powers (F) 1980 37 35 verbal 0.14 0.0557 5.0 1 1 1
#> 7 7 Alderman & Powers (G) 1980 24 70 verbal 0.18 0.0561 11.0 1 1 1
#> 8 8 Alderman & Powers (H) 1980 16 19 verbal 0.01 0.1151 45.0 1 1 1
#> 9 9 Evans & Pike (A) 1973 145 129 verbal 0.13 0.0147 21.0 1 1 1
#> 10 10 Evans & Pike (A) 1973 145 129 math 0.12 0.0147 21.0 1 1 1
#> 11 11 Evans & Pike (B) 1973 72 129 verbal 0.25 0.0218 21.0 1 1 1
#> 12 12 Evans & Pike (B) 1973 72 129 math 0.06 0.0216 21.0 1 1 1
### construct the variance-covariance matrix assuming rho = 0.66 for effect sizes
### corresponding to the 'verbal' and 'math' outcome types
V <- vcalc(vi, cluster=study, type=outcome, data=dat, rho=0.66)
round(V[1:12,1:12], 4)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 0.0817 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [2,] 0.0000 0.0507 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [3,] 0.0000 0.0000 0.1045 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [4,] 0.0000 0.0000 0.0000 0.0442 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [5,] 0.0000 0.0000 0.0000 0.0000 0.0535 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0557 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0561 0.0000 0.0000 0.0000 0.0000 0.0000
#> [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1151 0.0000 0.0000 0.0000 0.0000
#> [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0147 0.0097 0.0000 0.0000
#> [10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0097 0.0147 0.0000 0.0000
#> [11,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0218 0.0143
#> [12,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0143 0.0216
############################################################################
### see help(dat.berkey1998) for further details on this dataset
dat <- dat.berkey1998
### variables v1i and v2i correspond to the 2x2 var-cov matrices of the studies;
### so use these variables to construct the V matrix (note: since v1i and v2i are
### var-cov matrices and not correlation matrices, set vi=1 for all rows)
V <- vcalc(vi=1, cluster=author, rvars=c(v1i, v2i), data=dat)
V
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 0.0075 0.0030 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [2,] 0.0030 0.0077 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [3,] 0.0000 0.0000 0.0057 9e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [4,] 0.0000 0.0000 0.0009 8e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [5,] 0.0000 0.0000 0.0000 0e+00 0.0021 0.0007 0.0000 0.0000 0.0000 0.0000
#> [6,] 0.0000 0.0000 0.0000 0e+00 0.0007 0.0014 0.0000 0.0000 0.0000 0.0000
#> [7,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0029 0.0009 0.0000 0.0000
#> [8,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0009 0.0015 0.0000 0.0000
#> [9,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0148 0.0072
#> [10,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0072 0.0304
round(cov2cor(V), 2)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1.00 0.39 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#> [2,] 0.39 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#> [3,] 0.00 0.00 1.00 0.42 0.00 0.00 0.00 0.00 0.00 0.00
#> [4,] 0.00 0.00 0.42 1.00 0.00 0.00 0.00 0.00 0.00 0.00
#> [5,] 0.00 0.00 0.00 0.00 1.00 0.41 0.00 0.00 0.00 0.00
#> [6,] 0.00 0.00 0.00 0.00 0.41 1.00 0.00 0.00 0.00 0.00
#> [7,] 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.43 0.00 0.00
#> [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.43 1.00 0.00 0.00
#> [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.34
#> [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.34 1.00
### or show as a list of matrices
blsplit(V, dat$author, function(x) round(cov2cor(x), 2))
#> $`Pihlstrom et al.`
#> [,1] [,2]
#> [1,] 1.00 0.39
#> [2,] 0.39 1.00
#>
#> $`Lindhe et al.`
#> [,1] [,2]
#> [1,] 1.00 0.42
#> [2,] 0.42 1.00
#>
#> $`Knowles et al.`
#> [,1] [,2]
#> [1,] 1.00 0.41
#> [2,] 0.41 1.00
#>
#> $`Ramfjord et al.`
#> [,1] [,2]
#> [1,] 1.00 0.43
#> [2,] 0.43 1.00
#>
#> $`Becker et al.`
#> [,1] [,2]
#> [1,] 1.00 0.34
#> [2,] 0.34 1.00
#>
### construct the variance-covariance matrix assuming rho = 0.4 for effect sizes
### corresponding to the 'PD' and 'AL' outcome types
V <- vcalc(vi=vi, cluster=trial, type=outcome, data=dat, rho=0.4)
round(V,4)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 0.0075 0.0030 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [2,] 0.0030 0.0077 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [3,] 0.0000 0.0000 0.0057 9e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [4,] 0.0000 0.0000 0.0009 8e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#> [5,] 0.0000 0.0000 0.0000 0e+00 0.0021 0.0007 0.0000 0.0000 0.0000 0.0000
#> [6,] 0.0000 0.0000 0.0000 0e+00 0.0007 0.0014 0.0000 0.0000 0.0000 0.0000
#> [7,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0029 0.0008 0.0000 0.0000
#> [8,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0008 0.0015 0.0000 0.0000
#> [9,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0148 0.0085
#> [10,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0085 0.0304
cov2cor(V)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1.0 0.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
#> [2,] 0.4 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
#> [3,] 0.0 0.0 1.0 0.4 0.0 0.0 0.0 0.0 0.0 0.0
#> [4,] 0.0 0.0 0.4 1.0 0.0 0.0 0.0 0.0 0.0 0.0
#> [5,] 0.0 0.0 0.0 0.0 1.0 0.4 0.0 0.0 0.0 0.0
#> [6,] 0.0 0.0 0.0 0.0 0.4 1.0 0.0 0.0 0.0 0.0
#> [7,] 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.4 0.0 0.0
#> [8,] 0.0 0.0 0.0 0.0 0.0 0.0 0.4 1.0 0.0 0.0
#> [9,] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.4
#> [10,] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 1.0
############################################################################
### see help(dat.knapp2017) for further details on this dataset
dat <- dat.knapp2017
dat[-c(1:2)]
#> study task difficulty group1 group2 comp yi vi n_sz n_hc yi_iq vi_iq
#> 1 1 SOC (CANTAB) 3.83 SZ HC 1 0.62 0.075 24 33 0.72 0.076
#> 2 2 SOC (CANTAB) 3.83 SZ1 HC 1 0.46 0.047 44 44 NA NA
#> 3 2 SOC (CANTAB) 3.83 SZ2 HC 2 0.73 0.052 38 44 NA NA
#> 4 3 SOC (CANTAB) 3.83 SZ1 HC 1 1.18 0.062 39 37 NA NA
#> 5 3 SOC (CANTAB) 3.83 SZ2 HC 2 1.09 0.073 27 37 NA NA
#> 6 3 SOC (CANTAB) 3.83 SZ3 HC 3 0.91 0.102 15 37 NA NA
#> 7 3 SOC (CANTAB) 3.83 SZ4 HC 4 0.91 0.084 20 37 NA NA
#> 8 4 TOH NA SZ HC 1 0.57 0.074 28 28 NA NA
#> 9 5 TOL NA SZ HC 1 0.40 0.180 13 10 NA NA
#> 10 6 SOC (CANTAB) 2.00 SZ HC 1 0.43 0.170 12 12 0.12 0.167
#> 11 6 SOC (CANTAB) 3.00 SZ HC 2 1.37 0.206 12 12 0.12 0.167
#> 12 6 SOC (CANTAB) 4.00 SZ HC 3 0.23 0.168 12 12 0.12 0.167
#> 13 6 SOC (CANTAB) 5.00 SZ HC 4 0.97 0.186 12 12 0.12 0.167
#> 14 6 SOC (one-touch) 2.00 SZ HC 5 0.56 0.173 12 12 0.12 0.167
#> 15 6 SOC (one-touch) 3.00 SZ HC 6 0.79 0.180 12 12 0.12 0.167
#> 16 6 SOC (one-touch) 4.00 SZ HC 7 1.19 0.196 12 12 0.12 0.167
#> 17 6 SOC (one-touch) 5.00 SZ HC 8 0.92 0.184 12 12 0.12 0.167
#> 18 7 SOC 3.50 SZ HC 1 0.20 0.096 22 20 0.62 0.100
#> 19 8 TOH NA SZ HC 1 1.42 0.180 13 15 NA NA
#> 20 9 TOL 4.00 SZ1 HC 1 0.35 0.074 27 28 1.21 0.086
#> 21 9 TOL 4.00 SZ2 HC 2 0.88 0.078 28 28 1.26 0.086
#> 22 10 SOC (CANTAB) 3.83 SZ HC 1 1.08 0.079 26 33 0.76 0.074
#> 23 11 SOC (CANTAB) 3.83 SZ HC 1 0.94 0.111 20 20 0.42 0.102
#> 24 12 SOC (CANTAB) 2.00 SZ HC 1 0.44 0.020 135 81 0.49 0.020
#> 25 12 SOC (CANTAB) 3.00 SZ HC 2 0.63 0.021 135 81 0.49 0.020
#> 26 12 SOC (CANTAB) 4.00 SZ HC 3 0.38 0.020 135 81 0.49 0.020
#> 27 12 SOC (CANTAB) 5.00 SZ HC 4 0.93 0.022 135 81 0.49 0.020
#> 28 13 TOL NA SZ HC 1 0.33 0.102 24 17 NA NA
#> 29 14 TOL 2.00 SZ HC 1 0.00 0.073 32 24 NA NA
#> 30 14 TOL 3.00 SZ HC 2 -0.06 0.073 32 24 NA NA
#> 31 14 TOL 4.00 SZ HC 3 0.73 0.078 32 24 NA NA
#> 32 14 TOL 5.00 SZ HC 4 0.48 0.075 32 24 NA NA
#> 33 15 TOL 2.00 SZ HC 1 0.43 0.092 25 20 NA NA
#> 34 15 TOL 3.00 SZ HC 2 0.72 0.096 25 20 NA NA
#> 35 15 TOL 4.00 SZ HC 3 0.80 0.097 25 20 NA NA
#> 36 15 TOL 5.00 SZ HC 4 1.30 0.109 25 20 NA NA
#> 37 16 TOL 4.67 SZ HC 1 1.84 0.190 15 15 NA NA
#> 38 17 TOL 3.50 SZ HC 1 2.58 0.216 17 17 0.58 0.123
#> 39 18 TOL 3.00 SZ HC 1 0.25 0.071 30 27 0.18 0.071
#> 40 18 TOL 4.00 SZ HC 2 0.86 0.077 30 27 0.18 0.071
#> 41 18 TOL 5.00 SZ HC 3 0.68 0.074 30 27 0.18 0.071
#> 42 19 SOC (CANTAB) 2.00 SZ HC 1 0.21 0.060 36 31 0.48 0.062
#> 43 19 SOC (CANTAB) 3.00 SZ HC 2 0.65 0.063 36 31 0.48 0.062
#> 44 19 SOC (CANTAB) 4.00 SZ HC 3 0.30 0.061 36 31 0.48 0.062
#> 45 19 SOC (CANTAB) 5.00 SZ HC 4 0.45 0.062 36 31 0.48 0.062
#> 46 20 BACS-J NA SZ1 HC 1 1.03 0.054 20 340 NA NA
#> 47 20 BACS-J NA SZ2 HC 2 0.92 0.104 10 340 NA NA
#> 48 21 SOC (CANTAB) 3.83 SZ HC 1 1.12 0.108 28 17 NA NA
#> 49 22 TOL 2.00 SZ HC 1 0.52 0.052 40 40 NA NA
#> 50 22 TOL 3.00 SZ HC 2 0.50 0.052 40 40 NA NA
#> 51 22 TOL 4.00 SZ HC 3 0.49 0.051 40 40 NA NA
#> 52 23 SOC 3.83 SZ HC 1 0.30 0.042 48 48 NA NA
#> 53 24 TOL-Drexel 5.50 SZ1 HC1 1 0.36 0.022 86 97 0.47 0.023
#> 54 24 TOL-Drexel 5.50 SZ2 HC2 2 0.41 0.030 75 62 0.10 0.030
#> 55 25 SOC 3.83 SZ HC 1 0.79 0.086 25 25 NA NA
#> 56 26 SOC 2.00 SZ HC 1 0.23 0.031 77 55 NA NA
#> 57 26 SOC 3.00 SZ HC 2 0.58 0.032 77 55 NA NA
#> 58 26 SOC 4.00 SZ HC 3 0.57 0.032 77 55 NA NA
#> 59 26 SOC 5.00 SZ HC 4 0.79 0.034 77 55 NA NA
#> 60 27 TOL-Drexel 5.50 SZ HC 1 0.70 0.071 30 30 NA NA
#> 61 28 SOC 3.83 SZ HC 1 1.16 0.136 20 15 0.23 0.117
#> 62 29 BACS-J NA SZ1 HC1 1 1.29 0.039 56 68 0.37 0.033
#> 63 29 BACS-J NA SZ2 HC2 2 0.63 0.039 47 62 0.43 0.038
#> 64 29 BACS-J NA SZ3 HC3 3 0.31 0.121 15 19 1.00 0.134
#> 65 30 TOL 3.83 SZ HC 1 0.11 0.070 30 27 NA NA
#> 66 31 TOL (four-disk) 5.60 SZ HC 1 0.87 0.036 60 60 NA NA
### create variable that indicates the task and difficulty combination as increasing integers
dat$task.diff <- unlist(lapply(split(dat, dat$study), function(x) {
task.int <- as.integer(factor(x$task))
diff.int <- as.integer(factor(x$difficulty))
diff.int[is.na(diff.int)] <- 1
paste0(task.int, ".", diff.int)}))
### construct correlation matrix for two tasks with four different difficulties where the
### correlation is 0.4 for different difficulties of the same task, 0.7 for the same
### difficulty of different tasks, and 0.28 for different difficulties of different tasks
R <- matrix(0.4, nrow=8, ncol=8)
R[5:8,1:4] <- R[1:4,5:8] <- 0.28
diag(R[1:4,5:8]) <- 0.7
diag(R[5:8,1:4]) <- 0.7
diag(R) <- 1
rownames(R) <- colnames(R) <- paste0(rep(1:2, each=4), ".", 1:4)
R
#> 1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4
#> 1.1 1.00 0.40 0.40 0.40 0.70 0.28 0.28 0.28
#> 1.2 0.40 1.00 0.40 0.40 0.28 0.70 0.28 0.28
#> 1.3 0.40 0.40 1.00 0.40 0.28 0.28 0.70 0.28
#> 1.4 0.40 0.40 0.40 1.00 0.28 0.28 0.28 0.70
#> 2.1 0.70 0.28 0.28 0.28 1.00 0.40 0.40 0.40
#> 2.2 0.28 0.70 0.28 0.28 0.40 1.00 0.40 0.40
#> 2.3 0.28 0.28 0.70 0.28 0.40 0.40 1.00 0.40
#> 2.4 0.28 0.28 0.28 0.70 0.40 0.40 0.40 1.00
### construct an approximate V matrix accounting for the use of shared groups and
### for correlations among tasks/difficulties as specified in the R matrix above
V <- vcalc(vi, cluster=study, grp1=group1, grp2=group2, w1=n_sz, w2=n_hc,
obs=task.diff, rho=R, data=dat)
Vs <- blsplit(V, dat$study)
cov2cor(Vs[[3]]) # study with multiple SZ groups and a single HC group
#> [,1] [,2] [,3] [,4]
#> [1,] 1.0000000 0.4652832 0.3847419 0.4243294
#> [2,] 0.4652832 1.0000000 0.3488477 0.3847419
#> [3,] 0.3847419 0.3488477 1.0000000 0.3181424
#> [4,] 0.4243294 0.3847419 0.3181424 1.0000000
cov2cor(Vs[[6]]) # study with two task types and multiple difficulties
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1.00 0.40 0.40 0.40 0.70 0.28 0.28 0.28
#> [2,] 0.40 1.00 0.40 0.40 0.28 0.70 0.28 0.28
#> [3,] 0.40 0.40 1.00 0.40 0.28 0.28 0.70 0.28
#> [4,] 0.40 0.40 0.40 1.00 0.28 0.28 0.28 0.70
#> [5,] 0.70 0.28 0.28 0.28 1.00 0.40 0.40 0.40
#> [6,] 0.28 0.70 0.28 0.28 0.40 1.00 0.40 0.40
#> [7,] 0.28 0.28 0.70 0.28 0.40 0.40 1.00 0.40
#> [8,] 0.28 0.28 0.28 0.70 0.40 0.40 0.40 1.00
cov2cor(Vs[[12]]) # study with multiple difficulties for the same task
#> [,1] [,2] [,3] [,4]
#> [1,] 1.0 0.4 0.4 0.4
#> [2,] 0.4 1.0 0.4 0.4
#> [3,] 0.4 0.4 1.0 0.4
#> [4,] 0.4 0.4 0.4 1.0
cov2cor(Vs[[24]]) # study with separate rows for males and females
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
cov2cor(Vs[[29]]) # study with separate rows for three genotypes
#> [,1] [,2] [,3]
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 0 0 1
############################################################################