Function to construct or approximate the variance-covariance matrix of dependent effect sizes or outcomes, or more precisely, of their sampling errors (i.e., the V matrix in rma.mv).

vcalc(vi, cluster, subgroup, obs, type, time1, time2, grp1, grp2, w1, w2,
      data, rho, phi, rvars, checkpd=TRUE, nearpd=FALSE, sparse=FALSE, ...)

Arguments

vi

numeric vector to specify the sampling variances of the observed effect sizes or outcomes.

cluster

vector to specify the clustering variable (e.g., study).

subgroup

optional vector to specify different (independent) subgroups within clusters.

obs

optional vector to distinguish different observed effect sizes or outcomes corresponding to the same construct or response/dependent variable.

type

optional vector to distinguish different types of constructs or response/dependent variables underlying the observed effect sizes or outcomes.

time1

optional numeric vector to specify the time points when the observed effect sizes or outcomes were obtained (in the first condition if the observed effect sizes or outcomes represent contrasts between two conditions).

time2

optional numeric vector to specify the time points when the observed effect sizes or outcomes were obtained in the second condition (only relevant when the observed effect sizes or outcomes represent contrasts between two conditions).

grp1

optional vector to specify the group of the first condition when the observed effect sizes or outcomes represent contrasts between two conditions.

grp2

optional vector to specify the group of the second condition when the observed effect sizes or outcomes represent contrasts between two conditions.

w1

optional numeric vector to specify the size of the group (or more generally, the inverse sampling variance weight) of the first condition when the observed effect sizes or outcomes represent contrasts between two conditions.

w2

optional numeric vector to specify the size of the group (or more generally, the inverse sampling variance weight) of the second condition when the observed effect sizes or outcomes represent contrasts between two conditions.

data

optional data frame containing the variables given to the arguments above.

rho

argument to specify the correlation(s) of observed effect sizes or outcomes measured concurrently. See ‘Details’.

phi

argument to specify the autocorrelation of observed effect sizes or outcomes measured at different time points. See ‘Details’.

rvars

optional argument for specifying the variables that correspond to the correlation matrices of the studies (if this is specified, all arguments above except for cluster and subgroup are ignored). See ‘Details’.

checkpd

logical to specify whether to check that the variance-covariance matrices within clusters are positive definite (the default is TRUE). See ‘Note’.

nearpd

logical to specify whether the nearPD function from the Matrix package should be used on variance-covariance matrices that are not positive definite. See ‘Note’.

sparse

logical to specify whether the variance-covariance matrix should be returned as a sparse matrix (the default is FALSE).

...

other arguments.

Details

Standard meta-analytic models (such as those that can be fitted with the rma.uni function) assume that the observed effect sizes or outcomes (or more precisely, their sampling errors) are independent. This assumption is typically violated whenever multiple observed effect sizes or outcomes are computed based on the same sample of subjects (or whatever the study units are) or if there is at least partial overlap of subjects that contribute information to the computation of multiple effect sizes or outcomes.

The present function can be used to construct or approximate the variance-covariance matrix of the sampling errors of dependent effect sizes or outcomes for a wide variety of circumstances (this variance-covariance matrix is the so-called V matrix that may be needed as input for multilevel/multivariate meta-analytic models such as those that can be fitted with the rma.mv function; see also the metafor package documentation for some recommendations on a general workflow for meta-analyses involving complex dependency structures).

Argument cluster is used to specify the clustering variable. Rows with the same value of this variable are allowed to be dependent, while rows with different values are assumed to be independent. Typically, cluster will be a study identifier.

Within the same cluster, there may be different subgroups with no overlap of subjects across subgroups. Argument subgroup can be used to distinguish such subgroups. Rows with the same value of this variable within a cluster are allowed to be dependent, while rows with different values are assumed to be independent even if they come from the same cluster. Therefore, from here on, ‘cluster’ really refers to the combination of cluster and subgroup.

Multiple effect sizes or outcomes belonging to the same cluster may be dependent due to a variety of reasons:

  1. The same construct of interest (e.g., severity of depression) may have been measured using different scales or instruments within a study (e.g., using the Beck Depression Inventory (BDI) and the Hamilton Depression Rating Scale (HDRS)) based on which multiple effect sizes can be computed for the same group of subjects (e.g., contrasting a treatment versus a control group with respect to each scale). In this case, we have multiple effect sizes that are different ‘observations’ of the effect with respect to the same type of construct.

    Argument obs is then used to distinguish different effect sizes corresponding to the same construct. If obs is specified, then argument rho must also be used to specify the degree of correlation among the sampling errors of the different effect sizes. Since this correlation is typically not known, the correlation among the various scales (or a rough ‘guesstimate’ thereof) can be used as a proxy (i.e., the (typical) correlation between BDI and HDRS measurements).

    One can also pass an entire correlation matrix via rho to specify, for each possible pair of obs values, the corresponding correlation. The row/column names of the matrix must then correspond to the unique values of the obs variable.

  2. Multiple types of constructs (or more generally, types of response/dependent variables) may have been measured in the same group of subjects (e.g., severity of depression as measured with the Beck Depression Inventory (BDI) and severity of anxiety as measured with the State-Trait Anxiety Inventory (STAI)). If this is of interest for a meta-analysis, effect sizes can then be computed with respect to each ‘type’ of construct.

    Argument type is then used to distinguish effect sizes corresponding to these different types of constructs. If type is specified, then argument rho must also be used to specify the degree of correlation among the sampling errors of effect sizes belonging to these different types. As above, the correlation among the various scales is typically used here as a proxy (i.e., the (typical) correlation between BDI and STAI measurements).

    One can also pass an entire correlation matrix via rho to specify, for each possible pair of type values, the corresponding correlation. The row/column names of the matrix must then correspond to the unique values of the type variable.

  3. If there are multiple types of constructs, multiple scales or instruments may also have been used (in at least some of the studies) to measure the same construct and hence there may again be multiple effect sizes that are ‘observations’ of the same type of construct. Arguments type and obs should then be used together to specify the various construct types and observations thereof. In this case, argument rho must be a vector of two values, the first to specify the within-construct correlation and the second to specify the between-construct correlation.

    One can also specify a list with two elements for rho, the first element being either a scalar or an entire correlation matrix for the within-construct correlation(s) and the second element being a scalar or an entire correlation matrix for the between-construct correlation(s). As above, any matrices specified must have row/column names corresponding to the unique values of the obs and/or type variables.

  4. The same construct and scale may have been assessed/used multiple times, allowing the computation of multiple effect sizes for the same group of subjects at different time points (e.g., right after the end of a treatment, at a short-term follow-up, and at a long-term follow-up). Argument time1 is then used to specify the time points when the observed effect sizes were obtained. Argument phi must then also be used to specify the autocorrelation among the sampling errors of two effect sizes that differ by one unit on the time1 variable. As above, the autocorrelation of the measurements themselves can be used here as a proxy.

    If multiple constructs and/or multiple scales have also been assessed at the various time points, then arguments type and/or obs (together with argument rho) should be used as needed to differentiate effect sizes corresponding to the different constructs and/or scales.

  5. Many effect sizes or outcome measures (e.g., raw or standardized mean differences, log-transformed ratios of means, log risk/odds ratios and risk differences) reflect the difference between two conditions (i.e., a contrast). Within a study, there may be more than two conditions, allowing the computation of multiple such contrasts (e.g., treatment A versus a control condition and treatment B versus the same control condition) and hence corresponding effect sizes. The reuse of information from the ‘shared’ condition (in this example, the control condition) then induces correlation among the effect sizes.

    To account for this, arguments grp1 and grp2 should be used to specify (within each cluster) which two groups were compared in the computation of each effect size (e.g., in the example above, the coding could be grp1=c(2,3) and grp2=c(1,1); whether numbers or strings are used as identifiers is irrelevant).

    The degree of correlation between two contrast-type effect sizes that is induced by the use of a shared condition is a function of the size of the groups involved in the computation of the two effect sizes (or, more generally, the inverse sampling variance weights of the condition-specific outcomes). By default, the group sizes (weights) are assumed to be identical across conditions, which implies a correlation of 0.5. If the group sizes (weights) are known, they can be specified via arguments w1 and w2 (in which case this information is used by the function to calculate a more accurate estimate of the correlation induced by the shared condition).

    Moreover, a contrast-type effect size can be based on a between- or a within-subjects design. When one or more of the contrast-type effect sizes are based on a within-subjects design, then time1 and time2 should be used in combination with grp1 and grp2 to specify, for each effect size, the group(s) and time point(s) involved.

    For example, grp1=c(2,3) and grp2=c(1,1) as above in combination with time1=c(1,1) and time2=c(1,1) would imply a between-subjects design involving three groups where two effect sizes were computed contrasting groups 2 and 3 versus group 1 at a single time point. On the other hand, grp1=c(1,1) and grp2=c(1,1) in combination with time1=c(2,3) and time2=c(1,1) would imply a within-subjects design where two effect sizes were computed contrasting time points 2 and 3 versus time point 1 in a single group. Argument phi is then used as above to specify the autocorrelation of the measurements within groups (i.e., for the within-subjects design above, it would be the autocorrelation between time points 2 and 1 or equivalently, between time points 3 and 2).

All of the arguments above can be specified together to account for a fairly wide variety of dependency types.
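
To illustrate, here are hedged sketches of typical calls for the scenarios above (variable names such as scale, construct, time, n1, n2, and sample are hypothetical placeholders; dat is assumed to contain the respective variables):

# (1) multiple scales ('observations') for the same construct
V <- vcalc(vi, cluster=study, obs=scale, data=dat, rho=0.7)
# (2) multiple types of constructs
V <- vcalc(vi, cluster=study, type=construct, data=dat, rho=0.5)
# (3) both, with rho = c(within-construct, between-construct)
V <- vcalc(vi, cluster=study, type=construct, obs=scale, data=dat, rho=c(0.7, 0.5))
# (4) multiple time points, with autocorrelation phi
V <- vcalc(vi, cluster=study, time1=time, phi=0.9, data=dat)
# (5) contrasts sharing a condition, with group sizes supplied
V <- vcalc(vi, cluster=study, grp1=grp1, grp2=grp2, w1=n1, w2=n2, data=dat)
# independent subgroups within clusters can be added to any of these
V <- vcalc(vi, cluster=study, subgroup=sample, obs=scale, data=dat, rho=0.7)

For scenario 5, the inverse-variance logic implies the following correlation between two contrasts with group weights wa and wb that share a condition with weight wc (a sketch of the standard approximation, not necessarily the exact internal computation):

rc <- function(wa, wb, wc) (1/wc) / sqrt((1/wa + 1/wc) * (1/wb + 1/wc))
rc(20, 20, 20) # equal weights imply a correlation of 0.5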

Using the rvars Argument

The function also provides an alternative approach for constructing the variance-covariance matrix using the rvars argument. Here, one must specify the names of the variables in the dataset that correspond to the correlation matrices of the studies (the variables should be specified as a vector; e.g., c(var1, var2, var3)).

In particular, let \(k_i\) denote the number of rows corresponding to the \(i\textrm{th}\) cluster. Then the values of the first \(k_i\) variables from rvars are used to construct the correlation matrix and, together with the sampling variances (specified via vi), the variance-covariance matrix. Say there are three studies, the first with two correlated estimates, the second with a single estimate, and the third with four correlated estimates. Then the data structure should look like this:

study  yi  vi  r1  r2  r3  r4
=============================
    1   .   .   1  NA  NA  NA
    1   .   .  .6   1  NA  NA
-----------------------------
    2   .   .   1  NA  NA  NA
-----------------------------
    3   .   .   1  NA  NA  NA
    3   .   .  .8   1  NA  NA
    3   .   .  .5  .5   1  NA
    3   .   .  .5  .5  .8   1
=============================

with rvars = c(r1, r2, r3, r4). If the rvars variables are a consecutive set in the data frame (as above), then one can use the shorthand notation rvars = c(r1:r4), where r1 denotes the first and r4 the last variable in the set. Note that only the lower triangular part of the submatrices defined by the rvars variables is used.

There must be as many variables specified via rvars as the number of rows in the largest cluster (in smaller clusters, the non-relevant variables can just be set to NA; see above).
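
As a self-contained sketch, the data structure above could be created and used as follows (the yi and vi values are arbitrary placeholders; the correlations are those shown in the table):

dat <- data.frame(study = c(1, 1, 2, 3, 3, 3, 3),
                  yi    = c(0.2, 0.3, 0.1, 0.4, 0.2, 0.5, 0.3),
                  vi    = c(0.04, 0.05, 0.03, 0.06, 0.04, 0.05, 0.06),
                  r1    = c( 1, .6,  1,  1, .8, .5, .5),
                  r2    = c(NA,  1, NA, NA,  1, .5, .5),
                  r3    = c(NA, NA, NA, NA, NA,  1, .8),
                  r4    = c(NA, NA, NA, NA, NA, NA,  1))
V <- vcalc(vi, cluster=study, rvars=c(r1:r4), data=dat)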

Value

A \(k \times k\) variance-covariance matrix (given as a sparse matrix when sparse=TRUE), where \(k\) denotes the length of the vi variable (i.e., the number of rows in the dataset).

Note

Depending on the data structure, the specified variables, and the specified values for rho and/or phi, it is possible that the constructed variance-covariance matrix is not positive definite within one or more clusters (this is checked when checkpd=TRUE, which is the default). If such non-positive definite submatrices are found, the reasons for this should be carefully checked since this might indicate misapplication of the function and/or the specification of implausible values for rho and/or phi.

When setting nearpd=TRUE, the nearPD function from the Matrix package is used on variance-covariance submatrices that are not positive definite. This should only be used cautiously and after understanding why these matrices are not positive definite.
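
To inspect which clusters yield non-positive definite submatrices, one can split V by cluster and examine the eigenvalues; a minimal sketch using base R together with the blsplit function (assuming V and dat as in the examples below):

sapply(blsplit(V, dat$study), function(x) all(eigen(x, only.values=TRUE)$values > 0))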

References

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03

See also

escalc for a function to compute the observed effect sizes or outcomes (and corresponding sampling variances) for which a variance-covariance matrix could be constructed.

rcalc for a function to construct the variance-covariance matrix of dependent correlation coefficients.

rma.mv for a model fitting function that can be used to meta-analyze dependent effect sizes or outcomes.

Examples

############################################################################

### see help(dat.assink2016) for further details on this dataset

dat <- dat.assink2016
head(dat, 9)
#> 
#>   study esid id      yi     vi pubstatus year deltype 
#> 1     1    1  1  0.9066 0.0740         1  4.5 general 
#> 2     1    2  2  0.4295 0.0398         1  4.5 general 
#> 3     1    3  3  0.2679 0.0481         1  4.5 general 
#> 4     1    4  4  0.2078 0.0239         1  4.5 general 
#> 5     1    5  5  0.0526 0.0331         1  4.5 general 
#> 6     1    6  6 -0.0507 0.0886         1  4.5 general 
#> 7     2    1  7  0.5117 0.0115         1  1.5 general 
#> 8     2    2  8  0.4738 0.0076         1  1.5 general 
#> 9     2    3  9  0.3544 0.0065         1  1.5 general 
#> 

### assume that the effect sizes within studies are correlated with rho=0.6
V <- vcalc(vi, cluster=study, obs=esid, data=dat, rho=0.6)

### show part of V matrix for studies 1 and 2
round(V[dat$study %in% c(1,2), dat$study %in% c(1,2)], 4)
#>         [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]
#>  [1,] 0.0740 0.0326 0.0358 0.0252 0.0297 0.0486 0.0000 0.0000 0.0000
#>  [2,] 0.0326 0.0398 0.0263 0.0185 0.0218 0.0356 0.0000 0.0000 0.0000
#>  [3,] 0.0358 0.0263 0.0481 0.0203 0.0239 0.0392 0.0000 0.0000 0.0000
#>  [4,] 0.0252 0.0185 0.0203 0.0239 0.0169 0.0276 0.0000 0.0000 0.0000
#>  [5,] 0.0297 0.0218 0.0239 0.0169 0.0331 0.0325 0.0000 0.0000 0.0000
#>  [6,] 0.0486 0.0356 0.0392 0.0276 0.0325 0.0886 0.0000 0.0000 0.0000
#>  [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0115 0.0056 0.0052
#>  [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0056 0.0076 0.0042
#>  [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0052 0.0042 0.0065

### or show as list of matrices
blsplit(V, dat$study, round, 4)[1:2]
#> $`1`
#>        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
#> [1,] 0.0740 0.0326 0.0358 0.0252 0.0297 0.0486
#> [2,] 0.0326 0.0398 0.0263 0.0185 0.0218 0.0356
#> [3,] 0.0358 0.0263 0.0481 0.0203 0.0239 0.0392
#> [4,] 0.0252 0.0185 0.0203 0.0239 0.0169 0.0276
#> [5,] 0.0297 0.0218 0.0239 0.0169 0.0331 0.0325
#> [6,] 0.0486 0.0356 0.0392 0.0276 0.0325 0.0886
#> 
#> $`2`
#>        [,1]   [,2]   [,3]
#> [1,] 0.0115 0.0056 0.0052
#> [2,] 0.0056 0.0076 0.0042
#> [3,] 0.0052 0.0042 0.0065
#> 
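
### the V matrix can then be used as input for rma.mv; a sketch of one common
### multilevel specification for such data (not the only sensible choice)
res <- rma.mv(yi, V, random = ~ 1 | study/esid, data=dat)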

### use a correlation of 0.7 for effect sizes corresponding to the same type of
### delinquent behavior and a correlation of 0.5 for effect sizes corresponding
### to different types of delinquent behavior
V <- vcalc(vi, cluster=study, type=deltype, obs=esid, data=dat, rho=c(0.7, 0.5))
blsplit(V, dat$study, round, 3)[16]
#> $`16`
#>        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#>  [1,] 0.091 0.045 0.027 0.044 0.030 0.039 0.076 0.028 0.034 0.030 0.039 0.043 0.039 0.067 0.028 0.032
#>  [2,] 0.045 0.087 0.027 0.061 0.030 0.039 0.053 0.027 0.047 0.041 0.053 0.059 0.053 0.046 0.038 0.043
#>  [3,] 0.027 0.027 0.033 0.027 0.025 0.033 0.033 0.023 0.021 0.018 0.023 0.026 0.023 0.029 0.017 0.019
#>  [4,] 0.044 0.061 0.027 0.086 0.029 0.038 0.053 0.027 0.047 0.041 0.053 0.058 0.053 0.046 0.038 0.043
#>  [5,] 0.030 0.030 0.025 0.029 0.040 0.037 0.036 0.026 0.023 0.020 0.026 0.028 0.026 0.031 0.018 0.021
#>  [6,] 0.039 0.039 0.033 0.038 0.037 0.068 0.047 0.033 0.030 0.026 0.034 0.037 0.034 0.041 0.024 0.027
#>  [7,] 0.076 0.053 0.033 0.053 0.036 0.047 0.129 0.033 0.041 0.035 0.046 0.051 0.046 0.079 0.033 0.037
#>  [8,] 0.028 0.027 0.023 0.027 0.026 0.033 0.033 0.033 0.021 0.018 0.023 0.026 0.024 0.029 0.017 0.019
#>  [9,] 0.034 0.047 0.021 0.047 0.023 0.030 0.041 0.021 0.052 0.031 0.041 0.045 0.041 0.036 0.029 0.033
#> [10,] 0.030 0.041 0.018 0.041 0.020 0.026 0.035 0.018 0.031 0.039 0.036 0.039 0.036 0.031 0.025 0.029
#> [11,] 0.039 0.053 0.023 0.053 0.026 0.034 0.046 0.023 0.041 0.036 0.066 0.051 0.047 0.040 0.033 0.038
#> [12,] 0.043 0.059 0.026 0.058 0.028 0.037 0.051 0.026 0.045 0.039 0.051 0.081 0.051 0.045 0.037 0.042
#> [13,] 0.039 0.053 0.023 0.053 0.026 0.034 0.046 0.024 0.041 0.036 0.047 0.051 0.067 0.041 0.033 0.038
#> [14,] 0.067 0.046 0.029 0.046 0.031 0.041 0.079 0.029 0.036 0.031 0.040 0.045 0.041 0.099 0.029 0.033
#> [15,] 0.028 0.038 0.017 0.038 0.018 0.024 0.033 0.017 0.029 0.025 0.033 0.037 0.033 0.029 0.034 0.027
#> [16,] 0.032 0.043 0.019 0.043 0.021 0.027 0.037 0.019 0.033 0.029 0.038 0.042 0.038 0.033 0.027 0.044
#> 

### examine the correlation matrix for study 16
blsplit(V, dat$study, cov2cor)[16]
#> $`16`
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#>  [1,]  1.0  0.5  0.5  0.5  0.5  0.5  0.7  0.5  0.5   0.5   0.5   0.5   0.5   0.7   0.5   0.5
#>  [2,]  0.5  1.0  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   0.7   0.7   0.7   0.5   0.7   0.7
#>  [3,]  0.5  0.5  1.0  0.5  0.7  0.7  0.5  0.7  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5
#>  [4,]  0.5  0.7  0.5  1.0  0.5  0.5  0.5  0.5  0.7   0.7   0.7   0.7   0.7   0.5   0.7   0.7
#>  [5,]  0.5  0.5  0.7  0.5  1.0  0.7  0.5  0.7  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5
#>  [6,]  0.5  0.5  0.7  0.5  0.7  1.0  0.5  0.7  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5
#>  [7,]  0.7  0.5  0.5  0.5  0.5  0.5  1.0  0.5  0.5   0.5   0.5   0.5   0.5   0.7   0.5   0.5
#>  [8,]  0.5  0.5  0.7  0.5  0.7  0.7  0.5  1.0  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5
#>  [9,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  1.0   0.7   0.7   0.7   0.7   0.5   0.7   0.7
#> [10,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   1.0   0.7   0.7   0.7   0.5   0.7   0.7
#> [11,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   1.0   0.7   0.7   0.5   0.7   0.7
#> [12,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   0.7   1.0   0.7   0.5   0.7   0.7
#> [13,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   0.7   0.7   1.0   0.5   0.7   0.7
#> [14,]  0.7  0.5  0.5  0.5  0.5  0.5  0.7  0.5  0.5   0.5   0.5   0.5   0.5   1.0   0.5   0.5
#> [15,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   0.7   0.7   0.7   0.5   1.0   0.7
#> [16,]  0.5  0.7  0.5  0.7  0.5  0.5  0.5  0.5  0.7   0.7   0.7   0.7   0.7   0.5   0.7   1.0
#> 

############################################################################

### see help(dat.ishak2007) for further details on this dataset

dat <- dat.ishak2007
head(dat, 5)
#> 
#>               study   y1i  v1i   y2i   v2i   y3i v3i y4i v4i mdur mbase 
#> 1    Alegret (2001) -33.4 14.3    NA    NA    NA  NA  NA  NA 16.1  53.6 
#> 2 Barichella (2003) -20.0  7.3    NA    NA -30.0 5.7  NA  NA 13.5  45.3 
#> 3     Berney (2002) -21.1  7.3    NA    NA    NA  NA  NA  NA 13.6  45.6 
#> 4   Burchiel (1999) -20.0  8.0 -20.0   8.0 -18.0 5.0  NA  NA 13.6  48.0 
#> 5       Chen (2003)    NA   NA -32.9 125.0    NA  NA  NA  NA 12.1  65.7 
#> 

### create long format dataset
dat <- reshape(dat, direction="long", idvar="study", v.names=c("yi","vi"),
               varying=list(c(2,4,6,8), c(3,5,7,9)))
dat <- dat[order(dat$study, dat$time),]

### remove missing measurement occasions from dat
dat <- dat[!is.na(dat$yi),]
rownames(dat) <- NULL

### show the data for the first 5 studies
head(dat, 8)
#> 
#>               study mdur mbase time    yi    vi 
#> 1    Alegret (2001) 16.1  53.6    1 -33.4  14.3 
#> 2 Barichella (2003) 13.5  45.3    1 -20.0   7.3 
#> 3 Barichella (2003) 13.5  45.3    3 -30.0   5.7 
#> 4     Berney (2002) 13.6  45.6    1 -21.1   7.3 
#> 5   Burchiel (1999) 13.6  48.0    1 -20.0   8.0 
#> 6   Burchiel (1999) 13.6  48.0    2 -20.0   8.0 
#> 7   Burchiel (1999) 13.6  48.0    3 -18.0   5.0 
#> 8       Chen (2003) 12.1  65.7    2 -32.9 125.0 
#> 

### construct the full (block diagonal) V matrix with an AR(1) structure
### assuming an autocorrelation of 0.97 as estimated by Ishak et al. (2007)
V <- vcalc(vi, cluster=study, time1=time, phi=0.97, data=dat)
V[1:8, 1:8]
#>      [,1]     [,2]     [,3] [,4]     [,5]     [,6]     [,7] [,8]
#> [1,] 14.3 0.000000 0.000000  0.0 0.000000 0.000000 0.000000    0
#> [2,]  0.0 7.300000 6.069352  0.0 0.000000 0.000000 0.000000    0
#> [3,]  0.0 6.069352 5.700000  0.0 0.000000 0.000000 0.000000    0
#> [4,]  0.0 0.000000 0.000000  7.3 0.000000 0.000000 0.000000    0
#> [5,]  0.0 0.000000 0.000000  0.0 8.000000 7.760000 5.950774    0
#> [6,]  0.0 0.000000 0.000000  0.0 7.760000 8.000000 6.134819    0
#> [7,]  0.0 0.000000 0.000000  0.0 5.950774 6.134819 5.000000    0
#> [8,]  0.0 0.000000 0.000000  0.0 0.000000 0.000000 0.000000  125
cov2cor(V[1:8, 1:8])
#>      [,1]   [,2]   [,3] [,4]   [,5] [,6]   [,7] [,8]
#> [1,]    1 0.0000 0.0000    0 0.0000 0.00 0.0000    0
#> [2,]    0 1.0000 0.9409    0 0.0000 0.00 0.0000    0
#> [3,]    0 0.9409 1.0000    0 0.0000 0.00 0.0000    0
#> [4,]    0 0.0000 0.0000    1 0.0000 0.00 0.0000    0
#> [5,]    0 0.0000 0.0000    0 1.0000 0.97 0.9409    0
#> [6,]    0 0.0000 0.0000    0 0.9700 1.00 0.9700    0
#> [7,]    0 0.0000 0.0000    0 0.9409 0.97 1.0000    0
#> [8,]    0 0.0000 0.0000    0 0.0000 0.00 0.0000    1

### or show as a list of matrices
blsplit(V, dat$study)[1:5]
#> $`Alegret (2001)`
#>      [,1]
#> [1,] 14.3
#> 
#> $`Barichella (2003)`
#>          [,1]     [,2]
#> [1,] 7.300000 6.069352
#> [2,] 6.069352 5.700000
#> 
#> $`Berney (2002)`
#>      [,1]
#> [1,]  7.3
#> 
#> $`Burchiel (1999)`
#>          [,1]     [,2]     [,3]
#> [1,] 8.000000 7.760000 5.950774
#> [2,] 7.760000 8.000000 6.134819
#> [3,] 5.950774 6.134819 5.000000
#> 
#> $`Chen (2003)`
#>      [,1]
#> [1,]  125
#> 
blsplit(V, dat$study, cov2cor)[1:5]
#> $`Alegret (2001)`
#>      [,1]
#> [1,]    1
#> 
#> $`Barichella (2003)`
#>        [,1]   [,2]
#> [1,] 1.0000 0.9409
#> [2,] 0.9409 1.0000
#> 
#> $`Berney (2002)`
#>      [,1]
#> [1,]    1
#> 
#> $`Burchiel (1999)`
#>        [,1] [,2]   [,3]
#> [1,] 1.0000 0.97 0.9409
#> [2,] 0.9700 1.00 0.9700
#> [3,] 0.9409 0.97 1.0000
#> 
#> $`Chen (2003)`
#>      [,1]
#> [1,]    1
#> 
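
### a sketch of a subsequent model fit with this V matrix (the moderator and
### random-effects specification shown is one plausible choice, not the only one)
res <- rma.mv(yi, V, mods = ~ 0 + factor(time), random = ~ time | study,
              struct="AR", data=dat)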

############################################################################

### see help(dat.kalaian1996) for further details on this dataset

dat <- dat.kalaian1996
head(dat, 12)
#>    id                 study year n1i n2i outcome    yi     vi  hrs ets homework type
#> 1   1 Alderman & Powers (A) 1980  28  22  verbal  0.22 0.0817  7.0   1        1    1
#> 2   2 Alderman & Powers (B) 1980  39  40  verbal  0.09 0.0507 10.0   1        1    1
#> 3   3 Alderman & Powers (C) 1980  22  17  verbal  0.14 0.1045 10.5   1        1    1
#> 4   4 Alderman & Powers (D) 1980  48  43  verbal  0.14 0.0442 10.0   1        1    1
#> 5   5 Alderman & Powers (E) 1980  25  74  verbal -0.01 0.0535  6.0   1        1    1
#> 6   6 Alderman & Powers (F) 1980  37  35  verbal  0.14 0.0557  5.0   1        1    1
#> 7   7 Alderman & Powers (G) 1980  24  70  verbal  0.18 0.0561 11.0   1        1    1
#> 8   8 Alderman & Powers (H) 1980  16  19  verbal  0.01 0.1151 45.0   1        1    1
#> 9   9      Evans & Pike (A) 1973 145 129  verbal  0.13 0.0147 21.0   1        1    1
#> 10 10      Evans & Pike (A) 1973 145 129    math  0.12 0.0147 21.0   1        1    1
#> 11 11      Evans & Pike (B) 1973  72 129  verbal  0.25 0.0218 21.0   1        1    1
#> 12 12      Evans & Pike (B) 1973  72 129    math  0.06 0.0216 21.0   1        1    1

### construct the variance-covariance matrix assuming rho = 0.66 for effect sizes
### corresponding to the 'verbal' and 'math' outcome types
V <- vcalc(vi, cluster=study, type=outcome, data=dat, rho=0.66)
round(V[1:12,1:12], 4)
#>         [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]  [,12]
#>  [1,] 0.0817 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [2,] 0.0000 0.0507 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [3,] 0.0000 0.0000 0.1045 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [4,] 0.0000 0.0000 0.0000 0.0442 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [5,] 0.0000 0.0000 0.0000 0.0000 0.0535 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0557 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0561 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1151 0.0000 0.0000 0.0000 0.0000
#>  [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0147 0.0097 0.0000 0.0000
#> [10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0097 0.0147 0.0000 0.0000
#> [11,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0218 0.0143
#> [12,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0143 0.0216
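
### a sketch of a bivariate model using this V matrix (one plausible specification)
res <- rma.mv(yi, V, mods = ~ 0 + outcome, random = ~ outcome | study,
              struct="UN", data=dat)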

############################################################################

### see help(dat.berkey1998) for further details on this dataset

dat <- dat.berkey1998

### variables v1i and v2i correspond to the 2x2 var-cov matrices of the studies;
### so use these variables to construct the V matrix (note: since v1i and v2i are
### var-cov matrices and not correlation matrices, set vi=1 for all rows)
V <- vcalc(vi=1, cluster=author, rvars=c(v1i, v2i), data=dat)
V
#>         [,1]   [,2]   [,3]  [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
#>  [1,] 0.0075 0.0030 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [2,] 0.0030 0.0077 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [3,] 0.0000 0.0000 0.0057 9e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [4,] 0.0000 0.0000 0.0009 8e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [5,] 0.0000 0.0000 0.0000 0e+00 0.0021 0.0007 0.0000 0.0000 0.0000 0.0000
#>  [6,] 0.0000 0.0000 0.0000 0e+00 0.0007 0.0014 0.0000 0.0000 0.0000 0.0000
#>  [7,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0029 0.0009 0.0000 0.0000
#>  [8,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0009 0.0015 0.0000 0.0000
#>  [9,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0148 0.0072
#> [10,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0072 0.0304
round(cov2cor(V), 2)
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#>  [1,] 1.00 0.39 0.00 0.00 0.00 0.00 0.00 0.00 0.00  0.00
#>  [2,] 0.39 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00  0.00
#>  [3,] 0.00 0.00 1.00 0.42 0.00 0.00 0.00 0.00 0.00  0.00
#>  [4,] 0.00 0.00 0.42 1.00 0.00 0.00 0.00 0.00 0.00  0.00
#>  [5,] 0.00 0.00 0.00 0.00 1.00 0.41 0.00 0.00 0.00  0.00
#>  [6,] 0.00 0.00 0.00 0.00 0.41 1.00 0.00 0.00 0.00  0.00
#>  [7,] 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.43 0.00  0.00
#>  [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.43 1.00 0.00  0.00
#>  [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00  0.34
#> [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.34  1.00

### or show as a list of matrices
blsplit(V, dat$author, function(x) round(cov2cor(x), 2))
#> $`Pihlstrom et al.`
#>      [,1] [,2]
#> [1,] 1.00 0.39
#> [2,] 0.39 1.00
#> 
#> $`Lindhe et al.`
#>      [,1] [,2]
#> [1,] 1.00 0.42
#> [2,] 0.42 1.00
#> 
#> $`Knowles et al.`
#>      [,1] [,2]
#> [1,] 1.00 0.41
#> [2,] 0.41 1.00
#> 
#> $`Ramfjord et al.`
#>      [,1] [,2]
#> [1,] 1.00 0.43
#> [2,] 0.43 1.00
#> 
#> $`Becker et al.`
#>      [,1] [,2]
#> [1,] 1.00 0.34
#> [2,] 0.34 1.00
#> 

### construct the variance-covariance matrix assuming rho = 0.4 for effect sizes
### corresponding to the 'PD' and 'AL' outcome types
V <- vcalc(vi=vi, cluster=trial, type=outcome, data=dat, rho=0.4)
round(V,4)
#>         [,1]   [,2]   [,3]  [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
#>  [1,] 0.0075 0.0030 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [2,] 0.0030 0.0077 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [3,] 0.0000 0.0000 0.0057 9e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [4,] 0.0000 0.0000 0.0009 8e-04 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
#>  [5,] 0.0000 0.0000 0.0000 0e+00 0.0021 0.0007 0.0000 0.0000 0.0000 0.0000
#>  [6,] 0.0000 0.0000 0.0000 0e+00 0.0007 0.0014 0.0000 0.0000 0.0000 0.0000
#>  [7,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0029 0.0008 0.0000 0.0000
#>  [8,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0008 0.0015 0.0000 0.0000
#>  [9,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0148 0.0085
#> [10,] 0.0000 0.0000 0.0000 0e+00 0.0000 0.0000 0.0000 0.0000 0.0085 0.0304
cov2cor(V)
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#>  [1,]  1.0  0.4  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0
#>  [2,]  0.4  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   0.0
#>  [3,]  0.0  0.0  1.0  0.4  0.0  0.0  0.0  0.0  0.0   0.0
#>  [4,]  0.0  0.0  0.4  1.0  0.0  0.0  0.0  0.0  0.0   0.0
#>  [5,]  0.0  0.0  0.0  0.0  1.0  0.4  0.0  0.0  0.0   0.0
#>  [6,]  0.0  0.0  0.0  0.0  0.4  1.0  0.0  0.0  0.0   0.0
#>  [7,]  0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.4  0.0   0.0
#>  [8,]  0.0  0.0  0.0  0.0  0.0  0.0  0.4  1.0  0.0   0.0
#>  [9,]  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0   0.4
#> [10,]  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.4   1.0

############################################################################

### see help(dat.knapp2017) for further details on this dataset

dat <- dat.knapp2017
dat[-c(1:2)]
#>    study            task difficulty group1 group2 comp    yi    vi n_sz n_hc yi_iq vi_iq
#> 1      1    SOC (CANTAB)       3.83     SZ     HC    1  0.62 0.075   24   33  0.72 0.076
#> 2      2    SOC (CANTAB)       3.83    SZ1     HC    1  0.46 0.047   44   44    NA    NA
#> 3      2    SOC (CANTAB)       3.83    SZ2     HC    2  0.73 0.052   38   44    NA    NA
#> 4      3    SOC (CANTAB)       3.83    SZ1     HC    1  1.18 0.062   39   37    NA    NA
#> 5      3    SOC (CANTAB)       3.83    SZ2     HC    2  1.09 0.073   27   37    NA    NA
#> 6      3    SOC (CANTAB)       3.83    SZ3     HC    3  0.91 0.102   15   37    NA    NA
#> 7      3    SOC (CANTAB)       3.83    SZ4     HC    4  0.91 0.084   20   37    NA    NA
#> 8      4             TOH         NA     SZ     HC    1  0.57 0.074   28   28    NA    NA
#> 9      5             TOL         NA     SZ     HC    1  0.40 0.180   13   10    NA    NA
#> 10     6    SOC (CANTAB)       2.00     SZ     HC    1  0.43 0.170   12   12  0.12 0.167
#> 11     6    SOC (CANTAB)       3.00     SZ     HC    2  1.37 0.206   12   12  0.12 0.167
#> 12     6    SOC (CANTAB)       4.00     SZ     HC    3  0.23 0.168   12   12  0.12 0.167
#> 13     6    SOC (CANTAB)       5.00     SZ     HC    4  0.97 0.186   12   12  0.12 0.167
#> 14     6 SOC (one-touch)       2.00     SZ     HC    5  0.56 0.173   12   12  0.12 0.167
#> 15     6 SOC (one-touch)       3.00     SZ     HC    6  0.79 0.180   12   12  0.12 0.167
#> 16     6 SOC (one-touch)       4.00     SZ     HC    7  1.19 0.196   12   12  0.12 0.167
#> 17     6 SOC (one-touch)       5.00     SZ     HC    8  0.92 0.184   12   12  0.12 0.167
#> 18     7             SOC       3.50     SZ     HC    1  0.20 0.096   22   20  0.62 0.100
#> 19     8             TOH         NA     SZ     HC    1  1.42 0.180   13   15    NA    NA
#> 20     9             TOL       4.00    SZ1     HC    1  0.35 0.074   27   28  1.21 0.086
#> 21     9             TOL       4.00    SZ2     HC    2  0.88 0.078   28   28  1.26 0.086
#> 22    10    SOC (CANTAB)       3.83     SZ     HC    1  1.08 0.079   26   33  0.76 0.074
#> 23    11    SOC (CANTAB)       3.83     SZ     HC    1  0.94 0.111   20   20  0.42 0.102
#> 24    12    SOC (CANTAB)       2.00     SZ     HC    1  0.44 0.020  135   81  0.49 0.020
#> 25    12    SOC (CANTAB)       3.00     SZ     HC    2  0.63 0.021  135   81  0.49 0.020
#> 26    12    SOC (CANTAB)       4.00     SZ     HC    3  0.38 0.020  135   81  0.49 0.020
#> 27    12    SOC (CANTAB)       5.00     SZ     HC    4  0.93 0.022  135   81  0.49 0.020
#> 28    13             TOL         NA     SZ     HC    1  0.33 0.102   24   17    NA    NA
#> 29    14             TOL       2.00     SZ     HC    1  0.00 0.073   32   24    NA    NA
#> 30    14             TOL       3.00     SZ     HC    2 -0.06 0.073   32   24    NA    NA
#> 31    14             TOL       4.00     SZ     HC    3  0.73 0.078   32   24    NA    NA
#> 32    14             TOL       5.00     SZ     HC    4  0.48 0.075   32   24    NA    NA
#> 33    15             TOL       2.00     SZ     HC    1  0.43 0.092   25   20    NA    NA
#> 34    15             TOL       3.00     SZ     HC    2  0.72 0.096   25   20    NA    NA
#> 35    15             TOL       4.00     SZ     HC    3  0.80 0.097   25   20    NA    NA
#> 36    15             TOL       5.00     SZ     HC    4  1.30 0.109   25   20    NA    NA
#> 37    16             TOL       4.67     SZ     HC    1  1.84 0.190   15   15    NA    NA
#> 38    17             TOL       3.50     SZ     HC    1  2.58 0.216   17   17  0.58 0.123
#> 39    18             TOL       3.00     SZ     HC    1  0.25 0.071   30   27  0.18 0.071
#> 40    18             TOL       4.00     SZ     HC    2  0.86 0.077   30   27  0.18 0.071
#> 41    18             TOL       5.00     SZ     HC    3  0.68 0.074   30   27  0.18 0.071
#> 42    19    SOC (CANTAB)       2.00     SZ     HC    1  0.21 0.060   36   31  0.48 0.062
#> 43    19    SOC (CANTAB)       3.00     SZ     HC    2  0.65 0.063   36   31  0.48 0.062
#> 44    19    SOC (CANTAB)       4.00     SZ     HC    3  0.30 0.061   36   31  0.48 0.062
#> 45    19    SOC (CANTAB)       5.00     SZ     HC    4  0.45 0.062   36   31  0.48 0.062
#> 46    20          BACS-J         NA    SZ1     HC    1  1.03 0.054   20  340    NA    NA
#> 47    20          BACS-J         NA    SZ2     HC    2  0.92 0.104   10  340    NA    NA
#> 48    21    SOC (CANTAB)       3.83     SZ     HC    1  1.12 0.108   28   17    NA    NA
#> 49    22             TOL       2.00     SZ     HC    1  0.52 0.052   40   40    NA    NA
#> 50    22             TOL       3.00     SZ     HC    2  0.50 0.052   40   40    NA    NA
#> 51    22             TOL       4.00     SZ     HC    3  0.49 0.051   40   40    NA    NA
#> 52    23             SOC       3.83     SZ     HC    1  0.30 0.042   48   48    NA    NA
#> 53    24      TOL-Drexel       5.50    SZ1    HC1    1  0.36 0.022   86   97  0.47 0.023
#> 54    24      TOL-Drexel       5.50    SZ2    HC2    2  0.41 0.030   75   62  0.10 0.030
#> 55    25             SOC       3.83     SZ     HC    1  0.79 0.086   25   25    NA    NA
#> 56    26             SOC       2.00     SZ     HC    1  0.23 0.031   77   55    NA    NA
#> 57    26             SOC       3.00     SZ     HC    2  0.58 0.032   77   55    NA    NA
#> 58    26             SOC       4.00     SZ     HC    3  0.57 0.032   77   55    NA    NA
#> 59    26             SOC       5.00     SZ     HC    4  0.79 0.034   77   55    NA    NA
#> 60    27      TOL-Drexel       5.50     SZ     HC    1  0.70 0.071   30   30    NA    NA
#> 61    28             SOC       3.83     SZ     HC    1  1.16 0.136   20   15  0.23 0.117
#> 62    29          BACS-J         NA    SZ1    HC1    1  1.29 0.039   56   68  0.37 0.033
#> 63    29          BACS-J         NA    SZ2    HC2    2  0.63 0.039   47   62  0.43 0.038
#> 64    29          BACS-J         NA    SZ3    HC3    3  0.31 0.121   15   19  1.00 0.134
#> 65    30             TOL       3.83     SZ     HC    1  0.11 0.070   30   27    NA    NA
#> 66    31 TOL (four-disk)       5.60     SZ     HC    1  0.87 0.036   60   60    NA    NA

### create variable that indicates the task and difficulty combination as increasing integers
dat$task.diff <- unlist(lapply(split(dat, dat$study), function(x) {
   task.int <- as.integer(factor(x$task))
   diff.int <- as.integer(factor(x$difficulty))
   diff.int[is.na(diff.int)] <- 1
   paste0(task.int, ".", diff.int)}))

### construct correlation matrix for two tasks with four different difficulties where the
### correlation is 0.4 for different difficulties of the same task, 0.7 for the same
### difficulty of different tasks, and 0.28 for different difficulties of different tasks
R <- matrix(0.4, nrow=8, ncol=8)
R[5:8,1:4] <- R[1:4,5:8] <- 0.28
diag(R[1:4,5:8]) <- 0.7
diag(R[5:8,1:4]) <- 0.7
diag(R) <- 1
rownames(R) <- colnames(R) <- paste0(rep(1:2, each=4), ".", 1:4)
R
#>      1.1  1.2  1.3  1.4  2.1  2.2  2.3  2.4
#> 1.1 1.00 0.40 0.40 0.40 0.70 0.28 0.28 0.28
#> 1.2 0.40 1.00 0.40 0.40 0.28 0.70 0.28 0.28
#> 1.3 0.40 0.40 1.00 0.40 0.28 0.28 0.70 0.28
#> 1.4 0.40 0.40 0.40 1.00 0.28 0.28 0.28 0.70
#> 2.1 0.70 0.28 0.28 0.28 1.00 0.40 0.40 0.40
#> 2.2 0.28 0.70 0.28 0.28 0.40 1.00 0.40 0.40
#> 2.3 0.28 0.28 0.70 0.28 0.40 0.40 1.00 0.40
#> 2.4 0.28 0.28 0.28 0.70 0.40 0.40 0.40 1.00

### construct an approximate V matrix accounting for the use of shared groups and
### for correlations among tasks/difficulties as specified in the R matrix above
V <- vcalc(vi, cluster=study, grp1=group1, grp2=group2, w1=n_sz, w2=n_hc,
           obs=task.diff, rho=R, data=dat)
Vs <- blsplit(V, dat$study)
cov2cor(Vs[[3]])  # study with multiple SZ groups and a single HC group
#>           [,1]      [,2]      [,3]      [,4]
#> [1,] 1.0000000 0.4652832 0.3847419 0.4243294
#> [2,] 0.4652832 1.0000000 0.3488477 0.3847419
#> [3,] 0.3847419 0.3488477 1.0000000 0.3181424
#> [4,] 0.4243294 0.3847419 0.3181424 1.0000000
cov2cor(Vs[[6]])  # study with two task types and multiple difficulties
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1.00 0.40 0.40 0.40 0.70 0.28 0.28 0.28
#> [2,] 0.40 1.00 0.40 0.40 0.28 0.70 0.28 0.28
#> [3,] 0.40 0.40 1.00 0.40 0.28 0.28 0.70 0.28
#> [4,] 0.40 0.40 0.40 1.00 0.28 0.28 0.28 0.70
#> [5,] 0.70 0.28 0.28 0.28 1.00 0.40 0.40 0.40
#> [6,] 0.28 0.70 0.28 0.28 0.40 1.00 0.40 0.40
#> [7,] 0.28 0.28 0.70 0.28 0.40 0.40 1.00 0.40
#> [8,] 0.28 0.28 0.28 0.70 0.40 0.40 0.40 1.00
cov2cor(Vs[[12]]) # study with multiple difficulties for the same task
#>      [,1] [,2] [,3] [,4]
#> [1,]  1.0  0.4  0.4  0.4
#> [2,]  0.4  1.0  0.4  0.4
#> [3,]  0.4  0.4  1.0  0.4
#> [4,]  0.4  0.4  0.4  1.0
cov2cor(Vs[[24]]) # study with separate rows for males and females
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    0    1
cov2cor(Vs[[29]]) # study with separate rows for three genotypes
#>      [,1] [,2] [,3]
#> [1,]    1    0    0
#> [2,]    0    1    0
#> [3,]    0    0    1
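
### a sketch of a subsequent multilevel model fit using this approximate V
### matrix (the random-effects structure shown is one plausible choice)
res <- rma.mv(yi, V, random = ~ 1 | study/comp, data=dat)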

############################################################################