Studies on the Association between Circumcision and HIV Infection

Results from 33 studies examining the association between male circumcision and HIV infection.

dat.vanhowe1999

Format

The data frame contains the following columns:

study	`character`	study author
category	`character`	study type (high-risk group, partner study, or population survey)
non.pos	`numeric`	number of non-circumcised HIV positive cases
non.neg	`numeric`	number of non-circumcised HIV negative cases
cir.pos	`numeric`	number of circumcised HIV positive cases
cir.neg	`numeric`	number of circumcised HIV negative cases

Details

The 33 studies provide data in terms of \(2 \times 2\) tables in the form:

	HIV positive	HIV negative
non-circumcised	`non.pos`	`non.neg`
circumcised	`cir.pos`	`cir.neg`

The goal of the meta-analysis was to examine if the risk of an HIV infection differs between non-circumcised versus circumcised men.

The dataset is interesting because it can be used to illustrate the difference between naively pooling results by summing up the counts across studies and then computing the odds ratio based on the aggregated table (as was done by Van Howe, 1999) and conducting a proper meta-analysis (as illustrated by O'Farrell & Egger, 2000). In fact, a proper meta-analysis shows that the HIV infection risk is on average higher in non-circumcised men, which is the opposite of what the naive pooling approach yields (which makes this an illustration of Simpson's paradox).

Source

Van Howe, R. S. (1999). Circumcision and HIV infection: Review of the literature and meta-analysis. International Journal of STD & AIDS, 10(1), 8–16. https://doi.org/10.1258/0956462991913015

References

O'Farrell, N., & Egger, M. (2000). Circumcision in men and the prevention of HIV infection: A 'meta-analysis' revisited. International Journal of STD & AIDS, 11(3), 137–142. https://doi.org/10.1258/0956462001915480

Author

Wolfgang Viechtbauer, wvb@metafor-project.org, https://www.metafor-project.org

Concepts

medicine, epidemiology, odds ratios

Examples

### copy data into 'dat' and examine data
dat <- dat.vanhowe1999
dat
#>           study          category non.pos non.neg cir.pos cir.neg
#> 1       Bwayo 1   high-risk group      92      86     160     612
#> 2       Bwayo 2   high-risk group      22      46      37     200
#> 3        Kreiss   high-risk group      59      18     254     168
#> 4          Hira   high-risk group     418     172      10      10
#> 5       Cameron   high-risk group      18      61       6     208
#> 6         Pepin   high-risk group       5      42      13     243
#> 7    Greenblatt   high-risk group      11      28       8      68
#> 8        Diallo   high-risk group      38      46     212     873
#> 9      Simonsen   high-risk group      17      70      21     232
#> 10      Tyndall   high-risk group      85      93     105     527
#> 11        Nasio   high-risk group      86      78     137     580
#> 12     Mehendal   high-risk group     837    3411      38     253
#> 13    Bollinger   high-risk group      50     241       1      14
#> 14     Chiasson   high-risk group      36     797      14     542
#> 15       Sassan   high-risk group      75      18     415     221
#> 16       Hunter     partner study      43     330     165    3765
#> 17       Carael     partner study      90     105      34      45
#> 18         Chao     partner study     442    4844      75     232
#> 19         Moss     partner study      24      16      12      17
#> 20        Allen     partner study     275     612     132     324
#> 21       Sedlin     partner study      32      26      33      40
#> 22    Konde-Luc     partner study     153    1516       6     127
#> 23      Barongo population survey      55    1356      42     642
#> 24   Grosskurth population survey     158    4604      61    1026
#> 25 Van de Perre population survey      46     224       6      26
#> 26         Seed population survey     171     422      51     192
#> 27      Malamba population survey     111     114      21      47
#> 28      Quigley population survey     101     272      48     121
#> 29     Urassa 1 population survey      56    1301      42     600
#> 30     Urassa 2 population survey     105    2040      32     426
#> 31     Urassa 3 population survey      38     309      19     158
#> 32     Urassa 4 population survey     112     716      54     692
#> 33     Urassa 5 population survey     101     365      48     136

### load metafor package
library(metafor)

### naive pooling by summing up the counts within categories and then
### computing the odds ratios and corresponding confidence intervals
cat1 <- with(dat[dat$category=="high-risk group",],
   escalc(measure="OR", ai=sum(non.pos), bi=sum(non.neg), ci=sum(cir.pos), di=sum(cir.neg)))
cat2 <- with(dat[dat$category=="partner study",],
   escalc(measure="OR", ai=sum(non.pos), bi=sum(non.neg), ci=sum(cir.pos), di=sum(cir.neg)))
cat3 <- with(dat[dat$category=="population survey",],
   escalc(measure="OR", ai=sum(non.pos), bi=sum(non.neg), ci=sum(cir.pos), di=sum(cir.neg)))
summary(cat1, transf=exp, digits=2)
#> 
#>     yi ci.lb ci.ub 
#> 1 1.18  1.09  1.28 
#> 
summary(cat2, transf=exp, digits=2)
#> 
#>     yi ci.lb ci.ub 
#> 1 1.42  1.26  1.59 
#> 
summary(cat3, transf=exp, digits=2)
#> 
#>     yi ci.lb ci.ub 
#> 1 0.86  0.77  0.97 
#> 

### naive pooling across all studies
all <- escalc(measure="OR", ai=sum(dat$non.pos), bi=sum(dat$non.neg),
                            ci=sum(dat$cir.pos), di=sum(dat$cir.neg))
summary(all, transf=exp, digits=2)
#> 
#>     yi ci.lb ci.ub 
#> 1 0.94  0.89  0.99 
#> 

### calculate log odds ratios and corresponding sampling variances
dat <- escalc(measure="OR", ai=non.pos, bi=non.neg, ci=cir.pos, di=cir.neg, data=dat)
dat
#> 
#>           study          category non.pos non.neg cir.pos cir.neg      yi     vi 
#> 1       Bwayo 1   high-risk group      92      86     160     612  1.4090 0.0304 
#> 2       Bwayo 2   high-risk group      22      46      37     200  0.9498 0.0992 
#> 3        Kreiss   high-risk group      59      18     254     168  0.7738 0.0824 
#> 4          Hira   high-risk group     418     172      10      10  0.8880 0.2082 
#> 5       Cameron   high-risk group      18      61       6     208  2.3253 0.2434 
#> 6         Pepin   high-risk group       5      42      13     243  0.7999 0.3048 
#> 7    Greenblatt   high-risk group      11      28       8      68  1.2058 0.2663 
#> 8        Diallo   high-risk group      38      46     212     873  1.2243 0.0539 
#> 9      Simonsen   high-risk group      17      70      21     232  0.9869 0.1250 
#> 10      Tyndall   high-risk group      85      93     105     527  1.5233 0.0339 
#> 11        Nasio   high-risk group      86      78     137     580  1.5407 0.0335 
#> 12     Mehendal   high-risk group     837    3411      38     253  0.4909 0.0318 
#> 13    Bollinger   high-risk group      50     241       1      14  1.0663 1.0956 
#> 14     Chiasson   high-risk group      36     797      14     542  0.5589 0.1023 
#> 15       Sassan   high-risk group      75      18     415     221  0.7970 0.0758 
#> 16       Hunter     partner study      43     330     165    3765  1.0897 0.0326 
#> 17       Carael     partner study      90     105      34      45  0.1262 0.0723 
#> 18         Chao     partner study     442    4844      75     232 -1.2649 0.0201 
#> 19         Moss     partner study      24      16      12      17  0.7538 0.2463 
#> 20        Allen     partner study     275     612     132     324  0.0980 0.0159 
#> 21       Sedlin     partner study      32      26      33      40  0.4000 0.1250 
#> 22    Konde-Luc     partner study     153    1516       6     127  0.7590 0.1817 
#> 23      Barongo population survey      55    1356      42     642 -0.4780 0.0443 
#> 24   Grosskurth population survey     158    4604      61    1026 -0.5495 0.0239 
#> 25 Van de Perre population survey      46     224       6      26 -0.1167 0.2313 
#> 26         Seed population survey     171     422      51     192  0.4223 0.0330 
#> 27      Malamba population survey     111     114      21      47  0.7790 0.0867 
#> 28      Quigley population survey     101     272      48     121 -0.0661 0.0427 
#> 29     Urassa 1 population survey      56    1301      42     600 -0.4863 0.0441 
#> 30     Urassa 2 population survey     105    2040      32     426 -0.3780 0.0436 
#> 31     Urassa 3 population survey      38     309      19     158  0.0224 0.0885 
#> 32     Urassa 4 population survey     112     716      54     692  0.6954 0.0303 
#> 33     Urassa 5 population survey     101     365      48     136 -0.2433 0.0408 
#> 

### random-effects model
res <- rma(yi, vi, data=dat, method="DL")
res
#> 
#> Random-Effects Model (k = 33; tau^2 estimator: DL)
#> 
#> tau^2 (estimated amount of total heterogeneity): 0.6397 (SE = 0.2103)
#> tau (square root of estimated tau^2 value):      0.7998
#> I^2 (total heterogeneity / total variability):   92.36%
#> H^2 (total variability / sampling variability):  13.09
#> 
#> Test for Heterogeneity:
#> Q(df = 32) = 419.0087, p-val < .0001
#> 
#> Model Results:
#> 
#> estimate      se    zval    pval   ci.lb   ci.ub      
#>   0.5132  0.1499  3.4248  0.0006  0.2195  0.8069  *** 
#> 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
predict(res, transf=exp, digits=2)
#> 
#>  pred ci.lb ci.ub pi.lb pi.ub 
#>  1.67  1.25  2.24  0.34  8.23 
#> 

### random-effects model within subgroups
res <- rma(yi, vi, data=dat, method="DL", subset=category=="high-risk group")
predict(res, transf=exp, digits=2)
#> 
#>  pred ci.lb ci.ub pi.lb pi.ub 
#>  3.00  2.35  3.83  1.42  6.31 
#> 
res <- rma(yi, vi, data=dat, method="DL", subset=category=="partner study")
predict(res, transf=exp, digits=2)
#> 
#>  pred ci.lb ci.ub pi.lb pi.ub 
#>  1.29  0.62  2.69  0.18  9.45 
#> 
res <- rma(yi, vi, data=dat, method="DL", subset=category=="population survey")
predict(res, transf=exp, digits=2)
#> 
#>  pred ci.lb ci.ub pi.lb pi.ub 
#>  0.96  0.71  1.29  0.38  2.44 
#>