dat.mccurdy2020.Rd
Results from 126 articles that examined the so-called ‘generation effect’.
dat.mccurdy2020
The data frame contains the following columns:
article | numeric | article identifier |
experiment | character | experiment (within article) identifier |
sample | numeric | sample (within experiment) identifier |
id | numeric | row identifier |
pairing | numeric | identifier to indicate paired conditions within experiments |
yi | numeric | mean recall rate for the condition |
vi | numeric | corresponding sampling variance |
ni | numeric | number of participants for the condition |
stimuli | numeric | number of stimuli for the condition |
condition | factor | condition (‘read’ or ‘generate’) |
gen_difficulty | factor | generation difficulty (‘low’ or ‘high’) |
manip_type | factor | manipulation type of the generate versus read condition (using a ‘within’ or ‘between’ subjects design) |
present_style | factor | presentation style (‘mixed’ or ‘pure’ list presentation) |
word_status | factor | word status (‘words’, ‘non-words’, or ‘numbers’) |
memory_test | factor | memory test (‘recognition’, ‘cued recall’, or ‘free recall’) |
memory_type | factor | memory type (‘item’, ‘source’, ‘font color’, ‘font type’, ‘order’, ‘cue word’, ‘background color’, or ‘location’) |
gen_constraint | factor | generation constraint (‘low’, ‘medium’, or ‘high’) |
learning_type | factor | learning type (‘incidental’ or ‘intentional’) |
stimuli_relation | factor | stimuli relation (‘semantic’, ‘category’, ‘antonym’, ‘synonym’, ‘rhyme’, ‘compound words’, ‘definitions’, or ‘unrelated’) |
gen_mode | factor | generation mode (‘verbal/speaking’, ‘covert/thinking’, or ‘writing/typing’) |
gen_task | factor | generation task (‘anagram’, ‘letter transposition’, ‘word fragment’, ‘sentence completion’, ‘word stem’, ‘calculation’, or ‘cue only’) |
attention | factor | attention (‘divided’ or ‘full’) |
pacing | factor | pacing (‘self-paced’ or ‘timed’) |
filler_task | factor | filler task (‘yes’ or ‘no’) |
age_grp | factor | age group (‘younger’ or ‘older’ adults) |
retention_delay | factor | retention delay (‘immediate’, ‘short’, or ‘long’) |
The generation effect is the memory benefit for self-generated compared with read or experimenter-provided information (Jacoby, 1978; Slamecka & Graf, 1978). In a typical study, participants are presented with a list of stimuli (usually words or word pairs). For half of the stimuli, participants self-generate a target word (e.g., open–cl____), while for the other half, participants simply read an intact target word (e.g., above–below). On a later memory test for the target words, the common finding is that self-generated words are better remembered than read words (i.e., the generation effect).
Although several theories have been proposed to explain the generation effect, there is still some debate on the underlying memory mechanism(s) contributing to this phenomenon. The meta-analysis by McCurdy et al. (2020) translated various theories on the generation effect into hypotheses that could then be tested in moderator analyses based on a dataset containing 126 articles, 310 experiments, and 1653 mean recall estimates collected under various conditions.
Detailed explanations of the various variables coded (and how these can be used to test various hypotheses regarding the generation effect) can be found in the article. The most important variable is condition, which denotes whether a particular row of the dataset corresponds to the results of a ‘read’ or a ‘generate’ condition.
The data structure is quite complex. Articles may have reported the findings from multiple experiments involving one or multiple samples that were examined under various conditions. The pairing variable indicates which rows of the dataset represent a pairing of a read condition with one or multiple corresponding generate conditions within an experiment. A pairing may involve the same sample of subjects (when using a within-subjects design for comparing the conditions) or different samples (when using a between-subjects design).
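To make the pairing structure concrete, the following sketch rebuilds the first six rows of the dataset by hand (values copied from the head(dat) output in the examples) and computes the generate-minus-read difference within each pairing. These rows come from a between-subjects comparison, so the two conditions are treated as independent and the sampling variance of the difference is taken as the sum of the two variances. That assumption would not hold for within-subjects pairings, where the correlation between conditions would also be needed. The object names toy, gen, read, and diffs are illustrative only, and the code relies on each of these pairings having exactly one generate row, which is not guaranteed elsewhere in the dataset.

```r
### first six rows of dat.mccurdy2020 (only the columns needed here)
toy <- data.frame(
   pairing   = c(1, 1, 2, 2, 3, 3),
   condition = c("generate", "read", "generate", "read", "generate", "read"),
   yi        = c(0.8790, 0.7130, 0.9060, 0.6840, 0.8420, 0.7530),
   vi        = c(0.00068, 0.00140, 0.00058, 0.00151, 0.00084, 0.00124))

### split into generate and read rows and match them on 'pairing'
gen   <- toy[toy$condition == "generate", ]
read  <- toy[toy$condition == "read", ]
diffs <- merge(gen, read, by = "pairing", suffixes = c(".gen", ".read"))

### generate minus read difference and its variance
### (assumption: independent samples, so the variances simply add)
diffs$diff  <- diffs$yi.gen - diffs$yi.read
diffs$vdiff <- diffs$vi.gen + diffs$vi.read
diffs[, c("pairing", "diff", "vdiff")]
```

All three differences are positive, that is, recall was higher in the generate than in the read condition for each of these pairings.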
McCurdy, M. P., Viechtbauer, W., Sklenar, A. M., Frankenstein, A. N., & Leshikar, E. D. (2020). Theories of the generation effect and the impact of generation constraint: A meta-analytic review. Psychonomic Bulletin & Review, 27(6), 1139–1165. https://doi.org/10.3758/s13423-020-01762-3
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning and Memory, 4(6), 592–604. https://doi.org/10.1037/0278-7393.4.6.592
Jacoby, L. L. (1978). On interpreting the effects of repetition: Solving a problem versus remembering a solution. Journal of Verbal Learning and Verbal Behavior, 17(6), 649–668. https://doi.org/10.1016/S0022-5371(78)90393-6
psychology, memory, proportions, raw means, multilevel models, cluster-robust inference
### copy data into 'dat' and examine data
dat <- dat.mccurdy2020
head(dat)
#>
#> article experiment sample id pairing yi vi ni stimuli condition gen_difficulty manip_type
#> 1 12 1 1 1 1 0.8790 0.00068 12 20 generate <NA> between
#> 2 12 1 2 2 1 0.7130 0.00140 12 20 read <NA> between
#> 3 12 1 1 3 2 0.9060 0.00058 12 20 generate <NA> between
#> 4 12 1 2 4 2 0.6840 0.00151 12 20 read <NA> between
#> 5 12 1 1 5 3 0.8420 0.00084 12 20 generate <NA> between
#> 6 12 1 2 6 3 0.7530 0.00124 12 20 read <NA> between
#> present_style word_status memory_test memory_type gen_constraint learning_type stimuli_relation
#> 1 pure words recognition item medium intentional semantic
#> 2 pure words recognition item <NA> intentional semantic
#> 3 pure words recognition item high intentional category
#> 4 pure words recognition item <NA> intentional category
#> 5 pure words recognition item high intentional antonym
#> 6 pure words recognition item <NA> intentional antonym
#> gen_mode gen_task attention pacing filler_task age_grp retention_delay
#> 1 verbal/speaking word stem full <NA> no younger immediate
#> 2 verbal/speaking word stem full <NA> no younger immediate
#> 3 verbal/speaking word stem full <NA> no younger immediate
#> 4 verbal/speaking word stem full <NA> no younger immediate
#> 5 verbal/speaking word stem full <NA> no younger immediate
#> 6 verbal/speaking word stem full <NA> no younger immediate
#>
### load metafor package
library(metafor)
### fit multilevel mixed-effects meta-regression model
res <- rma.mv(yi, vi, mods = ~ condition,
              random = list(~ 1 | article/experiment/sample/id, ~ 1 | pairing),
              data=dat, sparse=TRUE, digits=3)
res
#>
#> Multivariate Meta-Analysis Model (k = 1653; method: REML)
#>
#> Variance Components:
#>
#> estim sqrt nlvls fixed factor
#> sigma^2.1 0.022 0.148 126 no article
#> sigma^2.2 0.006 0.078 310 no article/experiment
#> sigma^2.3 0.000 0.000 582 no article/experiment/sample
#> sigma^2.4 0.006 0.080 1653 no article/experiment/sample/id
#> sigma^2.5 0.017 0.128 804 no pairing
#>
#> Test for Residual Heterogeneity:
#> QE(df = 1651) = 211160.207, p-val < .001
#>
#> Test of Moderators (coefficient 2):
#> QM(df = 1) = 578.027, p-val < .001
#>
#> Model Results:
#>
#> estimate se zval pval ci.lb ci.ub
#> intrcpt 0.478 0.016 30.446 <.001 0.448 0.509 ***
#> conditiongenerate 0.102 0.004 24.042 <.001 0.094 0.110 ***
#>
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
### proportion of total amount of heterogeneity due to each component
data.frame(source=res$s.names, sigma2=round(res$sigma2, 3),
           prop=round(res$sigma2 / sum(res$sigma2), 2))
#> source sigma2 prop
#> article article 0.022 0.43
#> article/experiment article/experiment 0.006 0.12
#> article/experiment/sample article/experiment/sample 0.000 0.00
#> article/experiment/sample/id article/experiment/sample/id 0.006 0.13
#> pairing pairing 0.017 0.33
### apply cluster-robust inference methods
sav <- robust(res, cluster=article, clubSandwich=TRUE)
sav
#>
#> Multivariate Meta-Analysis Model (k = 1653; method: REML)
#>
#> Variance Components:
#>
#> estim sqrt nlvls fixed factor
#> sigma^2.1 0.022 0.148 126 no article
#> sigma^2.2 0.006 0.078 310 no article/experiment
#> sigma^2.3 0.000 0.000 582 no article/experiment/sample
#> sigma^2.4 0.006 0.080 1653 no article/experiment/sample/id
#> sigma^2.5 0.017 0.128 804 no pairing
#>
#> Test for Residual Heterogeneity:
#> QE(df = 1651) = 211160.207, p-val < .001
#>
#> Number of estimates: 1653
#> Number of clusters: 126
#> Estimates per cluster: 2-48 (mean: 13.12, median: 9)
#>
#> Test of Moderators (coefficient 2):¹
#> F(df1 = 1, df2 = 74.7) = 192.517, p-val < .001
#>
#> Model Results:
#>
#> estimate se¹ tval¹ df¹ pval¹ ci.lb¹ ci.ub¹
#> intrcpt 0.478 0.016 29.382 120.17 <.001 0.446 0.511 ***
#> conditiongenerate 0.102 0.007 13.875 74.7 <.001 0.087 0.117 ***
#>
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> 1) results based on cluster-robust inference (var-cov estimator: CR2,
#> approx t/F-tests and confidence intervals, df: Satterthwaite approx)
#>
### estimated average recall rate in read and generate conditions
predict(sav, newmods = c(0,1), digits=3)
#>
#> pred se ci.lb ci.ub pi.lb pi.ub
#> 1 0.478 0.016 0.446 0.511 0.031 0.926
#> 2 0.581 0.016 0.549 0.612 0.133 1.028
#>
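The prediction interval for the read condition can be approximately reconstructed by hand. The sketch below assumes that the interval has the form pred ± t × sqrt(se² + sum of all variance components), with t taken from a t-distribution using the Satterthwaite degrees of freedom of the intercept. This is an assumption about how the interval is formed, not something stated in the output. All inputs are the rounded values printed above, so the result can only be expected to agree with the printed bounds (0.031 and 0.926) to about two decimal places. The object names are illustrative only.

```r
### rounded values taken from the printed output above
pred   <- 0.478                                  # predicted recall rate (read condition)
se     <- 0.016                                  # cluster-robust standard error
df     <- 120.17                                 # Satterthwaite df of the intercept
sigma2 <- c(0.022, 0.006, 0.000, 0.006, 0.017)   # the five variance components

### assumed form: add the total heterogeneity to the squared standard error
pi.se <- sqrt(se^2 + sum(sigma2))
crit  <- qt(0.975, df = df)
pint  <- c(pi.lb = pred - crit * pi.se, pi.ub = pred + crit * pi.se)
round(pint, 2)
```

The width of the interval is driven almost entirely by the variance components rather than by the standard error of the estimate. Note also that the printed upper bound for the generate condition (1.028) exceeds 1, since the interval is computed on the raw scale of the recall rates and is not constrained to the 0 to 1 range.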