aggreg.Rd
Aggregates a dataset in the long format to the subject level.
aggreg(data, id, vars, grep=FALSE, na.rm=TRUE)
data: data frame to aggregate.
id: the subject id variable that identifies which rows belong to the same subject (given unquoted in the examples, e.g. subj).
vars: optional character vector giving the names of the variables to aggregate, or a numeric vector giving the positions of the corresponding columns in the data frame. If omitted, all variables are aggregated, as in the first example.
grep: logical indicating whether variable names should be matched using grep (default is FALSE).
na.rm: logical indicating whether missing values should be removed before aggregating the variables (default is TRUE).
The function aggregates a dataset in the long format to the subject level. For numeric, integer, and logical variables, the subject-level means are computed. For factors and character variables, the first (non-missing) value is returned.
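The rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: agg_one and agg_sketch are hypothetical helpers written only for illustration, the id is passed as a character string rather than unquoted, factors are returned as character values, and missing values are always skipped for non-numeric columns, which is an assumption.

```r
# Hypothetical helper (not part of the package): collapse one column
# to a single value following the documented rule.
agg_one <- function(x, na.rm = TRUE) {
  if (is.numeric(x) || is.logical(x)) {
    # numeric, integer, and logical variables: subject-level mean
    mean(x, na.rm = na.rm)
  } else {
    # factors and character variables: first non-missing value
    x <- as.character(x)
    x <- x[!is.na(x)]
    if (length(x) > 0) x[1] else NA_character_
  }
}

# Hypothetical stand-in for aggreg(): split the rows by subject,
# collapse every column, and stack the one-row results.
agg_sketch <- function(data, id, na.rm = TRUE) {
  parts <- split(data, data[[id]])
  rows <- lapply(parts, function(d) {
    as.data.frame(lapply(d, agg_one, na.rm = na.rm),
                  stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# small illustrative dataset (two subjects, three observations each)
toy <- data.frame(subj   = rep(1:2, each = 3),
                  sex    = rep(c("male", "female"), each = 3),
                  stress = c(2, NA, 4, 1, 2, 3))
res <- agg_sketch(toy, "subj")
res
```

With na.rm = TRUE, the missing stress value of subject 1 is dropped before the mean is taken, so the mean is based on the two remaining observations.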
A data frame.
# illustrative dataset
dat <- data.frame(subj=rep(1:4, each=5),
sex = rep(c("male", "female"), each=2*5),
obs = 1:5,
age = rep(c(20,31,27,22), each=5),
stress = c(2,3,NA,4,2, 3,3,NA,3,NA, 1,1,2,6,4, 1,2,1,3,1))
dat
#> subj sex obs age stress
#> 1 1 male 1 20 2
#> 2 1 male 2 20 3
#> 3 1 male 3 20 NA
#> 4 1 male 4 20 4
#> 5 1 male 5 20 2
#> 6 2 male 1 31 3
#> 7 2 male 2 31 3
#> 8 2 male 3 31 NA
#> 9 2 male 4 31 3
#> 10 2 male 5 31 NA
#> 11 3 female 1 27 1
#> 12 3 female 2 27 1
#> 13 3 female 3 27 2
#> 14 3 female 4 27 6
#> 15 3 female 5 27 4
#> 16 4 female 1 22 1
#> 17 4 female 2 22 2
#> 18 4 female 3 22 1
#> 19 4 female 4 22 3
#> 20 4 female 5 22 1
# aggregate the dataset
aggreg(dat, subj)
#> subj sex obs age stress
#> 1 1 male 3 20 2.75
#> 2 2 male 3 31 3.00
#> 3 3 female 3 27 2.80
#> 4 4 female 3 22 1.60
# aggregate the dataset for variables selected by name
aggreg(dat, subj, vars=c("subj","stress"))
#> subj stress
#> 1 1 2.75
#> 2 2 3.00
#> 3 3 2.80
#> 4 4 1.60
# aggregate the dataset for variables selected by column position
aggreg(dat, subj, vars=1:2)
#> subj sex
#> 1 1 male
#> 2 2 male
#> 3 3 female
#> 4 4 female
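The examples above do not cover grep=TRUE. The sketch below shows, in base R only, how pattern matching against column names could select variables. How aggreg() applies the patterns internally is an assumption here; the code calls only base grep() and does not call aggreg().

```r
# column names of the illustrative dataset
nms <- c("subj", "sex", "obs", "age", "stress")

# hypothetical pattern: every variable whose name starts with "s"
pats <- c("^s")

# positions of all names matched by at least one pattern
hits <- unique(unlist(lapply(pats, grep, x = nms)))

# names that would be selected under this assumed matching rule
selected <- nms[hits]
selected
```

Under this assumption, a pattern selects every column it matches, so a short pattern can pick up more variables than intended; anchoring with ^ or $ narrows the match.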