Function that aggregates a dataset to the subject level.

aggreg(data, id, vars, grep=FALSE, na.rm=TRUE)

Arguments

data

data frame to aggregate.

id

variable in data that identifies the subjects (e.g., subj in the examples below).

vars

optional character vector giving the names of the variables to aggregate, or a numeric vector giving the positions of the corresponding columns in the data frame. If unspecified, all variables are aggregated (as in the first aggregation example below).

grep

logical indicating whether the variable names should be matched using grep() (default is FALSE).

na.rm

logical indicating whether missing values should be removed before aggregating the variables (default is TRUE).
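
As a rough illustration of what the last two options imply, the base R calls below contrast exact name matching with pattern matching on column names, and show the effect of removing missing values before averaging. This is only a sketch using plain grep() and mean(); the objects nms, exact, fuzzy, and x are made up for illustration, and how aggreg() applies these steps internally is an assumption.

```r
# column names as in the example dataset (assumed here for illustration)
nms <- c("subj", "sex", "obs", "age", "stress")

# exact matching (the idea behind grep=FALSE): only a literal name matches
exact <- nms[nms %in% "s"]

# pattern matching (the idea behind grep=TRUE): every name starting with "s" matches
fuzzy <- grep("^s", nms, value=TRUE)

# effect of na.rm when computing a mean
x <- c(2, 3, NA, 4, 2)
m_keep <- mean(x, na.rm=FALSE)   # missing value propagates
m_drop <- mean(x, na.rm=TRUE)    # mean of the four observed values

exact
fuzzy
m_keep
m_drop
```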

Details

The function aggregates a dataset in the long format to the subject level. For numeric, integer, and logical variables, the subject-level means are computed. For factors and character variables, the first (non-missing) value is returned.
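
The rule described above can be sketched in base R as follows. This is not the function's actual implementation, just a minimal illustration under stated assumptions: agg_one is a hypothetical helper, and the small dataset is made up for the purpose.

```r
# small long-format dataset (made up for illustration)
dat <- data.frame(subj   = rep(1:2, each=3),
                  sex    = rep(c("male", "female"), each=3),
                  stress = c(2, NA, 5, 1, 2, 6))

# hypothetical helper: mean for numeric/integer/logical columns,
# first non-missing value (as character) for everything else
agg_one <- function(x, na.rm=TRUE) {
   if (is.numeric(x) || is.logical(x)) {
      mean(x, na.rm=na.rm)
   } else {
      x <- as.character(x)
      x[!is.na(x)][1]
   }
}

# split the rows by subject and apply the helper column by column
res <- do.call(rbind, lapply(split(dat, dat$subj), function(d) {
   as.data.frame(lapply(d, agg_one), stringsAsFactors=FALSE)
}))
rownames(res) <- NULL
res
```

Each subject contributes one row to res, so the result has as many rows as there are distinct subject ids.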

Value

A data frame.

Author

Wolfgang Viechtbauer (wvb@wvbauer.com)

Examples

# illustrative dataset
dat <- data.frame(subj=rep(1:4, each=5),
                  sex = rep(c("male", "female"), each=2*5),
                  obs = 1:5,
                  age = rep(c(20,31,27,22), each=5),
                  stress = c(2,3,NA,4,2, 3,3,NA,3,NA, 1,1,2,6,4, 1,2,1,3,1))
dat
#>    subj    sex obs age stress
#> 1     1   male   1  20      2
#> 2     1   male   2  20      3
#> 3     1   male   3  20     NA
#> 4     1   male   4  20      4
#> 5     1   male   5  20      2
#> 6     2   male   1  31      3
#> 7     2   male   2  31      3
#> 8     2   male   3  31     NA
#> 9     2   male   4  31      3
#> 10    2   male   5  31     NA
#> 11    3 female   1  27      1
#> 12    3 female   2  27      1
#> 13    3 female   3  27      2
#> 14    3 female   4  27      6
#> 15    3 female   5  27      4
#> 16    4 female   1  22      1
#> 17    4 female   2  22      2
#> 18    4 female   3  22      1
#> 19    4 female   4  22      3
#> 20    4 female   5  22      1

# aggregate the dataset
aggreg(dat, subj)
#>   subj    sex obs age stress
#> 1    1   male   3  20   2.75
#> 2    2   male   3  31   3.00
#> 3    3 female   3  27   2.80
#> 4    4 female   3  22   1.60

# aggregate the dataset for selected variables (specified by name)
aggreg(dat, subj, vars=c("subj","stress"))
#>   subj stress
#> 1    1   2.75
#> 2    2   3.00
#> 3    3   2.80
#> 4    4   1.60

# aggregate the dataset for selected variables (specified by column position)
aggreg(dat, subj, vars=1:2)
#>   subj    sex
#> 1    1   male
#> 2    2   male
#> 3    3 female
#> 4    4 female
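
The subject-level stress means shown in the examples can be cross-checked with base R alone. The sketch below recreates only the subj and stress columns of the illustrative dataset and uses tapply(); the object name chk is made up for illustration.

```r
# the subject ids and stress scores from the illustrative dataset
subj   <- rep(1:4, each=5)
stress <- c(2,3,NA,4,2, 3,3,NA,3,NA, 1,1,2,6,4, 1,2,1,3,1)

# mean per subject, dropping missing values first (as with na.rm=TRUE)
chk <- tapply(stress, subj, mean, na.rm=TRUE)
chk
```

The four values agree with the stress column of the aggregated data frame (2.75, 3.00, 2.80, and 1.60).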