Function that checks whether a variable has no duplicated values within each subject.

check.nodup(x, id, data, out=1, na.rm=TRUE)

Arguments

x

argument to specify the variable to check.

id

argument to specify a subject id variable.

data

optional data frame that contains the variables specified above.

out

either a string or an integer (1 = "logical", 2 = "id", 3 = "data") indicating what information should be returned in case there are subjects where the variable has duplicates (default is 1).

na.rm

logical indicating whether missing values should be removed before checking (default is TRUE).

Details

The function checks whether the values of a variable are unique within each subject, that is, whether no value occurs more than once for the same subject.

When na.rm=TRUE (the default), missing values are ignored. When na.rm=FALSE, missing values are treated as values distinct from any non-missing value. See ‘Examples’.
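The effect of dropping or keeping missing values can be illustrated with base R alone. The following is only a sketch for the values of a single subject; it does not call check.nodup, and whether the function treats repeated missing values exactly as base R's anyDuplicated() does is an assumption. The names dup_ignoring_na and dup_keeping_na are made up for this illustration.

```r
# values of one (hypothetical) subject, with two missing values
x <- c(1, NA, 2, NA)

# analogue of na.rm=TRUE: drop missing values first, then look for repeats
dup_ignoring_na <- anyDuplicated(x[!is.na(x)]) > 0

# analogue of na.rm=FALSE (assumed behavior): keep NA as a value of its own;
# base R then counts the second NA as a repeat of the first
dup_keeping_na <- anyDuplicated(x) > 0

dup_ignoring_na
#> [1] FALSE
dup_keeping_na
#> [1] TRUE
```

In this sketch, the non-missing values 1 and 2 are never flagged; only the repeated NA makes a difference between the two settings.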

Value

When out = 1 or out = "logical", the function returns a single logical: TRUE if the variable has no duplicated values within any subject, and FALSE otherwise.

When out = 2 or out = "id", the function returns a vector with the ids of those subjects where the variable has duplicated values.

When out = 3 or out = "data", the function returns the data for those subjects where the variable has duplicated values.
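How the three return types relate to each other can be sketched in base R. The helper nodup_sketch() below is hypothetical and is not the package's implementation: it accepts only the string form of out, always ignores missing values, and takes the variables as vectors rather than as unquoted names.

```r
# hypothetical helper (not part of the package): per-subject duplicate check
nodup_sketch <- function(x, id, data, out = "logical") {
  # TRUE for each subject whose non-missing values contain a repeat
  has_dup <- tapply(x, id, function(v) anyDuplicated(v[!is.na(v)]) > 0)
  # ids (as character strings) of the subjects with duplicated values
  dup_ids <- names(has_dup)[has_dup]
  switch(out,
         logical = length(dup_ids) == 0,
         id      = dup_ids,
         data    = data[as.character(id) %in% dup_ids, , drop = FALSE],
         stop("unknown value for 'out'"))
}

# small made-up dataset: subject 2 has the value 2 twice
d <- data.frame(subj = rep(1:2, each = 3),
                obs  = c(1, 2, 3, 1, 2, 2))

res_logical <- nodup_sketch(d$obs, d$subj, d, out = "logical")
res_id      <- nodup_sketch(d$obs, d$subj, d, out = "id")
res_data    <- nodup_sketch(d$obs, d$subj, d, out = "data")

res_logical
#> [1] FALSE
res_id
#> [1] "2"
```

Here res_data contains the three rows of subject 2, which mirrors the idea that out = "data" returns all rows of the affected subjects and not only the duplicated rows.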

Author

Wolfgang Viechtbauer wvb@wvbauer.com

Examples

# illustrative dataset
dat <- data.frame(subj=rep(1:4, each=5),
                  obs = 1:5,
                  age = rep(c(20,31,27,22), each=5),
                  stress = c(2,3,1,4,2, 3,3,3,3,3, 1,1,2,6,4, 1,2,1,3,1))
dat
#>    subj obs age stress
#> 1     1   1  20      2
#> 2     1   2  20      3
#> 3     1   3  20      1
#> 4     1   4  20      4
#> 5     1   5  20      2
#> 6     2   1  31      3
#> 7     2   2  31      3
#> 8     2   3  31      3
#> 9     2   4  31      3
#> 10    2   5  31      3
#> 11    3   1  27      1
#> 12    3   2  27      1
#> 13    3   3  27      2
#> 14    3   4  27      6
#> 15    3   5  27      4
#> 16    4   1  22      1
#> 17    4   2  22      2
#> 18    4   3  22      1
#> 19    4   4  22      3
#> 20    4   5  22      1

# check that variable obs has no duplicated values within subjects
check.nodup(obs, subj, data=dat)
#> [1] TRUE

# introduce a duplicated value for the third subject
dat$obs[13] <- 2

# check again whether variable obs has no duplicated values within subjects
check.nodup(obs, subj, data=dat)
#> [1] FALSE

# for which subjects are there duplicated values?
check.nodup(obs, subj, data=dat, out=2)
#> [1] "3"

# show the data for those subjects
check.nodup(obs, subj, data=dat, out=3)
#>    subj obs age stress
#> 11    3   1  27      1
#> 12    3   2  27      1
#> 13    3   2  27      2
#> 14    3   4  27      6
#> 15    3   5  27      4