Suppose we want to load all the Eprime files in a directory and combine the results in a single dataframe.
My strategy in this scenario is to figure out what I need to do for a single file and then wrap those steps in a function that takes a filepath to a txt file and returns a dataframe. After some exploration and interactive programming, I come up with the following function.
library("plyr")
# read_eprime, FrameList, keep_levels and to_data_frame come from rprime
library("rprime")
reduce_sails <- function(sails_path) {
sails_lines <- read_eprime(sails_path)
sails_frames <- FrameList(sails_lines)
# Trials occur at level 3
sails_frames <- keep_levels(sails_frames, 3)
sails <- to_data_frame(sails_frames)
# Tidy up
to_pick <- c("Eprime.Basename", "Running", "Module", "Sound",
"Sample", "Correct", "Response")
sails <- sails[to_pick]
running_map <- c(TrialLists = "Trial", PracticeBlock = "Practice")
sails$Running <- running_map[sails$Running]
# Renumber trials in the practice and experimental blocks separately.
# Numerically code correct response.
sails <- ddply(sails, .(Running), mutate,
TrialNumber = seq_along(Running),
CorrectResponse = ifelse(Correct == Response, 1, 0))
sails$Sample <- NULL
# Optionally, one might save the processed file via:
# csv <- paste0(tools::file_path_sans_ext(sails_path), ".csv")
# write.csv(sails, csv, row.names = FALSE)
sails
}
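The recoding step inside the function relies on indexing a named character vector by name. Here is a minimal sketch of that idiom on its own, using made-up Running values rather than a real Eprime file:

```r
# Lookup table: names are the raw Eprime labels, values are the tidy labels
running_map <- c(TrialLists = "Trial", PracticeBlock = "Practice")

# Hypothetical raw values, standing in for sails$Running
running <- c("PracticeBlock", "PracticeBlock", "TrialLists")

# Indexing by name returns one tidy label per raw value.
# as.character() guards against a factor column, which would otherwise
# be indexed by its integer codes rather than its labels.
recoded <- unname(running_map[as.character(running)])
recoded
#> [1] "Practice" "Practice" "Trial"
```

A value with no entry in the lookup table comes back as NA, so a quick anyNA() check after recoding can catch unexpected block names.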
Here’s a preview of what the function returns when given a filepath.
head(reduce_sails("data/SAILS/SAILS_001X00XS1.txt"))
#> Eprime.Basename Running Module Sound Correct Response TrialNumber
#> 1 SAILS_001X00XS1 Practice LAKE LAKE1.WAV Word Word 1
#> 2 SAILS_001X00XS1 Practice LAKE MAKE.WAV NotWord NotWord 2
#> 3 SAILS_001X00XS1 Practice LAKE LAKE1.WAV Word Word 3
#> 4 SAILS_001X00XS1 Practice LAKE LAKE1.WAV Word Word 4
#> 5 SAILS_001X00XS1 Practice LAKE MAKE.WAV NotWord NotWord 5
#> 6 SAILS_001X00XS1 Practice LAKE LAKE1.WAV Word Word 6
#> CorrectResponse
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 1
#> 6 1
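The ddply and mutate step numbers the trials within each block and scores each response. For readers who want to see that logic without plyr, here is a rough base R equivalent on a toy dataframe. The rows are invented for illustration, and ave is my substitution here, not what reduce_sails uses.

```r
# Toy data standing in for the tidied sails dataframe (values are invented)
toy <- data.frame(
  Running  = c("Practice", "Practice", "Trial", "Trial", "Trial"),
  Correct  = c("Word", "NotWord", "Word", "Word", "NotWord"),
  Response = c("Word", "NotWord", "NotWord", "Word", "NotWord"),
  stringsAsFactors = FALSE
)

# Number the rows within each level of Running, restarting at 1 per block
toy$TrialNumber <- ave(seq_along(toy$Running), toy$Running, FUN = seq_along)

# 1 when the response matches the expected answer, 0 otherwise
toy$CorrectResponse <- ifelse(toy$Correct == toy$Response, 1, 0)

toy$TrialNumber
#> [1] 1 2 1 2 3
toy$CorrectResponse
#> [1] 1 1 0 1 1
```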
Now that the function works on one file, I can use ldply to apply the function to several files, returning the results in a single dataframe. (With dplyr, I would lapply the function over the paths to get a list of dataframes, then use bind_rows to combine them into a single dataframe.)
# pattern is a regular expression: escape the dot and anchor at the end
sails_paths <- list.files("data/SAILS/", pattern = "\\.txt$", full.names = TRUE)
sails_paths
#> [1] "data/SAILS/SAILS_001X00XS1.txt" "data/SAILS/SAILS_002X00XS1.txt"
ensemble <- ldply(sails_paths, reduce_sails)
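The dplyr route mentioned above might look like the following sketch. To keep it runnable without Eprime files, toy_reduce is a hypothetical stand-in for reduce_sails that builds a one-row dataframe per path. The sketch assumes the dplyr package is installed.

```r
# Hypothetical stand-in for reduce_sails(): builds a one-row dataframe
# from the path instead of parsing an Eprime file
toy_reduce <- function(path) {
  data.frame(
    Eprime.Basename = tools::file_path_sans_ext(basename(path)),
    CorrectResponse = 1,
    stringsAsFactors = FALSE
  )
}

toy_paths <- c("data/SAILS/SAILS_001X00XS1.txt",
               "data/SAILS/SAILS_002X00XS1.txt")

# lapply returns a list with one dataframe per path
frames <- lapply(toy_paths, toy_reduce)

# bind_rows stacks the list into a single dataframe
ensemble_dplyr <- dplyr::bind_rows(frames)
nrow(ensemble_dplyr)
#> [1] 2
```

Calling bind_rows with the dplyr:: prefix avoids attaching dplyr alongside plyr, since the two packages define several functions with the same names (summarise and mutate among them).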
Finally, with all of the subjects’ data contained in a single dataframe, I can use ddply plus summarise to compute summary scores at different levels of aggregation within each subject.
# Score trials within subjects
overall <- ddply(ensemble, .(Eprime.Basename, Running), summarise,
Score = sum(CorrectResponse),
PropCorrect = Score / length(CorrectResponse))
overall
#> Eprime.Basename Running Score PropCorrect
#> 1 SAILS_001X00XS1 Practice 10 1.0000000
#> 2 SAILS_001X00XS1 Trial 61 0.8714286
#> 3 SAILS_002X00XS1 Practice 9 0.9000000
#> 4 SAILS_002X00XS1 Trial 57 0.8142857
# Score modules within subjects
modules <- ddply(ensemble, .(Eprime.Basename, Running, Module), summarise,
Score = sum(CorrectResponse),
PropCorrect = mean(CorrectResponse))
modules
#> Eprime.Basename Running Module Score PropCorrect
#> 1 SAILS_001X00XS1 Practice LAKE 10 1.0000000
#> 2 SAILS_001X00XS1 Trial CAT 9 0.9000000
#> 3 SAILS_001X00XS1 Trial LAKE 10 1.0000000
#> 4 SAILS_001X00XS1 Trial RAT 16 0.8000000
#> 5 SAILS_001X00XS1 Trial SUE 26 0.8666667
#> 6 SAILS_002X00XS1 Practice LAKE 9 0.9000000
#> 7 SAILS_002X00XS1 Trial CAT 8 0.8000000
#> 8 SAILS_002X00XS1 Trial LAKE 10 1.0000000
#> 9 SAILS_002X00XS1 Trial RAT 11 0.5500000
#> 10 SAILS_002X00XS1 Trial SUE 28 0.9333333
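The two summaries compute PropCorrect in different ways (Score / length(CorrectResponse) versus mean(CorrectResponse)). For a column of 0s and 1s these are the same quantity. Here is a small sanity check, using a made-up response vector plus the module scores printed above for SAILS_001X00XS1:

```r
# Made-up 0/1 scores standing in for one group's CorrectResponse values
correct_response <- c(1, 1, 0, 1, 0, 1, 1, 1, 1, 1)

score <- sum(correct_response)
prop_by_division <- score / length(correct_response)
prop_by_mean <- mean(correct_response)

# Both routes give the proportion of correct trials
all.equal(prop_by_division, prop_by_mean)
#> [1] TRUE

# Module scores for SAILS_001X00XS1 experimental trials (CAT, LAKE, RAT, SUE),
# copied from the modules table; they should add up to the overall Score of 61
module_scores <- c(CAT = 9, LAKE = 10, RAT = 16, SUE = 26)
sum(module_scores)
#> [1] 61
```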