Synopsis

The aim of this document is to provide all the analyses for Experiment 1 (Overlap all 3). In this experiment, participants were presented with displays of four words: the target, a high phonological competitor, a low phonological competitor, and a distractor. The document is structured as follows:

  1. Data Cleaning and Pre-processing

  2. Exploratory Analyses

  3. Growth Curve Analysis

  4. Graphs

Libraries

Load the required libraries

library(ggplot2)
library(lme4)
library(lmerTest)
library(dplyr)
library(parallel)
library(doParallel)
library(tidyr)
library(gridExtra)
library(caret)

Data Loading

First, I load the data file indicating the trials that were removed for each subject.

deletedTrials<-read.csv("delete.csv")

Now I load the csv files with the Exp1 data. There are two files (Exp1Part1 and Exp1Part2). Not all columns in those files are required, so to preserve memory only the following columns are loaded into R: RECORDING_SESSION_LABEL, IA_FIRST_RUN_START_TIME, IA_FIRST_RUN_END_TIME, IA_SECOND_RUN_START_TIME, IA_SECOND_RUN_END_TIME, IA_THIRD_RUN_START_TIME, IA_THIRD_RUN_END_TIME, IA_LABEL, RESPONSE_ACC, IA_DWELL_TIME, TRIAL_INDEX, target, RESPONSE_RT, trialtype, oldnewitem, MOUSE_TIME, and DISPLAY_TIME. (Note that IA_FIRST_RUN_START_TIME appears twice in the selection below, which is why a duplicate column, IA_FIRST_RUN_START_TIME.1, shows up later.)

Part1 <- read.csv("Exp1Part1.csv", na.strings = c("."))[,c('RECORDING_SESSION_LABEL', 'IA_FIRST_RUN_START_TIME', 'IA_FIRST_RUN_START_TIME', 'IA_FIRST_RUN_END_TIME', 'IA_SECOND_RUN_START_TIME', 'IA_SECOND_RUN_END_TIME', 'IA_THIRD_RUN_START_TIME', 'IA_THIRD_RUN_END_TIME', 'IA_LABEL', 'RESPONSE_ACC', 'IA_DWELL_TIME', 'TRIAL_INDEX', 'target', 'RESPONSE_RT', 'trialtype','oldnewitem', 'MOUSE_TIME','DISPLAY_TIME')]

#now load up part 2
Part2 <- read.csv("Exp1Part2.csv", na.strings = c("."))[,c('RECORDING_SESSION_LABEL', 'IA_FIRST_RUN_START_TIME', 'IA_FIRST_RUN_START_TIME', 'IA_FIRST_RUN_END_TIME', 'IA_SECOND_RUN_START_TIME', 'IA_SECOND_RUN_END_TIME', 'IA_THIRD_RUN_START_TIME', 'IA_THIRD_RUN_END_TIME', 'IA_LABEL', 'RESPONSE_ACC', 'IA_DWELL_TIME', 'TRIAL_INDEX', 'target', 'RESPONSE_RT', 'trialtype','oldnewitem', 'MOUSE_TIME','DISPLAY_TIME')]

#combine part 1 and part 2 into one dataframe
EyeData<-rbind(Part1, Part2)

Data Cleaning and Pre-Processing

Right now we have a dataset with all the subjects that took part in Experiment 1. The code below performs some preprocessing aimed at removing bad trials (e.g., trials with many off-screen fixations, trials in which the fixations do not start at the center of the screen, or trials in which the eye tracking was poor). The decision about which trials to remove was made by two graduate students, Stas and Julie; the deleted trials are listed in the deletedTrials file.

#this part of the code is not very efficient; need to figure out later how to optimize it (a possible alternative is sketched below)
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="1sse1" & EyeData$TRIAL_INDEX==50),]
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="12spe1" & EyeData$TRIAL_INDEX==73),]
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="7spe1" & EyeData$TRIAL_INDEX==38),]
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="4tle1" & EyeData$TRIAL_INDEX==53),]
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="9spe1" & EyeData$TRIAL_INDEX==87),]
EyeData<-EyeData[!c(EyeData$RECORDING_SESSION_LABEL=="11spe2" & EyeData$TRIAL_INDEX==11),]
24/10880 #proportion of rows removed (6 trials x 4 interest areas = 24 rows)
## [1] 0.002205882

Only 6 trials have been deleted, which constitutes about 0.22% of the total number of trials.
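
The hard-coded filters above could also be replaced with a single anti-join against the deletedTrials file loaded earlier. This is only a sketch: it assumes deletedTrials identifies the flagged trials with columns named RECORDING_SESSION_LABEL and TRIAL_INDEX, which is not shown in this document.

#sketch: drop every (session, trial) pair listed in deletedTrials
#assumes deletedTrials has columns RECORDING_SESSION_LABEL and TRIAL_INDEX
EyeData <- EyeData %>%
    anti_join(deletedTrials, by = c("RECORDING_SESSION_LABEL", "TRIAL_INDEX"))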

In the code below, I convert the NAs to zeros and calculate the RT as the time from the onset of the word display to the mouse click.

#change all the NAs to zeros
EyeData[is.na(EyeData)] <- 0
#Calculate RT from the display onset until the mouse click is made
EyeData$RESPONSE_RT<-EyeData$MOUSE_TIME-EyeData$DISPLAY_TIME
#check the dimensions of the dataset
dim(EyeData)
## [1] 10880    18
#remove the MOUSE_TIME and DISPLAY_TIME variables
EyeData<-EyeData %>% select (1:16)

#rename the RECORDING_SESSION_LABEL into Subject
colnames(EyeData)[1] <- "Subject"
#examine column names
names(EyeData)
##  [1] "Subject"                   "IA_FIRST_RUN_START_TIME"  
##  [3] "IA_FIRST_RUN_START_TIME.1" "IA_FIRST_RUN_END_TIME"    
##  [5] "IA_SECOND_RUN_START_TIME"  "IA_SECOND_RUN_END_TIME"   
##  [7] "IA_THIRD_RUN_START_TIME"   "IA_THIRD_RUN_END_TIME"    
##  [9] "IA_LABEL"                  "RESPONSE_ACC"             
## [11] "IA_DWELL_TIME"             "TRIAL_INDEX"              
## [13] "target"                    "RESPONSE_RT"              
## [15] "trialtype"                 "oldnewitem"

Examine the structure of the dataframe

str(EyeData)
## 'data.frame':    10880 obs. of  16 variables:
##  $ Subject                  : Factor w/ 29 levels "10spe1","11spe2",..: 16 16 16 16 16 16 16 16 16 16 ...
##  $ IA_FIRST_RUN_START_TIME  : num  2788 0 2350 2548 2126 ...
##  $ IA_FIRST_RUN_START_TIME.1: num  2788 0 2350 2548 2126 ...
##  $ IA_FIRST_RUN_END_TIME    : num  3912 0 2495 2726 3096 ...
##  $ IA_SECOND_RUN_START_TIME : num  0 0 0 0 0 ...
##  $ IA_SECOND_RUN_END_TIME   : num  0 0 0 0 0 ...
##  $ IA_THIRD_RUN_START_TIME  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ IA_THIRD_RUN_END_TIME    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ IA_LABEL                 : Factor w/ 9 levels "CompetitorHigh ",..: 3 1 2 4 3 1 2 4 3 1 ...
##  $ RESPONSE_ACC             : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ IA_DWELL_TIME            : int  1115 0 146 179 971 0 0 0 614 281 ...
##  $ TRIAL_INDEX              : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ target                   : Factor w/ 94 levels "bald","bang",..: 39 39 39 39 31 31 31 31 16 16 ...
##  $ RESPONSE_RT              : num  2204 2204 2204 2204 1380 ...
##  $ trialtype                : Factor w/ 4 levels "Exp","FillerRel",..: 4 4 4 4 4 4 4 4 4 4 ...
##  $ oldnewitem               : Factor w/ 3 levels "New","Old","x": 3 3 3 3 3 3 3 3 3 3 ...

Subset the experimental trials. This removes all other trial types, such as FillerRel, FillerUnrel, and Practice trials.

EyeData <- EyeData %>% filter(trialtype=="Exp")

Check subject accuracy

SubjectAccuracy<- EyeData %>% group_by(Subject) %>%
    summarise(MeanAccuracy=mean(RESPONSE_ACC)) %>% 
    arrange(MeanAccuracy)
SubjectAccuracy
## Source: local data frame [29 x 2]
## 
##    Subject MeanAccuracy
##     (fctr)        (dbl)
## 1    1spe1    0.9666667
## 2    1tle1    0.9666667
## 3    2sre1    0.9666667
## 4    4sre1    0.9666667
## 5    8spe1    0.9666667
## 6   10spe1    1.0000000
## 7   11spe2    1.0000000
## 8    1sse1    1.0000000
## 9    2spe1    1.0000000
## 10   2sse1    1.0000000
## ..     ...          ...
mean(SubjectAccuracy$MeanAccuracy)
## [1] 0.9942529

Subject accuracy is very high. Almost all the subjects have 100% accuracy; only 5 participants have a few incorrect trials.

Remove trials with incorrect responses

CorrectEyeData<-EyeData %>% filter(RESPONSE_ACC==1)
#number of rows deleted
nrow(EyeData)-nrow(CorrectEyeData) #20 rows get deleted, which corresponds to 5 incorrect trials
## [1] 20
20/3468 #that's about 0.58% of experimental trials
## [1] 0.005767013

Set up start-time and end-time columns for each run (fixation pass) in the interest area. If a run exists, the new columns carry that run's start and end times; if there is no fixation in the IA during that run, they are set to 0. These columns are used below to code each 25 ms time bin as 1 (fixated) or 0 (not fixated).

CorrectEyeData$Fststart <- ifelse(as.numeric(as.character(CorrectEyeData$IA_FIRST_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_FIRST_RUN_START_TIME)), 0) 
CorrectEyeData$Fstend <- ifelse(as.numeric(as.character(CorrectEyeData$IA_FIRST_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_FIRST_RUN_END_TIME)), 0)
CorrectEyeData$Secstart <- ifelse(as.numeric(as.character(CorrectEyeData$IA_SECOND_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_SECOND_RUN_START_TIME)), 0)
CorrectEyeData$Secend <- ifelse(as.numeric(as.character(CorrectEyeData$IA_SECOND_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_SECOND_RUN_END_TIME)), 0)
CorrectEyeData$Thirdstart <- ifelse(as.numeric(as.character(CorrectEyeData$IA_THIRD_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_THIRD_RUN_START_TIME)), 0)
CorrectEyeData$Thirdend <- ifelse(as.numeric(as.character(CorrectEyeData$IA_THIRD_RUN_START_TIME)) > 0, as.numeric(as.character(CorrectEyeData$IA_THIRD_RUN_END_TIME)), 0)
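
The six ifelse() calls above could also be written as a short loop over the three runs. This is just a sketch of an equivalent formulation, relying only on the column-name pattern visible in the code above.

#sketch: build the same start/end columns in a loop over the three runs
runs <- c(Fst = "IA_FIRST_RUN", Sec = "IA_SECOND_RUN", Third = "IA_THIRD_RUN")
for (r in names(runs)) {
    s <- as.numeric(as.character(CorrectEyeData[[paste0(runs[[r]], "_START_TIME")]]))
    e <- as.numeric(as.character(CorrectEyeData[[paste0(runs[[r]], "_END_TIME")]]))
    CorrectEyeData[[paste0(r, "start")]] <- ifelse(s > 0, s, 0)
    CorrectEyeData[[paste0(r, "end")]] <- ifelse(s > 0, e, 0)
}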

Generate time bins from 0 to 6000 ms in 25 ms steps and create an empty matrix with one row per trial-by-interest-area observation and one column per bin

time <- seq(0, 6000, by=25)
tmatrix <- matrix(nrow=nrow(CorrectEyeData), ncol=length(time))
dim(tmatrix)
## [1] 3448  241

Fill the matrix: for each row and each time bin, assign 1 if the bin falls within the first, second, or third fixation run, and 0 otherwise

registerDoParallel(3)
for(i in 1:nrow(tmatrix)) {
for(j in 1:length(time)) {

tmatrix[i,j] <-  ifelse(CorrectEyeData$Fststart[i] < time[j] & 
                CorrectEyeData$Fstend[i] > time[j] |CorrectEyeData$Secstart[i] <
                time[j] & CorrectEyeData$Secend[i] > time[j] | CorrectEyeData$Thirdstart[i] 
                < time[j] & CorrectEyeData$Thirdend[i]>time[j], 1,0)
} 
}
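
The nested loop above is slow, and note that registerDoParallel(3) does not by itself parallelize plain for loops (that would require foreach with %dopar%). Below is a vectorized sketch of the same 0/1 logic, using the run columns created earlier (in_run is a hypothetical helper introduced here for illustration).

#sketch: vectorized version of the same logic; a bin gets 1 if it falls strictly
#inside the first, second, or third run, and 0 otherwise
in_run <- function(start, end) outer(start, time, "<") & outer(end, time, ">")
tmatrix <- (in_run(CorrectEyeData$Fststart, CorrectEyeData$Fstend) |
            in_run(CorrectEyeData$Secstart, CorrectEyeData$Secend) |
            in_run(CorrectEyeData$Thirdstart, CorrectEyeData$Thirdend)) * 1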

Combine CorrectEyeData with the time matrix

CleanData <- cbind(CorrectEyeData, data.frame(tmatrix))

Assign time values to time bin columns

colnames(CleanData)[23:263] <- seq(0, 6000, by=25)

Keep only the necessary columns and drop the rest to free up memory

CleanData <- CleanData[, -c(2:8,10,11,12,13,14, 17:22)]

Put the data in long format and then calculate the fixation proportion for each subject, interest area, and time bin

CleanData<- CleanData %>% gather(time,value,5:245)
#find proportion for each interest area for each subject
CleanDataProb<-CleanData %>% group_by(Subject,IA_LABEL,time) %>%
    summarise(Prob=mean(value))
CleanDataProb$time<-as.numeric(as.character(CleanDataProb$time))

Exploratory Graphs

The aim of this section is to explore patterns in the data without committing to any formal modeling.

Which competitor is the stronger one?

#collapse over subjects and trials for each interest area and time bin; compute the usual summary statistics (mean, SD, SE, 95% CI)
OnlySubjects<-CleanData %>%
    group_by(IA_LABEL, time) %>%
    summarise(n=n(),Prob=mean(value),sd=sd(value)) %>%
    mutate(se=sd/sqrt(n), LCI=Prob+qnorm(0.025)*se, UCI=Prob+qnorm(0.975)*se)
OnlySubjects$time<-as.numeric(as.character(OnlySubjects$time))

levels(OnlySubjects$IA_LABEL) <- list(Target="TARGET_IA ", `High Competitor`="CompetitorHigh ", `Low Competitor`="CompetitorLow ", `Distractor`="UNREL2_IA ")

colnames(OnlySubjects)[1]<-"Interest Area"
#plot for time 0 to time 6000
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
g1<-ggplot(OnlySubjects, aes(x=time, y=Prob, color=`Interest Area`, shape=`Interest Area`)) +
    geom_point(size=3) +
    geom_errorbar(aes(ymin=LCI, ymax=UCI)) +
    scale_x_continuous(breaks=seq(0,6000,500), limits=c(0,6000), name="time from trial onset (ms)")+
    geom_vline(xintercept = 1825)+
    geom_vline(xintercept = 3425)+theme_bw()+
    scale_fill_manual(values=cbPalette)+scale_colour_manual(values=cbPalette)+
    scale_y_continuous(name="Fixation Proportion")+
    theme(legend.text = element_text(size = 12))+
    theme(legend.title = element_text(size=12, face="bold"))+
    theme(axis.title.x = element_text(face="bold", size=15), axis.text.x  = element_text(size=10))+
    theme(axis.title.y = element_text(face="bold", size=15), axis.text.y  = element_text(size=10))+
    theme(strip.text.x = element_text(size=12, face="bold"))+
    theme(plot.title = element_text(lineheight=.8, face="bold", size=14))

g1

ggsave(
  "Plot1Exp1.png",
  g1,
  width = 9.25,
  height = 6.25,
  dpi = 300
)

    
OnlySubjects$time2<-OnlySubjects$time-1825

g2<-ggplot(OnlySubjects, aes(x=time2, y=Prob, color=`Interest Area`, shape=`Interest Area`)) +
    geom_point(size=3) +
    geom_errorbar(aes(ymin=LCI, ymax=UCI))+
    scale_x_continuous(breaks=seq(0,1500,50), limits=c(0,1500), name="time from spoken target onset(ms)")+
    theme_bw()+
    scale_fill_manual(values=cbPalette)+scale_colour_manual(values=cbPalette)+
    scale_y_continuous(name="Fixation Proportion")+
    theme(legend.text = element_text(size = 12))+
    theme(legend.title = element_text(size=12, face="bold"))+
    theme(axis.title.x = element_text(face="bold", size=15), axis.text.x  = element_text(size=12, angle=90, vjust=0.5))+
    theme(axis.title.y = element_text(face="bold", size=15), axis.text.y  = element_text(size=12))+
    theme(plot.title = element_text(lineheight=.8, face="bold", size=14))+
    theme(strip.text.x = element_text(size=12, face="bold"))

g2

#the plot produces some warnings because points outside the 0-1500 ms window (i.e., before target onset or late in the trial) are removed

ggsave(
  "Plot2Exp1.png",
  g2,
  width = 9.25,
  height = 6.25,
  dpi = 300
)

Growth Curve Analyses

The following analyses will examine the proportion of fixations to the High vs. the Low Competitor.

High vs. Low Competitor

First, the required subset is created:

model<-CleanDataProb %>% 
    filter(IA_LABEL=="CompetitorHigh " | IA_LABEL=="CompetitorLow ", time>=2025 & time <=3025)
model$IA_LABEL<-as.factor(as.character(model$IA_LABEL))

#Generate orthogonal time polynomials up to the 4th (quartic) order
t25 <- data.frame(poly(unique(model$time),4))
t25$time <- seq(2025, 3025, by=25)

#add polynomials to data frame
model <- merge(model , t25, by="time")
head(model)
##   time Subject        IA_LABEL       Prob         X1        X2         X3
## 1 2025  10spe1 CompetitorHigh  0.10000000 -0.2639818 0.3245611 -0.3568831
## 2 2025  10spe1  CompetitorLow  0.03333333 -0.2639818 0.3245611 -0.3568831
## 3 2025  11spe2  CompetitorLow  0.10344828 -0.2639818 0.3245611 -0.3568831
## 4 2025   1spe1 CompetitorHigh  0.03448276 -0.2639818 0.3245611 -0.3568831
## 5 2025   1sse1 CompetitorHigh  0.30000000 -0.2639818 0.3245611 -0.3568831
## 6 2025   1tle1  CompetitorLow  0.06896552 -0.2639818 0.3245611 -0.3568831
##          X4
## 1 0.3669381
## 2 0.3669381
## 3 0.3669381
## 4 0.3669381
## 5 0.3669381
## 6 0.3669381
str(model)
## 'data.frame':    2378 obs. of  8 variables:
##  $ time    : num  2025 2025 2025 2025 2025 ...
##  $ Subject : Factor w/ 29 levels "10spe1","11spe2",..: 1 1 2 3 4 5 6 7 7 8 ...
##  $ IA_LABEL: Factor w/ 2 levels "CompetitorHigh ",..: 1 2 2 1 1 2 1 1 2 2 ...
##  $ Prob    : num  0.1 0.0333 0.1034 0.0345 0.3 ...
##  $ X1      : num  -0.264 -0.264 -0.264 -0.264 -0.264 ...
##  $ X2      : num  0.325 0.325 0.325 0.325 0.325 ...
##  $ X3      : num  -0.357 -0.357 -0.357 -0.357 -0.357 ...
##  $ X4      : num  0.367 0.367 0.367 0.367 0.367 ...

The following models are run, with increasing complexity:

registerDoParallel(3)
#base model; Model1
Model1<-lmer(Prob ~ IA_LABEL + 
                 (1| Subject) + 
                 (1| Subject:IA_LABEL),
               data=model, REML=T)
#linear; model 2
Model2<-lmer(Prob ~ IA_LABEL*X1 + 
                 (1+X1| Subject) + 
                 (1+X1| Subject:IA_LABEL),
               data=model, REML=T)
#linear+quadratic; model 3
Model3<-lmer(Prob ~ IA_LABEL*(X1+X2) + 
                 (1+X1+X2| Subject) + 
                 (1+X1+X2| Subject:IA_LABEL),
               data=model, REML=T)
#linear+quadratic+cubic; Model 4
Model4<-lmer(Prob ~ IA_LABEL*(X1+X2+X3) + 
                 (1+X1+X2+X3| Subject) + 
                 (1+X1+X2+X3| Subject:IA_LABEL),
               data=model, REML=T)
#Model 5 failed to converge
Model5<-lmer(Prob ~ IA_LABEL*(X1+X2+X3+X4) + 
                 (1+X1+X2+X3+X4| Subject) + 
                 (1+X1+X2+X3+X4| Subject:IA_LABEL),
               data=model, REML=T)
#Model 6 converged
Model6<-lmer(Prob ~ IA_LABEL*(X1+X2+X3+X4) + 
                 (1+X1+X2+X3| Subject) + 
                 (1+X1+X2+X3+X4| Subject:IA_LABEL),
               data=model, REML=T)
#Model 7 converged
Model7<-lmer(Prob ~ IA_LABEL*(X1+X2+X3+X4) + 
                 (1+X1+X2+X3+X4| Subject) + 
                 (1+X1+X2+X3| Subject:IA_LABEL),
               data=model, REML=T)

library(beepr)
beep(8)

#compare the models based on ANOVA test
anova(Model1,Model2,Model3,Model4,Model5,Model6,Model7)
## Data: model
## Models:
## object: Prob ~ IA_LABEL + (1 | Subject) + (1 | Subject:IA_LABEL)
## ..1: Prob ~ IA_LABEL * X1 + (1 + X1 | Subject) + (1 + X1 | Subject:IA_LABEL)
## ..2: Prob ~ IA_LABEL * (X1 + X2) + (1 + X1 + X2 | Subject) + (1 + 
## ..2:     X1 + X2 | Subject:IA_LABEL)
## ..3: Prob ~ IA_LABEL * (X1 + X2 + X3) + (1 + X1 + X2 + X3 | Subject) + 
## ..3:     (1 + X1 + X2 + X3 | Subject:IA_LABEL)
## ..5: Prob ~ IA_LABEL * (X1 + X2 + X3 + X4) + (1 + X1 + X2 + X3 | Subject) + 
## ..5:     (1 + X1 + X2 + X3 + X4 | Subject:IA_LABEL)
## ..6: Prob ~ IA_LABEL * (X1 + X2 + X3 + X4) + (1 + X1 + X2 + X3 + X4 | 
## ..6:     Subject) + (1 + X1 + X2 + X3 | Subject:IA_LABEL)
## ..4: Prob ~ IA_LABEL * (X1 + X2 + X3 + X4) + (1 + X1 + X2 + X3 + X4 | 
## ..4:     Subject) + (1 + X1 + X2 + X3 + X4 | Subject:IA_LABEL)
##        Df     AIC     BIC logLik deviance   Chisq Chi Df Pr(>Chisq)    
## object  5 -4870.7 -4841.8 2440.4  -4880.7                              
## ..1    11 -5932.2 -5868.6 2977.1  -5954.2 1073.44      6     <2e-16 ***
## ..2    19 -7157.9 -7048.2 3598.0  -7195.9 1241.77      8     <2e-16 ***
## ..3    29 -7823.7 -7656.3 3940.9  -7881.7  685.80     10     <2e-16 ***
## ..5    36 -8215.0 -8007.2 4143.5  -8287.0  405.33      7     <2e-16 ***
## ..6    36 -8083.3 -7875.4 4077.6  -8155.3    0.00      0          1    
## ..4    41 -8209.8 -7973.1 4145.9  -8291.8  136.51      5     <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#compare the models based on R^2, following Xu (http://onlinelibrary.wiley.com/doi/10.1002/sim.1572/abstract)
1-var(residuals(Model1))/(var(model.response(model.frame(Model1))))
## [1] 0.09853359
1-var(residuals(Model2))/(var(model.response(model.frame(Model2))))
## [1] 0.4825567
1-var(residuals(Model3))/(var(model.response(model.frame(Model3))))
## [1] 0.7260332
1-var(residuals(Model4))/(var(model.response(model.frame(Model4))))
## [1] 0.8161727
1-var(residuals(Model5))/(var(model.response(model.frame(Model5))))
## [1] 0.8589269
1-var(residuals(Model6))/(var(model.response(model.frame(Model6))))
## [1] 0.858976
1-var(residuals(Model7))/(var(model.response(model.frame(Model7))))
## [1] 0.8443338
#find the summary for model 4
summary(Model4)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
##   to degrees of freedom [lmerMod]
## Formula: 
## Prob ~ IA_LABEL * (X1 + X2 + X3) + (1 + X1 + X2 + X3 | Subject) +  
##     (1 + X1 + X2 + X3 | Subject:IA_LABEL)
##    Data: model
## 
## REML criterion at convergence: -7834
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.4260 -0.6105 -0.0173  0.6013  3.1867 
## 
## Random effects:
##  Groups           Name        Variance  Std.Dev. Corr             
##  Subject:IA_LABEL (Intercept) 0.0004802 0.02191                   
##                   X1          0.0365941 0.19130   0.23            
##                   X2          0.0348028 0.18655  -0.51 -0.39      
##                   X3          0.0196494 0.14018   0.03 -0.37 -0.12
##  Subject          (Intercept) 0.0002257 0.01502                   
##                   X1          0.0348023 0.18655  -0.17            
##                   X2          0.0067362 0.08207   0.15 -1.00      
##                   X3          0.0000819 0.00905  -0.23 -0.92  0.93
##  Residual                     0.0016023 0.04003                   
## Number of obs: 2378, groups:  Subject:IA_LABEL, 58; Subject, 29
## 
## Fixed effects:
##                            Estimate Std. Error        df t value Pr(>|t|)
## (Intercept)                0.148103   0.005068 51.290000  29.221  < 2e-16
## IA_LABELCompetitorLow     -0.020558   0.005984 29.710000  -3.435  0.00177
## X1                        -0.256111   0.050172 45.610000  -5.105 6.30e-06
## X2                        -0.252285   0.038570 54.650000  -6.541 2.18e-08
## X3                         0.123652   0.027123 56.000000   4.559 2.84e-05
## IA_LABELCompetitorLow :X1  0.053181   0.051325 29.630000   1.036  0.30851
## IA_LABELCompetitorLow :X2  0.150451   0.050107 44.550000   3.003  0.00438
## IA_LABELCompetitorLow :X3 -0.081067   0.038284 55.810000  -2.118  0.03868
##                              
## (Intercept)               ***
## IA_LABELCompetitorLow     ** 
## X1                        ***
## X2                        ***
## X3                        ***
## IA_LABELCompetitorLow :X1    
## IA_LABELCompetitorLow :X2 ** 
## IA_LABELCompetitorLow :X3 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##               (Intr) IA_LAB X1     X2     X3     IA_LABELCL:X1
## IA_LABELCmL   -0.590                                          
## X1             0.068 -0.113                                   
## X2            -0.337  0.313 -0.519                            
## X3             0.014 -0.018 -0.291 -0.079                     
## IA_LABELCL:X1 -0.130  0.221 -0.511  0.241  0.246              
## IA_LABELCL:X2  0.284 -0.482  0.189 -0.650  0.078 -0.370       
## IA_LABELCL:X3 -0.015  0.026  0.179  0.072 -0.706 -0.349       
##               IA_LABELCL:X2
## IA_LABELCmL                
## X1                         
## X2                         
## X3                         
## IA_LABELCL:X1              
## IA_LABELCL:X2              
## IA_LABELCL:X3 -0.111
#find summary for model 7
summary(Model7)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
##   to degrees of freedom [lmerMod]
## Formula: Prob ~ IA_LABEL * (X1 + X2 + X3 + X4) + (1 + X1 + X2 + X3 + X4 |  
##     Subject) + (1 + X1 + X2 + X3 | Subject:IA_LABEL)
##    Data: model
## 
## REML criterion at convergence: -8093.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7107 -0.5966  0.0021  0.6058  3.3543 
## 
## Random effects:
##  Groups           Name        Variance  Std.Dev. Corr                   
##  Subject:IA_LABEL (Intercept) 0.0004744 0.02178                         
##                   X1          0.0367525 0.19171   0.22                  
##                   X2          0.0316077 0.17779  -0.49 -0.37            
##                   X3          0.0198548 0.14091   0.03 -0.37 -0.13      
##  Subject          (Intercept) 0.0002370 0.01540                         
##                   X1          0.0348712 0.18674  -0.15                  
##                   X2          0.0101589 0.10079  -0.01 -0.87            
##                   X3          0.0001041 0.01020  -0.31 -0.76  0.95      
##                   X4          0.0065839 0.08114   0.15  0.58 -0.91 -0.92
##  Residual                     0.0013747 0.03708                         
## Number of obs: 2378, groups:  Subject:IA_LABEL, 58; Subject, 29
## 
## Fixed effects:
##                             Estimate Std. Error         df t value
## (Intercept)                1.481e-01  5.068e-03  5.090e+01  29.221
## IA_LABELCompetitorLow     -2.056e-02  5.919e-03  3.110e+01  -3.473
## X1                        -2.561e-01  5.017e-02  4.560e+01  -5.105
## X2                        -2.523e-01  3.857e-02  5.310e+01  -6.541
## X3                         1.237e-01  2.712e-02  5.600e+01   4.559
## X4                         6.378e-02  1.657e-02  3.350e+01   3.850
## IA_LABELCompetitorLow :X1  5.318e-02  5.128e-02  3.010e+01   1.037
## IA_LABELCompetitorLow :X2  1.505e-01  4.769e-02  5.110e+01   3.155
## IA_LABELCompetitorLow :X3 -8.107e-02  3.826e-02  5.590e+01  -2.119
## IA_LABELCompetitorLow :X4 -5.130e-02  9.737e-03  2.116e+03  -5.269
##                           Pr(>|t|)    
## (Intercept)                < 2e-16 ***
## IA_LABELCompetitorLow     0.001534 ** 
## X1                        6.30e-06 ***
## X2                        2.46e-08 ***
## X3                        2.84e-05 ***
## X4                        0.000504 ***
## IA_LABELCompetitorLow :X1 0.307937    
## IA_LABELCompetitorLow :X2 0.002692 ** 
## IA_LABELCompetitorLow :X3 0.038575 *  
## IA_LABELCompetitorLow :X4 1.51e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##               (Intr) IA_LAB X1     X2     X3     X4     IA_LABELCL:X1
## IA_LABELCmL   -0.584                                                 
## X1             0.068 -0.107                                          
## X2            -0.337  0.287 -0.519                                   
## X3             0.014 -0.022 -0.291 -0.079                            
## X4             0.075  0.000  0.364 -0.400 -0.059                     
## IA_LABELCL:X1 -0.122  0.210 -0.511  0.222  0.249  0.000              
## IA_LABELCL:X2  0.271 -0.463  0.184 -0.618  0.090  0.000 -0.359       
## IA_LABELCL:X3 -0.018  0.032  0.180  0.079 -0.705  0.000 -0.353       
## IA_LABELCL:X4  0.000  0.000  0.000  0.000  0.000 -0.294  0.000       
##               IA_LABELCL:X2 IA_LABELCL:X3
## IA_LABELCmL                              
## X1                                       
## X2                                       
## X3                                       
## X4                                       
## IA_LABELCL:X1                            
## IA_LABELCL:X2                            
## IA_LABELCL:X3 -0.127                     
## IA_LABELCL:X4  0.000         0.000
#find summary for model 6
summary(Model6)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
##   to degrees of freedom [lmerMod]
## Formula: 
## Prob ~ IA_LABEL * (X1 + X2 + X3 + X4) + (1 + X1 + X2 + X3 | Subject) +  
##     (1 + X1 + X2 + X3 + X4 | Subject:IA_LABEL)
##    Data: model
## 
## REML criterion at convergence: -8226.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8755 -0.5778 -0.0134  0.5740  3.4955 
## 
## Random effects:
##  Groups           Name        Variance  Std.Dev. Corr                   
##  Subject:IA_LABEL (Intercept) 4.869e-04 0.022065                        
##                   X1          4.245e-02 0.206030  0.25                  
##                   X2          3.817e-02 0.195379 -0.52 -0.46            
##                   X3          2.005e-02 0.141590  0.02 -0.36 -0.10      
##                   X4          1.099e-02 0.104841  0.07  0.33 -0.56 -0.03
##  Subject          (Intercept) 2.269e-04 0.015064                        
##                   X1          2.637e-02 0.162400 -0.27                  
##                   X2          2.432e-03 0.049316  0.48 -0.97            
##                   X3          2.071e-05 0.004551  0.15 -0.99  0.94      
##  Residual                     1.260e-03 0.035491                        
## Number of obs: 2378, groups:  Subject:IA_LABEL, 58; Subject, 29
## 
## Fixed effects:
##                            Estimate Std. Error        df t value Pr(>|t|)
## (Intercept)                0.148103   0.005067 51.300000  29.230  < 2e-16
## IA_LABELCompetitorLow     -0.020558   0.005975 31.040000  -3.441  0.00168
## X1                        -0.256111   0.049159 52.000000  -5.210 3.29e-06
## X2                        -0.252285   0.037995 57.800000  -6.640 1.19e-08
## X3                         0.123652   0.027119 56.030000   4.560 2.83e-05
## X4                         0.063784   0.020554 56.000000   3.103  0.00300
## IA_LABELCompetitorLow :X1  0.053181   0.054903 25.260000   0.969  0.34191
## IA_LABELCompetitorLow :X2  0.150451   0.052149 48.110000   2.885  0.00584
## IA_LABELCompetitorLow :X3 -0.081067   0.038334 55.940000  -2.115  0.03892
## IA_LABELCompetitorLow :X4 -0.051300   0.029067 56.000000  -1.765  0.08304
##                              
## (Intercept)               ***
## IA_LABELCompetitorLow     ** 
## X1                        ***
## X2                        ***
## X3                        ***
## X4                        ** 
## IA_LABELCompetitorLow :X1    
## IA_LABELCompetitorLow :X2 ** 
## IA_LABELCompetitorLow :X3 *  
## IA_LABELCompetitorLow :X4 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##               (Intr) IA_LAB X1     X2     X3     X4     IA_LABELCL:X1
## IA_LABELCmL   -0.590                                                 
## X1             0.064 -0.133                                          
## X2            -0.337  0.340 -0.487                                   
## X3             0.014 -0.010 -0.294 -0.083                            
## X4             0.053 -0.045  0.241 -0.502 -0.029                     
## IA_LABELCL:X1 -0.140  0.237 -0.558  0.307  0.246 -0.215              
## IA_LABELCL:X2  0.293 -0.496  0.250 -0.686  0.066  0.366 -0.447       
## IA_LABELCL:X3 -0.008  0.014  0.194  0.064 -0.707  0.021 -0.348       
## IA_LABELCL:X4 -0.037  0.063 -0.170  0.355  0.021 -0.707  0.305       
##               IA_LABELCL:X2 IA_LABELCL:X3
## IA_LABELCmL                              
## X1                                       
## X2                                       
## X3                                       
## X4                                       
## IA_LABELCL:X1                            
## IA_LABELCL:X2                            
## IA_LABELCL:X3 -0.093                     
## IA_LABELCL:X4 -0.518        -0.029
#find the RMSE for Model 4
sqrt(mean((model$Prob-fitted(Model4))^2))
## [1] 0.03816047
#attach the fitted values to the dataset model, so that they can be used later for graphing purposes
model$Model1Fitted<-fitted(Model1)
model$Model2Fitted<-fitted(Model2)
model$Model3Fitted<-fitted(Model3)
model$Model4Fitted<-fitted(Model4)
model$Model5Fitted<-fitted(Model5)
model$Model6Fitted<-fitted(Model6)
model$Model7Fitted<-fitted(Model7)

#check that the values are in the dataset
str(model)
## 'data.frame':    2378 obs. of  15 variables:
##  $ time        : num  2025 2025 2025 2025 2025 ...
##  $ Subject     : Factor w/ 29 levels "10spe1","11spe2",..: 1 1 2 3 4 5 6 7 7 8 ...
##  $ IA_LABEL    : Factor w/ 2 levels "CompetitorHigh ",..: 1 2 2 1 1 2 1 1 2 2 ...
##  $ Prob        : num  0.1 0.0333 0.1034 0.0345 0.3 ...
##  $ X1          : num  -0.264 -0.264 -0.264 -0.264 -0.264 ...
##  $ X2          : num  0.325 0.325 0.325 0.325 0.325 ...
##  $ X3          : num  -0.357 -0.357 -0.357 -0.357 -0.357 ...
##  $ X4          : num  0.367 0.367 0.367 0.367 0.367 ...
##  $ Model1Fitted: num  0.158 0.164 0.139 0.123 0.131 ...
##  $ Model2Fitted: num  0.186 0.179 0.208 0.187 0.231 ...
##  $ Model3Fitted: num  0.0575 0.0211 0.1535 0.0718 0.2844 ...
##  $ Model4Fitted: num  0.0316 -0.0333 0.131 -0.0419 0.2882 ...
##  $ Model5Fitted: num  0.10239 0.02159 0.12446 0.00349 0.29014 ...
##  $ Model6Fitted: num  0.10156 0.02126 0.12398 0.00349 0.291 ...
##  $ Model7Fitted: num  0.1034 0.0194 0.1276 -0.0133 0.3098 ...

Model 4 performs best. Mixed-effects models do not lend themselves easily to cross-validation, so the addition of higher-order polynomials might be leading to overfitting.
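
One informal way to check this (a sketch, not part of the original analysis) is to hold out a few subjects, refit the cubic model on the remaining subjects, and compute the out-of-sample RMSE; predictions for the held-out subjects then rely on the fixed effects only.

#sketch: hold out 5 subjects, refit Model 4 on the rest, compute out-of-sample RMSE
set.seed(1)
heldout <- sample(levels(model$Subject), 5)
train <- model %>% filter(!Subject %in% heldout)
test <- model %>% filter(Subject %in% heldout)
Model4CV <- lmer(Prob ~ IA_LABEL*(X1+X2+X3) +
                     (1+X1+X2+X3| Subject) +
                     (1+X1+X2+X3| Subject:IA_LABEL),
                 data=train, REML=T)
#allow.new.levels=TRUE so that unseen subjects are predicted from the fixed effects alone
sqrt(mean((test$Prob - predict(Model4CV, newdata=test, allow.new.levels=TRUE))^2))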

Graphs

Each model used above will be graphed by overlaying it on the real data. First, the plots for the first comparison will be created. For this, I use the model dataset, which I put into long format.

ModelLong<- model %>% gather(Model,Predictions,9:15)
#rename the models so that the labels are clearer
levels(ModelLong$Model) <- list(`Model 1 Intercept`="Model1Fitted", `Model 2 Linear`="Model2Fitted", `Model 3 Quadratic`="Model3Fitted", `Model 4 Cubic`="Model4Fitted", `Model 5 Quartic (not converged)`="Model5Fitted", `Model 6 Quartic (not converged)`="Model6Fitted", `Model 7 Quartic`="Model7Fitted")

#rename some of the columns
colnames(ModelLong)[3]<-"Interest Area"
ModelLong[,3]<-factor(ModelLong[,3])
#compress the data (average over subjects)
ForGraph1<-ModelLong %>% group_by(`Interest Area`,time, Model) %>%
    summarise(`Fixation Probability`=mean(Prob),Predicted=mean(Predictions))

levels(ForGraph1$`Interest Area`) <- list(`High Competitor`="CompetitorHigh ", `Low Competitor`="CompetitorLow ")

The code chunk below will plot and save the graph

#color-blind-friendly palette
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
#create the model-comparison plot for Experiment 1, which goes in the publication

g3<-ggplot(ForGraph1, aes(x=time-2025, y=`Fixation Probability`, shape=`Interest Area`, color=`Interest Area`)) +
    geom_point(size=3) + geom_line(aes(y=Predicted), size=1)+facet_wrap(~Model, nrow=3, ncol=3)+
    scale_fill_manual(values=cbPalette)+scale_colour_manual(values=cbPalette)+
    ggtitle("Fixation Proportion for Models\n examining High vs. Low Competitors")+
    theme_bw()+
    scale_x_continuous(breaks=seq(0,1000,100), limits=c(0,1000), name="time 200ms after spoken word onset(ms)")+
    theme(legend.text = element_text(size = 12))+
    theme(legend.title = element_text(size=12, face="bold"))+
    theme(axis.title.x = element_text(face="bold", size=15), axis.text.x= element_text(size=12, angle=90, vjust=0.5))+
    theme(axis.title.y = element_text(face="bold", size=15), axis.text.y  = element_text(size=12))+
    theme(strip.text.x = element_text(size=12, face="bold"))+
    theme(plot.title = element_text(lineheight=.8, face="bold", size=14))

g3

ggsave(
  "Plot3Exp1.png",
  g3,
  width = 9.25,
  height = 6.25,
  dpi = 300
)

Session Info

library(beepr)
beep(8)
sessionInfo()
## R version 3.2.3 (2015-12-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] beepr_1.2         caret_6.0-64      lattice_0.20-33  
##  [4] gridExtra_2.2.1   tidyr_0.4.1       doParallel_1.0.10
##  [7] iterators_1.0.8   foreach_1.4.3     dplyr_0.4.3      
## [10] lmerTest_2.0-30   lme4_1.1-11       Matrix_1.2-4     
## [13] ggplot2_2.1.0    
## 
## loaded via a namespace (and not attached):
##  [1] reshape2_1.4.1      splines_3.2.3       colorspace_1.2-6   
##  [4] htmltools_0.3       stats4_3.2.3        yaml_2.1.13        
##  [7] mgcv_1.8-12         survival_2.38-3     nloptr_1.0.4       
## [10] foreign_0.8-66      DBI_0.3.1           RColorBrewer_1.1-2 
## [13] audio_0.1-5         plyr_1.8.3          stringr_1.0.0      
## [16] MatrixModels_0.4-1  munsell_0.4.3       gtable_0.2.0       
## [19] codetools_0.2-14    evaluate_0.8.3      labeling_0.3       
## [22] latticeExtra_0.6-28 knitr_1.12.3        SparseM_1.7        
## [25] quantreg_5.21       pbkrtest_0.4-6      Rcpp_0.12.4        
## [28] acepack_1.3-3.3     scales_0.4.0        formatR_1.3        
## [31] Hmisc_3.17-2        digest_0.6.9        stringi_1.0-1      
## [34] grid_3.2.3          tools_3.2.3         magrittr_1.5       
## [37] lazyeval_0.1.10     Formula_1.2-1       cluster_2.0.3      
## [40] car_2.1-1           MASS_7.3-45         assertthat_0.1     
## [43] minqa_1.2.4         rmarkdown_0.9.5     R6_2.1.2           
## [46] rpart_4.1-10        nnet_7.3-12         nlme_3.1-126