de Rechteren van Hemert, A.¹, Nota, A.¹, Schouwenaars, A.² & van Hout, A.¹
¹ University of Groningen
² University of Oldenburg
Machine learning methodology is a valuable complement to frequentist hypothesis testing. Instead of top-down hypothesis testing, it allows for bottom-up, data-driven exploration. We show the outcomes of using machine learning methods in syntactic processing research, with particular emphasis on Support Vector Machines (SVMs). An SVM is a classification algorithm that separates data into two classes. We used this algorithm to classify eye movement data of native Dutch speakers (L1; N=17) and native German speakers who are highly proficient in Dutch (L2; N=18). The goal was to examine whether L1 versus L2 class membership in a syntactic processing task can be predicted purely on the basis of eye gaze data. With leave-one-trial-out cross-validation, individuals turned out to be highly distinguishable (independent of their L1/L2 class membership). In contrast, with leave-one-subject-out cross-validation, classification accuracy for L1 versus L2 was low. These results suggest that, in these particular groups, the eye gaze data contain no information that distinguishes L1 from L2 processing. Instead, individuals have a unique signature in their eye gaze during the task, which does not depend on L1/L2 syntactic processing.
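The contrast between the two cross-validation schemes can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the feature count, group sizes, noise level, and linear kernel are assumptions chosen for illustration, and scikit-learn is assumed as the SVM implementation. Each simulated subject gets an individual "signature" that is unrelated to group membership, which mirrors the interpretation offered in the abstract.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not the study's actual design).
n_per_group, n_trials, n_features = 6, 10, 5
n_subjects = 2 * n_per_group

# Each subject has an individual mean vector that is independent of group.
subject_means = rng.normal(0.0, 1.0, size=(n_subjects, n_features))
subject_ids = np.repeat(np.arange(n_subjects), n_trials)
group_labels = (subject_ids >= n_per_group).astype(int)  # 0 = "L1", 1 = "L2"

# Trials are noisy samples around the subject's own signature.
noise = rng.normal(0.0, 0.2, size=(n_subjects * n_trials, n_features))
X = subject_means[subject_ids] + noise


def make_svm():
    """Standardize features, then fit a linear-kernel SVM (an assumed choice)."""
    return make_pipeline(StandardScaler(), SVC(kernel="linear"))


# Leave-one-trial-out: predict WHO produced a held-out trial.
subject_acc = cross_val_score(
    make_svm(), X, subject_ids, cv=LeaveOneOut()
).mean()

# Leave-one-subject-out: predict the GROUP of a subject never seen in training.
group_acc = cross_val_score(
    make_svm(), X, group_labels, groups=subject_ids, cv=LeaveOneGroupOut()
).mean()

print(f"subject identification (leave-one-trial-out): {subject_acc:.2f}")
print(f"group classification (leave-one-subject-out): {group_acc:.2f}")
```

The key design point is what is held out. Leaving out single trials lets the classifier see other trials from the same person, so it can exploit individual signatures; leaving out whole subjects forces it to rely on information that generalizes across people, which is the only fair test of an L1 versus L2 difference.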