Abstract
For most of the history of psychology, data analysis has focused on techniques like ANOVA and regression, which characterize the observed structure in a dataset with respect to a given set of questions or factors. Such models make simplifying assumptions about the data as a whole, while allowing targeted questions about the observed differences. However, in the last decade the field has moved to models that start from a different set of assumptions. These models characterize the underlying probability structure that gave rise to all of the data, and then use this to make experimental inferences. Nowhere is this contrast more apparent than in the analysis of rich timeseries data such as those generated by techniques like eye-tracking, pupillometry, EEG, and so forth. This talk explores this contrast in the context of a popular technique in psycholinguistics and language acquisition: the visual world paradigm, which uses eye movements in a semi-naturalistic task to make inferences about the millisecond-by-millisecond dynamics of language processing. Here, the auto-correlated timeseries and the incredibly rich dataset seem to require highly complex approaches to analysis, such as generalized additive models and growth curves, that can fully capture this variance. But is this necessary? This talk challenges this approach on two grounds. First, while the movement toward greater rigor and reproducibility seems to favor more statistical precision, others have argued that equally important is a well-specified linking function or derivation chain that can yield precise confirmatory predictions from a computational model. I argue that existing statistical approaches are often overspecified and suffer from too much complexity to permit this kind of clarity. Instead, I argue for a return to an older approach: inference or index approaches to analysis, and I present several examples of advances in this approach that push beyond simple ANOVAs.
Second, I present a series of Monte Carlo simulations that directly investigate the derivation chain to ask: what is the nature of the timeseries that forms the basis of analysis? While most approaches (including my own) simply average the series of fixat