What does memory retrieval leave on the table? Modelling the Cost of Semi-Compositionality with MINERVA2 and sBERT

Abstract

Despite being ubiquitous in natural language, collocations (e.g., kick+habit) incur a unique processing cost relative to both compositional phrases (kick+door) and idioms (kick+bucket). We confirm this cost with behavioural data and with MINERVA2, a memory model, which suggests that collocations constitute a distinct linguistic category. Although the model fails to fully capture the observed human processing patterns, we find that below a specific item frequency threshold, the model’s retrieval failures align with human reaction times across conditions. This suggests an alternative processing mechanism that activates when memory retrieval fails.
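To make the link between item frequency and retrieval concrete, the following is a minimal sketch of MINERVA2-style retrieval as Hintzman originally described it, not the implementation used in the paper. The feature vectors, the number of features, the learning rate of 0.6, and the frequencies (40 versus 1) are all assumptions chosen for illustration, and the sBERT-derived representations named in the title are replaced here by random ±1 vectors.

```python
import random


def similarity(probe, trace):
    # MINERVA2 similarity: dot product normalised by the number of
    # features that are nonzero in either the probe or the trace.
    relevant = sum(1 for p, t in zip(probe, trace) if p != 0 or t != 0)
    if relevant == 0:
        return 0.0
    return sum(p * t for p, t in zip(probe, trace)) / relevant


def echo_intensity(probe, memory):
    # Activation is the cube of similarity (sign preserved, strong
    # matches emphasised); echo intensity sums it over all traces.
    return sum(similarity(probe, trace) ** 3 for trace in memory)


def encode(item, learning_rate, rng):
    # Each feature is stored with probability learning_rate,
    # otherwise it is lost and recorded as 0.
    return [f if rng.random() < learning_rate else 0 for f in item]


def build_memory(items_with_freq, learning_rate=0.6, seed=0):
    # One noisy trace is stored per exposure, so frequency is simply
    # the number of traces an item leaves in memory.
    rng = random.Random(seed)
    memory = []
    for item, freq in items_with_freq:
        for _ in range(freq):
            memory.append(encode(item, learning_rate, rng))
    return memory


# Hypothetical items: random +1/-1 feature vectors (illustrative only).
item_rng = random.Random(1)
n_features = 50
frequent = [item_rng.choice([-1, 1]) for _ in range(n_features)]
rare = [item_rng.choice([-1, 1]) for _ in range(n_features)]

memory = build_memory([(frequent, 40), (rare, 1)])
print("frequent item echo intensity:", echo_intensity(frequent, memory))
print("rare item echo intensity:", echo_intensity(rare, memory))
```

In this sketch a rarely encountered item leaves few traces and so returns a weak echo, which is one way to picture retrieval failing below a frequency threshold. Where such a threshold would fall is an empirical question that the sketch does not address.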

Publication
Proceedings of the 29th Conference on Computational Natural Language Learning