An open question in language comprehension studies is whether non-compositional multiword expressions like idioms and compositional-but-frequent word sequences are processed differently. Are the latter constructed online, or are they instead retrieved directly from the lexicon, with a degree of entrenchment that depends on their frequency? In this paper, we address this question with two different methodologies. First, we set up a self-paced reading experiment comparing human reading times for idioms and for both high-frequency and low-frequency compositional word sequences. Then, we ran the same experiment using the Surprisal metrics computed with Neural Language Models (NLMs). Our results provide evidence that idiomatic and high-frequency compositional expressions are processed similarly by both humans and NLMs. Additional experiments were run to test the possible factors that could affect the NLMs’ performance.

Rambelli, G., Chersoni, E., Senaldi, M.S.G., Blache, P., Lenci, A. (2023). Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-paced Reading and Language Models. Stroudsburg: Association for Computational Linguistics.

Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-paced Reading and Language Models

Giulia Rambelli
2023

Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), 2023, pp. 87-98
Giulia Rambelli, Emmanuele Chersoni, Marco S.G. Senaldi, Philippe Blache, Alessandro Lenci
Files in this item:
File: 2023.mwe-1.13.pdf (open access)
Type: Publisher's version (PDF)
License: Creative Commons
Size: 2.55 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/925638