Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-paced Reading and Language Models

Giulia Rambelli; Emmanuele Chersoni; Marco S. G. Senaldi; Philippe Blache; Alessandro Lenci

2023

Abstract

An open question in language comprehension studies is whether non-compositional multiword expressions like idioms and compositional-but-frequent word sequences are processed differently. Are the latter constructed online, or are they instead retrieved directly from the lexicon, with a degree of entrenchment that depends on their frequency? In this paper, we address this question with two different methodologies. First, we set up a self-paced reading experiment comparing human reading times for idioms and for both high-frequency and low-frequency compositional word sequences. Then, we ran the same experiment using Surprisal metrics computed with Neural Language Models (NLMs). Our results provide evidence that idiomatic and high-frequency compositional expressions are processed similarly by both humans and NLMs. Additional experiments were run to test the possible factors that could affect the NLMs’ performance.
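As an illustration of the Surprisal metric mentioned in the abstract, the sketch below uses the standard definition, the negative log probability of a token given its preceding context, and sums it over a phrase. It is not the authors' pipeline: the function names and the probability values are hypothetical, and in practice the probabilities would come from a neural language model rather than being typed in by hand.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of a token with conditional probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def phrase_surprisal(token_probs):
    """Sum of per-token surprisals over one phrase."""
    return sum(surprisal(p) for p in token_probs)


# Hypothetical values of P(token | preceding context), for illustration only.
frequent_phrase = [0.5, 0.25, 0.5]
rare_phrase = [0.5, 0.03125, 0.125]

print(phrase_surprisal(frequent_phrase))  # 4.0
print(phrase_surprisal(rare_phrase))      # 9.0
```

Under this definition, a phrase whose tokens are more predictable in context receives a lower total surprisal, which is the quantity that would be compared against human reading times.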

Year: 2023
Volume title: Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)
First page: 87
Last page: 98
Citation: Giulia Rambelli, E.C. (2023). Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-paced Reading and Language Models. Stroudsburg : Association for Computational Linguistics.
All authors: Giulia Rambelli, Emmanuele Chersoni, Marco S. G. Senaldi, Philippe Blache, Alessandro Lenci
Publication type: 4.01 Conference proceedings contribution

Files in this record:

File	Size	Format
2023.mwe-1.13.pdf (open access; type: publisher's PDF version / Version of Record; license: Creative Commons)	2.55 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/925638
