This study investigates how Whisper handles interactional phenomena in spontaneous Italian conversation, focusing on backchannels, repairs, and filled pauses. We compare standard Word Error Rate (WER) optimization with a decoding strategy that explicitly rewards the preservation of interactional events. Results show that decoding choices have limited impact on overall accuracy, whereas recognition remains strongly phenomenon-dependent. This suggests structural limitations in how the model handles interactional phenomena: repairs are systematically linearized and short conversational items are frequently suppressed.
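To make the contrast between overall accuracy and phenomenon-level recognition concrete, the following is a minimal sketch, not the authors' evaluation code or decoding strategy. It computes a standard word-level WER and, separately, a simple preservation rate for a set of marker tokens. The function names, the marker set, and the example utterance are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def wer(reference: list[str], hypothesis: list[str]) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # Row for the empty reference prefix: j insertions to reach hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # i deletions to reach the empty hypothesis prefix
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def preservation_rate(reference: list[str], hypothesis: list[str],
                      markers: set[str]) -> float:
    """Share of marker tokens in the reference that also occur in the hypothesis.

    Bag-of-words comparison (position is ignored); a deliberate simplification.
    """
    ref_counts = Counter(t for t in reference if t in markers)
    hyp_counts = Counter(t for t in hypothesis if t in markers)
    total = sum(ref_counts.values())
    if total == 0:
        return 1.0  # nothing to preserve
    kept = sum(min(n, hyp_counts[t]) for t, n in ref_counts.items())
    return kept / total


# Hypothetical example: a filled pause, a repetition, and a backchannel-like
# token are dropped, and the repair is flattened into a fluent string.
reference = "eh sì sì io io volevo dire mh".split()
hypothesis = "sì io volevo dire".split()
markers = {"eh", "mh"}  # assumed marker set, not taken from the paper

print(wer(reference, hypothesis))
print(preservation_rate(reference, hypothesis, markers))
```

Reporting a phenomenon-level score next to WER shows which categories of tokens account for the errors, which a single aggregate figure cannot do.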

Simonotti, M., Pannitto, L., Mauri, C., Ferraresi, A., Carioli, G. (2026). Say again? The limits of Whisper with conversation: A case study on the KIParla corpus. Language Resources Association (ELRA).

Say again? The limits of Whisper with conversation: A case study on the KIParla corpus

Martina Simonotti; Ludovica Pannitto; Caterina Mauri; Adriano Ferraresi; Gabriele Carioli
2026

Speech Language Models in Low-Resource Settings: Performance, Evaluation, and Bias Analysis (SPEAKABLE) @ LREC 2026
Pages: 16-30
Files in this item:
File: 2026.speakable-1.0.pdf
Access: open access
Type: Publisher's version (PDF) / Version of Record
License: Open Access license. Creative Commons Attribution - NonCommercial (CC BY-NC)
Size: 1.2 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1069387