
Abstractive Summarization through the Prism of Decoding Strategies

Giacomo Frisoni (co-first author); Luca Ragazzi (co-first author); David Cohen; Gianluca Moro (co-first author); Antonella Carbonaro (co-first author); Claudio Sartori (co-first author)
2026

Abstract

In natural language generation, abstractive summarization (AS) is advancing rapidly due to transformer-based language models (LMs). Although decoding strategies significantly influence generated summaries, their importance is often overlooked. Given the abundance of token selection heuristics and associated hyperparameters, the community needs guidance to make well-informed decisions based on the specific task and target metrics. To address this gap, we conduct a comparative assessment of the effectiveness and efficiency of decoding-time techniques for short, long, and multi-document AS. We explore over 3,500 combinations involving three widely used million-scale autoregressive encoder-decoder LMs, two billion-scale decoder-only LMs, six datasets, and nine decoding settings. Our findings highlight that optimized decoding choices can lead to substantial performance improvements. Alongside human evaluation, we quantitatively measure effects using ten automatic metrics, covering dimensions such as semantic similarity, factuality, compression, redundancy, and carbon footprint. To set the stage for differentiable selection and optimization of decoding options, we introduce PRISM, a first-of-its-kind dataset that pairs AS gold input-output examples with our LM predictions across a diverse range of decoding options.
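The "decoding settings" compared in the paper are token-selection strategies (e.g., greedy search, beam search, top-k and nucleus sampling) and their hyperparameters. As a minimal illustrative sketch (not the paper's actual experimental setup), assuming the Hugging Face transformers API and a BART-style encoder-decoder summarizer, varying the decoding strategy while holding the model fixed looks like this; the model name and hyperparameter values below are assumptions for illustration:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed encoder-decoder summarizer; the paper evaluates several such LMs.
model_name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "..."  # source document to summarize (placeholder)
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Each entry is one decoding strategy with illustrative hyperparameters.
decoding_settings = {
    "greedy": dict(do_sample=False, num_beams=1),
    "beam_search": dict(do_sample=False, num_beams=4),
    "top_k_sampling": dict(do_sample=True, top_k=50),
    "nucleus_sampling": dict(do_sample=True, top_p=0.9),
    "contrastive_search": dict(penalty_alpha=0.6, top_k=4),
}

for name, kwargs in decoding_settings.items():
    # Same model and input; only the token-selection heuristic changes.
    output_ids = model.generate(**inputs, max_new_tokens=128, **kwargs)
    summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f"[{name}] {summary}")

Deterministic strategies (greedy, beam search) yield reproducible outputs, while sampling-based ones trade determinism for diversity, which is one reason summary quality can vary substantially across settings, as the paper's results emphasize.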
Frisoni, G., Ragazzi, L., Cohen, D., Moro, G., Carbonaro, A., Sartori, C. (2026). Abstractive Summarization through the Prism of Decoding Strategies. NEURAL NETWORKS, 195, 1-32 [10.1016/j.neunet.2025.108249].
Files in this item:

File: 1-s2.0-S089360802501130X-main.pdf
Access: open access
Type: Publisher's PDF / Version of Record
License: Open Access license, Creative Commons Attribution (CC BY)
Size: 9.31 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11585/1027360