Long document summarization poses obstacles to current generative transformer-based models because of the broad context to process and understand. Indeed, detecting long-range dependencies is still challenging for today’s state-of-the-art solutions, usually requiring model expansion at the cost of an unsustainable demand for computing and memory capacities. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context over the whole document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 and 13 GB of memory for training and inference, respectively. We conducted extensive performance analyses and demonstrate that Emma achieved competitive results on two datasets of different domains while consuming significantly less GPU memory than competitors do, even in low-resource settings.
Moro G., Ragazzi L., Valgimigli L., Frisoni G., Sartori C., Marfia G. (2023). Efficient Memory-Enhanced Transformer for Long-Document Summarization in Low-Resource Regimes. SENSORS, 23(7), 1-16 [10.3390/s23073542].
Efficient Memory-Enhanced Transformer for Long-Document Summarization in Low-Resource Regimes
Moro G.
;Ragazzi L.;Valgimigli L.;Frisoni G.;Sartori C.;Marfia G.
2023
Abstract
Long document summarization poses obstacles to current generative transformer-based models because of the broad context to process and understand. Indeed, detecting long-range dependencies is still challenging for today’s state-of-the-art solutions, usually requiring model expansion at the cost of an unsustainable demand for computing and memory capacities. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context over the whole document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 and 13 GB of memory for training and inference, respectively. We conducted extensive performance analyses and demonstrate that Emma achieved competitive results on two datasets of different domains while consuming significantly less GPU memory than competitors do, even in low-resource settings.File | Dimensione | Formato | |
---|---|---|---|
sensors-23-03542-v2.pdf
accesso aperto
Tipo:
Versione (PDF) editoriale
Licenza:
Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione
594.89 kB
Formato
Adobe PDF
|
594.89 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.