Attention in Natural Language Processing / Andrea Galassi; Marco Lippi; Paolo Torroni. - In: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. - ISSN 2162-2388. - PRINT. - 32:10(2021), pp. 4291-4308. [10.1109/TNNLS.2020.3019893]

Attention in Natural Language Processing

Andrea Galassi; Marco Lippi; Paolo Torroni
2021

Abstract

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
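To make the taxonomy's core dimensions concrete, the sketch below instantiates one point in that design space: a scaled dot-product compatibility function combined with a softmax distribution function, applied to vector representations of the input. This is a minimal illustrative Python/NumPy sketch, not code from the paper; the function names and the toy data are hypothetical.

    import numpy as np

    def softmax(scores):
        # Distribution function: map compatibility scores to attention weights.
        e = np.exp(scores - scores.max())
        return e / e.sum()

    def attention(keys, values, query):
        # Compatibility function: scaled dot product between the query and each key.
        scores = keys @ query / np.sqrt(query.shape[0])
        # Distribution function: softmax over the compatibility scores.
        weights = softmax(scores)
        # Context vector: weighted sum of the value representations.
        return weights @ values, weights

    # Hypothetical toy input: 4 elements with 8-dimensional vector representations.
    rng = np.random.default_rng(0)
    K = rng.normal(size=(4, 8))  # keys (input representation)
    V = rng.normal(size=(4, 8))  # values
    q = rng.normal(size=(8,))    # query
    context, w = attention(K, V, q)
    print(w.round(3), context.shape)  # weights sum to 1; context is 8-dimensional

Other choices along the same dimensions, such as additive compatibility, sparsemax as the distribution function, or multiple queries and outputs, slot into the same skeleton.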
Files in this item:

AttentionPrePub.pdf (open access)
  Description: Pre-publication
  Type: Postprint
  License: Open Access License. Creative Commons Attribution (CC BY)
  Size: 2.73 MB
  Format: Adobe PDF

attention-in-natural-language-processing.pdf (open access)
  Description: Published paper
  Type: Publisher's version (PDF)
  License: Open Access License. Creative Commons Attribution (CC BY)
  Size: 2.57 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/663866
Citations
  • PubMed Central: 25
  • Scopus: 243
  • Web of Science: 223