Chipmunk: A systolically scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW accelerator for near-sensor recurrent neural network inference

Conti, Francesco; Cavigelli, Lukas; Paulin, Gianna; Susmelj, Igor; Benini, Luca

2018

Abstract

Recurrent neural networks (RNNs) are state-of-the-art in voice awareness/understanding and speech recognition. On-device computation of RNNs on low-power mobile and wearable devices would be key to applications such as zero-latency voice-based human-machine interfaces. Here we present CHIPMUNK, a small (<1 mm²) hardware accelerator for Long Short-Term Memory RNNs in UMC 65 nm technology, capable of operating at a measured peak efficiency of up to 3.08 Gop/s/mW at 1.24 mW peak power. To implement large RNN models without incurring a huge memory transfer overhead, multiple CHIPMUNK engines can cooperate to form a single systolic array. In this way, the CHIPMUNK architecture in a 75-tile configuration can achieve real-time phoneme extraction on a demanding RNN topology proposed in [1], consuming less than 13 mW of average power.

Year	2018
Volume title	2018 IEEE Custom Integrated Circuits Conference, CICC 2018
First page	1
Last page	4
DOI	https://dx.doi.org/10.1109/CICC.2018.8357068
Citation	Chipmunk: A systolically scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW accelerator for near-sensor recurrent neural network inference / Conti, Francesco; Cavigelli, Lukas; Paulin, Gianna; Susmelj, Igor; Benini, Luca. - ELECTRONIC. - (2018), pp. 1-4. (Paper presented at the 2018 IEEE Custom Integrated Circuits Conference, CICC 2018, held in San Diego, CA, USA, 8-11 April 2018) [10.1109/CICC.2018.8357068].
All authors	Conti, Francesco; Cavigelli, Lukas; Paulin, Gianna; Susmelj, Igor; Benini, Luca
Appears in types	4.01 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
Binder4.pdf	Open access	Postprint	Free open-access license	1.19 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/652926
