Lumini, A., Nanni, L., Maguolo, G. (2021). Deep Ensembles Based on Stochastic Activations for Semantic Segmentation. SIGNALS, 2(4), 820-833 [10.3390/signals2040047].

Deep Ensembles Based on Stochastic Activations for Semantic Segmentation

Lumini, Alessandra; Nanni, Loris; Maguolo, Gianluca

2021

Abstract

Semantic segmentation is a widely studied topic in modern computer vision, with applications in many fields. Researchers have proposed a variety of architectures for semantic image segmentation. The most common ones use an encoder–decoder structure that aims to capture both the semantics of the image and its low-level features. The encoder extracts features with convolutional layers, generally with a stride larger than one, while the decoder recreates the image by upsampling and by using skip connections from the first layers. The objective of this study is to propose a method for creating an ensemble of CNNs in which diversity among the networks is enhanced by using different activation functions. We use DeepLabV3+ as the architecture to test the effectiveness of building an ensemble by randomly changing the activation functions inside the network multiple times, and we use different backbone networks in DeepLabV3+ to validate our findings. We evaluate the proposed approach on two image segmentation problems: the first, from the medical field, is polyp segmentation for the early detection of colorectal cancer; the second is skin detection, which serves applications such as face detection, hand gesture recognition, and many others. On the first problem, we reach a Dice coefficient of 0.888 and a mean intersection over union (mIoU) of 0.825 on the competitive Kvasir-SEG dataset. The high performance of the proposed ensemble is confirmed in skin detection, where the proposed approach ranks first among state-of-the-art approaches (including HarDNet) on a large set of testing datasets.
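
The idea in the abstract can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the activation pool, the function names, the averaging fusion rule, and the 0.5 threshold are all assumptions made for illustration, since this record does not specify them. Training of DeepLabV3+ is omitted entirely; toy probability maps stand in for the outputs of the trained networks. The sketch shows three steps: drawing a random activation per layer for each ensemble member, fusing the members' per-pixel probabilities, and scoring the fused mask with the Dice coefficient and IoU.

```python
import math
import random

# Hypothetical activation pool: this record does not list the functions
# actually used in the paper.
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "leaky_relu": lambda x: x if x > 0 else 0.01 * x,
    "elu": lambda x: x if x > 0 else math.exp(x) - 1.0,
    "swish": lambda x: x / (1.0 + math.exp(-x)),
}


def build_ensemble_plans(num_members, num_layers, seed=0):
    """For each ensemble member, draw one activation name per layer at random."""
    rng = random.Random(seed)
    names = sorted(ACTIVATIONS)
    return [[rng.choice(names) for _ in range(num_layers)]
            for _ in range(num_members)]


def average_probabilities(member_maps):
    """Fuse per-pixel foreground probabilities by averaging (assumed rule)."""
    count = len(member_maps)
    return [sum(pixel_values) / count for pixel_values in zip(*member_maps)]


def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:  # both masks empty: treat as a perfect match
        return 1.0, 1.0
    dice = 2.0 * intersection / total
    iou = intersection / (total - intersection)
    return dice, iou


# One activation plan per member; each plan would configure one network.
plans = build_ensemble_plans(num_members=3, num_layers=4, seed=0)

# Toy stand-ins for the three trained members' outputs on four pixels.
member_maps = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.4, 0.3, 0.2],
    [0.7, 0.6, 0.9, 0.0],
]
fused = average_probabilities(member_maps)
mask = [1 if p >= 0.5 else 0 for p in fused]  # assumed threshold
dice, iou = dice_and_iou(mask, [1, 1, 1, 0])
print(plans[0], mask, round(dice, 3), round(iou, 3))
```

On this toy input the fused mask is [1, 0, 1, 0], which gives a Dice coefficient of 0.8 and an IoU of about 0.667 against the ground truth [1, 1, 1, 0]. The scores here are computed for a single mask; the mIoU reported in the abstract is a mean over a dataset.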

Year: 2021
Journal: SIGNALS
DOI: https://dx.doi.org/10.3390/signals2040047
Citation: Lumini, A., Nanni, L., Maguolo, G. (2021). Deep Ensembles Based on Stochastic Activations for Semantic Segmentation. SIGNALS, 2(4), 820-833 [10.3390/signals2040047].
All authors: Lumini, Alessandra; Nanni, Loris; Maguolo, Gianluca
Appears in types: 1.01 Journal article

Files in this item:

File	Access	Type	License	Size	Format
signals-02-00047 (2).pdf	Open access	Publisher's version (PDF) / Version of Record	Open access license, Creative Commons Attribution (CC BY)	1.6 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/849963
