Nanni L., Lumini A., Loreggia A., Brahnam S., Cuza D. (2023). Deep ensembles and data augmentation for semantic segmentation. Amsterdam: Elsevier. https://doi.org/10.1016/B978-0-323-96129-5.00009-3
Deep ensembles and data augmentation for semantic segmentation
Nanni L.; Lumini A.; Loreggia A.; Brahnam S.; Cuza D.
2023
Abstract
The task of classifying each pixel in an image is known as semantic segmentation in the context of computer vision, and it is critical for image analysis in many domains. In clinical practice, for example, semantic segmentation is required to improve accuracy in identifying potential pathologies: polyp segmentation provides critical information for detecting colorectal cancer in its early stages. Autoencoder architectures that learn low-level semantic descriptions of an image are commonly used for semantic segmentation. This architecture consists of an encoder module that generates low-level data representations, which are then used by a second module (the decoder) that learns to rebuild the initial input. In this chapter, we tackle semantic segmentation by constructing a novel ensemble of convolutional neural networks (CNNs) and transformers. An ensemble is a machine learning method that trains different models to make predictions on a given input and then aggregates these predictions to compute a final decision. We enforce ensemble diversity by experimenting with various loss functions and data augmentation approaches. We combine DeepLabV3+, the HarDNet-MSEG CNN, and Pyramid Vision Transformers to create the proposed ensemble. We present a thorough empirical analysis of our system on three semantic segmentation problems: polyp detection, skin detection, and leukocyte recognition. Experiments show that our method produces state-of-the-art results.
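To make the fusion step described in the abstract concrete, the sketch below shows one common way to aggregate the per-pixel predictions of several segmentation networks by averaging their probability maps. The PyTorch wrapper, the model names in the comments, and the simple mean-plus-threshold rule are illustrative assumptions, not the chapter's exact fusion scheme.

```python
import torch
import torch.nn as nn

class SegmentationEnsemble(nn.Module):
    """Hypothetical fusion wrapper: average the mask probabilities of
    several segmentation models (e.g., DeepLabV3+, HarDNet-MSEG, a
    PVT-based network), each assumed to output per-pixel logits of
    shape (B, 1, H, W)."""

    def __init__(self, models):
        super().__init__()
        self.models = nn.ModuleList(models)

    @torch.no_grad()
    def forward(self, images):
        # Convert each model's logits to probabilities, then average
        # the probability maps across the ensemble members.
        probs = [torch.sigmoid(m(images)) for m in self.models]
        fused = torch.stack(probs, dim=0).mean(dim=0)
        # Threshold the fused probability map to get a binary mask.
        return (fused > 0.5).float()
```

In this sketch, diversity among the ensemble members would come from training each network with a different loss function and data augmentation pipeline, as the abstract describes; at inference time their probability maps are simply averaged before thresholding.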