GPU Strategies for Distance-Based Outlier Detection

Angiulli, Fabrizio; Basta, Stefano; Lodi, Stefano; Sartori, Claudio

doi:10.1109/TPDS.2016.2528984

The process of discovering interesting patterns in large, possibly huge, data sets is referred to as data mining, and can be performed in several flavours, known as "data mining functions." Among these functions, outlier detection discovers observations which deviate substantially from the rest of the data, and has many important practical applications. Outlier detection in very large data sets is, however, computationally very demanding and currently requires high-performance computing facilities. We propose a family of parallel and distributed algorithms for graphics processing units (GPUs) derived from two distance-based outlier detection algorithms: BruteForce and SolvingSet. The algorithms differ in the way they exploit the architecture and memory hierarchy of the GPU and guarantee significant improvements with respect to the CPU versions, both in terms of scalability and exploitation of parallelism. We provide a detailed discussion of their computational properties and measure their performance through extensive experiments, comparing the implementations and showing significant speedups. © 2016 IEEE.
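To make the idea of brute-force distance-based outlier detection concrete, here is a minimal CPU-only sketch in Python. It is not the authors' GPU algorithm. It assumes one common formulation in which a point's outlier score is the sum of the distances to its k nearest neighbours and the n highest-scoring points are reported; the abstract does not state the exact definition used in the paper. The names `knn_weight` and `top_outliers` are hypothetical.

```python
import heapq
import math


def knn_weight(index, points, k):
    """Sum of distances from points[index] to its k nearest other points.

    Assumed scoring rule for illustration; the abstract does not specify it.
    """
    p = points[index]
    # Brute force: compute the distance to every other point.
    dists = (math.dist(p, q) for j, q in enumerate(points) if j != index)
    return sum(heapq.nsmallest(k, dists))


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-NN weight."""
    weights = [knn_weight(i, points, k) for i in range(len(points))]
    return heapq.nlargest(n, range(len(points)), key=weights.__getitem__)


if __name__ == "__main__":
    # Four points clustered near the origin and one far away.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
    print(top_outliers(data, k=2, n=1))  # prints [4]
```

Every point is compared with every other point, so the cost grows quadratically with the data set size. Each point's score is computed independently of the others, which is what makes this kind of computation a natural candidate for parallel hardware.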

Angiulli, F., Basta, S., Lodi, S., Sartori, C. (2016). GPU Strategies for Distance-Based Outlier Detection. IEEE Transactions on Parallel and Distributed Systems, 27(11), 3256-3268 [10.1109/TPDS.2016.2528984].


Year: 2016
Journal: IEEE Transactions on Parallel and Distributed Systems
DOI: https://dx.doi.org/10.1109/TPDS.2016.2528984
Citation: Angiulli, F., Basta, S., Lodi, S., Sartori, C. (2016). GPU Strategies for Distance-Based Outlier Detection. IEEE Transactions on Parallel and Distributed Systems, 27(11), 3256-3268 [10.1109/TPDS.2016.2528984].
All authors: Angiulli, Fabrizio; Basta, Stefano; Lodi, Stefano; Sartori, Claudio
Publication type: 1.01 Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/585645
