We introduce a dimension reduction method for visualizing the clustering structure obtained from a finite mixture of Gaussian densities. Information on the dimension reduction subspace is obtained from the variation in group means and, depending on the estimated mixture model, from the variation in group covariances. The proposed method aims at reducing the dimensionality by identifying a set of linear combinations of the original features, ordered by importance as quantified by the associated eigenvalues, which capture most of the cluster structure contained in the data. Observations may then be projected onto such a reduced subspace, thus providing summary plots which help to visualize the clustering structure. These plots can be particularly appealing in the case of high-dimensional data and noisy structure. The newly constructed variables capture most of the clustering information available in the data, and they can be further reduced to improve clustering performance. We illustrate the approach on both simulated and real data sets. © 2009 Springer Science+Business Media, LLC.
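
As a rough illustration of the idea of eigenvalue-ordered linear combinations, the sketch below uses only the variation in group means and leaves out the variation in group covariances mentioned in the abstract. It is a simplified stand-in, not the method of the paper: the function name `mean_variation_directions`, the use of the overall sample covariance as the reference metric, and the use of known group labels in place of a fitted Gaussian mixture classification are all assumptions made for this example.

```python
import numpy as np


def mean_variation_directions(X, labels):
    """Hypothetical helper: directions ordered by eigenvalue, built from
    the variation in group means only (an assumption of this sketch).

    Solves the generalized eigenproblem M v = lambda * S v, where M is the
    weighted between-group covariance of the group means and S is the
    overall sample covariance of the data.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, p = X.shape
    overall_mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)  # overall covariance (divisor n)

    # Weighted outer products of the centred group means.
    M = np.zeros((p, p))
    for g in np.unique(labels):
        Xg = X[labels == g]
        d = Xg.mean(axis=0) - overall_mean
        M += (len(Xg) / n) * np.outer(d, d)

    # Reduce the generalized problem to a symmetric one via S^{-1/2}.
    evals_S, evecs_S = np.linalg.eigh(S)
    S_inv_sqrt = evecs_S @ np.diag(evals_S ** -0.5) @ evecs_S.T
    evals, evecs = np.linalg.eigh(S_inv_sqrt @ M @ S_inv_sqrt)

    # Sort directions by decreasing eigenvalue (importance).
    order = np.argsort(evals)[::-1]
    evals = evals[order]
    basis = S_inv_sqrt @ evecs[:, order]
    return evals, basis


# Toy data: two groups in five dimensions, separated along the first feature.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[labels == 1, 0] += 4.0

evals, basis = mean_variation_directions(X, labels)
Z = X @ basis[:, :1]  # observations projected onto the leading direction
```

With two groups, the matrix of mean variation has rank one, so only the first eigenvalue is materially above zero and a single direction carries all of the separation in means. Plotting `Z` coloured by `labels` would give the kind of summary plot the abstract describes.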

Scrucca, L. (2010). Dimension reduction for model-based clustering. STATISTICS AND COMPUTING, 20(4), 471-484 [10.1007/s11222-009-9138-7].

Dimension reduction for model-based clustering

Scrucca L.

2010

Year: 2010
Journal: STATISTICS AND COMPUTING
DOI: https://dx.doi.org/10.1007/s11222-009-9138-7
Citation: Scrucca, L. (2010). Dimension reduction for model-based clustering. STATISTICS AND COMPUTING, 20(4), 471-484 [10.1007/s11222-009-9138-7].
All authors: Scrucca, L.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/997671
