Identifying Multiple Cluster Structures in a Data Matrix

Soffritti G.

2003

Abstract

One aspect of cluster analysis that has been widely studied in recent years is the weighting and selection of variables. Procedures have been proposed that can identify the cluster structure present in a data matrix when that structure is confined to a subset of variables. Other methods assess the relative importance of each variable through a suitably chosen weight. However, when a cluster structure is present in more than one subset of variables and differs from one subset to another, these solutions, as well as standard clustering algorithms, can produce misleading results. Some very recent methodologies for finding consensus classifications of the same set of units can also be useful for identifying cluster structures in a data matrix, but each seems only partly satisfactory for this purpose. Therefore, a new, more specific procedure is proposed and illustrated by analyzing two real data sets; its performance is evaluated by means of a simulation experiment.
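
The situation described in the abstract, where different subsets of variables carry different cluster structures for the same units, can be illustrated with a small sketch. This is not the procedure proposed in the paper. The toy data, the largest-gap splitting rule, and the helper names `two_group_split` and `rand_index` are all assumptions made for illustration only. The sketch partitions eight units once on each of two variables and measures how often the two partitions agree on a pair of units.

```python
from itertools import combinations

# Toy data (invented for illustration): 8 units, 2 variables.
# Variable x separates units 0-3 from units 4-7;
# variable y separates even-numbered from odd-numbered units.
x = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
y = [10.0, 20.1, 9.8, 19.9, 10.2, 20.0, 10.1, 19.8]


def two_group_split(values):
    """Split 1-D values into two groups at the largest gap (a hypothetical, simplistic rule)."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, i = max(gaps)  # position of the widest gap between neighbours
    threshold = (ordered[i] + ordered[i + 1]) / 2
    return [int(v > threshold) for v in values]


def rand_index(a, b):
    """Share of unit pairs on which two partitions agree (together in both or apart in both)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


part_x = two_group_split(x)  # partition based on variable x only
part_y = two_group_split(y)  # partition based on variable y only

print("partition from x:", part_x)
print("partition from y:", part_y)
print("Rand index between them:", rand_index(part_x, part_y))
```

Each variable on its own yields a clear two-group partition, yet the two partitions disagree on many pairs of units. A single clustering computed on both variables together would therefore have to blur at least one of the two structures, which is the kind of misleading result the abstract warns about.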

Year: 2003
Journal: COMMUNICATIONS IN STATISTICS. SIMULATION AND COMPUTATION
DOI: https://dx.doi.org/10.1081/SAC-120023883
Citation: Soffritti G. (2003). Identifying Multiple Cluster Structures in a Data Matrix. COMMUNICATIONS IN STATISTICS. SIMULATION AND COMPUTATION, 32(4), 1151-1177 [10.1081/SAC-120023883].
All authors: Soffritti G.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/916983