Edoardo Redivo, Hien D. Nguyen, Mayetri Gupta (2020). Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions. Computational Statistics & Data Analysis, 152, 1-22. doi:10.1016/j.csda.2020.107040.

Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions

Edoardo Redivo; Hien D. Nguyen; Mayetri Gupta
2020

Abstract

Model-based clustering approaches generally assume that the observations to be clustered are generated from a mixture of distributions, with each component of the mixture corresponding to a particular parametric distribution. Most commonly, the underlying distribution is assumed to be normal, which is inadequate in many situations, for example when skewness or multimodality is present within the components. The problem intensifies as the data dimension increases, leading to inaccurate groupings and incorrect inference. A new Bayesian model-based clustering approach is proposed that can handle a variety of complexities in the data, based on a recently introduced family of geometric skew normal distributions. The performance of this methodology is illustrated through several simulation studies and applications to datasets from genomics and medicine.
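
For orientation, the mixture assumption described in the abstract can be written in the standard finite-mixture form. This is a generic sketch rather than the notation of the paper itself: the symbols $G$ (number of components), $\pi_g$ (mixing weights), $f_g$ (component densities) and $\theta_g$ (component parameters) are assumed here for illustration only.

```latex
% Generic finite mixture density (notation assumed, not taken from the paper)
f(x) = \sum_{g=1}^{G} \pi_g \, f_g(x \mid \theta_g),
\qquad \pi_g \ge 0, \quad \sum_{g=1}^{G} \pi_g = 1 .

% The common normal special case, with \theta_g = (\mu_g, \Sigma_g):
f_g(x \mid \theta_g) = \mathcal{N}(x \mid \mu_g, \Sigma_g) .
```

Under the normal special case each component is symmetric and unimodal, which is why skewness or multimodality within a cluster is poorly captured; the proposed approach replaces $f_g$ with a geometric skew normal density.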
Files in this item:
File: Bayesian Clustering of skewed and multimodal data using geometric skewed normal distributions.pdf
Open access from 01/07/2022
Description: AAM
Type: Postprint
License: Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 6.8 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/955862
Citations
  • PMC: not available
  • Scopus: 3
  • Web of Science (ISI): 3