Model-based clustering approaches generally assume that the observations to be clustered are generated from a mixture of distributions, each component of the mixture corresponding to a particular parametric distribution. Most commonly, the underlying distribution is assumed to be normal, which is inadequate for many situations, for example when skewness or multimodality is present within the components. The problem is intensified when the data dimension increases, leading to inaccurate groupings and incorrect inference. A new Bayesian model-based clustering approach is proposed, that can handle a variety of complexities in the data, based on a recently introduced family of geometric skew normal distributions. The performance of this methodology is illustrated through a number of simulation studies and applications to a number of datasets from genomics and medicine.

Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions / Edoardo Redivo; Hien D. Nguyen; Mayetri Gupta. - In: COMPUTATIONAL STATISTICS & DATA ANALYSIS. - ISSN 0167-9473. - ELETTRONICO. - 152:(2020), pp. 107040.1-107040.22. [10.1016/j.csda.2020.107040]

Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions

Edoardo Redivo;
2020

Abstract

Model-based clustering approaches generally assume that the observations to be clustered are generated from a mixture of distributions, each component of the mixture corresponding to a particular parametric distribution. Most commonly, the underlying distribution is assumed to be normal, which is inadequate for many situations, for example when skewness or multimodality is present within the components. The problem is intensified when the data dimension increases, leading to inaccurate groupings and incorrect inference. A new Bayesian model-based clustering approach is proposed, that can handle a variety of complexities in the data, based on a recently introduced family of geometric skew normal distributions. The performance of this methodology is illustrated through a number of simulation studies and applications to a number of datasets from genomics and medicine.
2020
Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions / Edoardo Redivo; Hien D. Nguyen; Mayetri Gupta. - In: COMPUTATIONAL STATISTICS & DATA ANALYSIS. - ISSN 0167-9473. - ELETTRONICO. - 152:(2020), pp. 107040.1-107040.22. [10.1016/j.csda.2020.107040]
Edoardo Redivo; Hien D. Nguyen; Mayetri Gupta
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/955862
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 3
social impact