Edoardo Redivo, Hien D. Nguyen, Mayetri Gupta (2020). Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions. Computational Statistics & Data Analysis, 152, 1-22 [10.1016/j.csda.2020.107040].
Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions
Edoardo Redivo
2020
Abstract
Model-based clustering approaches generally assume that the observations to be clustered are generated from a mixture of distributions, with each component of the mixture corresponding to a particular parametric distribution. Most commonly, the underlying distribution is assumed to be normal, which is inadequate in many situations, for example when skewness or multimodality is present within the components. The problem intensifies as the data dimension increases, leading to inaccurate groupings and incorrect inference. A new Bayesian model-based clustering approach is proposed that can handle a variety of complexities in the data, based on a recently introduced family of geometric skew normal distributions. The performance of this methodology is illustrated through several simulation studies and applications to datasets from genomics and medicine.

File | Size | Format |
---|---|---|
Bayesian Clustering of skewed and multimodal data using geometric skewed normal distributions.pdf. Open Access since 01/07/2022. Description: AAM. Type: Postprint. License: Open Access license, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 6.8 MB | Adobe PDF |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.