For clustering multivariate categorical data, a latent class model-based approach (LCC) with local independence is compared with a distance-based approach, namely partitioning around medoids (PAM). A comprehensive simulation study was evaluated by both a model-based and a distance-based criterion. LCC was better according to the model-based criterion, and PAM was sometimes better according to the distance-based criterion. However, LCC showed good overall distance-based performance, sometimes better than that of PAM, although this was not the case in a real data set on tribal art items.
Laura Anderlucci, Christian Hennig (2014). Clustering of categorical data: a comparison of a model-based and a distance-based approach. Communications in Statistics - Theory and Methods, 43(4), 704-721 [10.1080/03610926.2013.806665].
Clustering of categorical data: a comparison of a model-based and a distance-based approach
Anderlucci, Laura; Hennig, Christian Martin
2014