Clustering of categorical data: a comparison of a model-based and a distance-based approach / Laura Anderlucci; Christian Hennig. - In: COMMUNICATIONS IN STATISTICS. THEORY AND METHODS. - ISSN 0361-0926. - PRINT. - 43:4(2014), pp. 704-721. [10.1080/03610926.2013.806665]
Clustering of categorical data: a comparison of a model-based and a distance-based approach
ANDERLUCCI, LAURA;HENNIG, CHRISTIAN MARTIN
2014
Abstract
For clustering multivariate categorical data, a latent class model-based approach (LCC) with local independence is compared with a distance-based approach, namely partitioning around medoids (PAM). A comprehensive simulation study was evaluated by both a model-based and a distance-based criterion. LCC was better according to the model-based criterion, and PAM was sometimes better according to the distance-based criterion. However, LCC's distance-based performance was good overall and sometimes better than PAM's, although this was not the case in a real data set on tribal art items.
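
To make the distance-based side of the comparison concrete, the following is a minimal sketch of partitioning around medoids on categorical data. It is not the authors' implementation. The simple matching dissimilarity, the naive initialisation, the first-improvement swap search, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify any of them.

```python
def simple_matching(a, b):
    """Fraction of variables on which two categorical observations differ
    (assumed dissimilarity; the abstract does not name one)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def total_cost(dist, medoids):
    """Sum over all observations of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(data, k):
    """Swap-based PAM sketch. Returns (medoid indices, cluster labels)."""
    n = len(data)
    dist = [[simple_matching(data[i], data[j]) for j in range(n)]
            for i in range(n)]
    medoids = list(range(k))  # naive start (assumption), not the usual BUILD step
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing the medoid at position `pos` with `cand`.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    # Assign each observation to its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels


# Hypothetical toy data: six observations on three categorical variables.
data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
medoids, labels = pam(data, 2)
```

On this toy data the first three observations share no category with the last three, so the two groups end up in separate clusters. A latent class approach would instead fit a mixture of locally independent categorical distributions and assign observations by posterior class probability.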