Finite mixture models are useful tools for clustering two-way data sets within a sound statistical framework that can address important questions, such as how many clusters are present in the data. Models have also been proposed for clustering multilevel data, with the aim of producing clusterings of units at every level on the basis of all the available variables while accounting for the hierarchical structure of the data set. This paper introduces a new class of mixture models for two-level data sets that makes it possible to discover a clustering of level-2 units together with different clusterings of level-1 units, each corresponding to a different subset of the variables (multiple cluster structures). The new class is obtained by adapting to the multilevel setting a mixture model originally proposed to identify multiple cluster structures in a data matrix. The usefulness of the new method is illustrated using simulated data and a real example.
Galimberti G., Soffritti G. (2010). Finite mixture models for clustering multilevel data with multiple cluster structures. Statistical Modelling, 10(3), 265-290.
Finite mixture models for clustering multilevel data with multiple cluster structures
Galimberti, Giuliano; Soffritti, Gabriele
2010