A model-based clustering approach that simultaneously performs dimension reduction and variable selection is presented. Dimension reduction is achieved by assuming that the data have been generated by a linear factor model with latent variables modeled as Gaussian mixtures. Variable selection is performed by shrinking the factor loadings through a penalized likelihood method with an L1 penalty. A maximum likelihood estimation procedure via the EM algorithm is developed, and a modified BIC criterion for selecting the penalization parameter is illustrated. The effectiveness of the proposed model is explored in a Monte Carlo simulation study and in a real example.
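The model described in the abstract can be sketched in symbols as follows. This is only an illustrative formulation: the notation ($y_i$, $z_i$, $\Lambda$, $\Psi$, $\pi_k$, $\mu_k$, $\Sigma_k$, $\lambda$) is assumed here and is not taken from the record, and the paper's exact parameterization and penalty scaling may differ.

```latex
% Linear factor model: p observed variables, q < p latent factors (assumed notation)
y_i = \Lambda z_i + u_i, \qquad u_i \sim N_p(0, \Psi), \quad \Psi \text{ diagonal}

% Latent factors follow a Gaussian mixture with K components
z_i \sim \sum_{k=1}^{K} \pi_k \, N_q(\mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1

% L1-penalized log-likelihood, maximized via the EM algorithm
\ell_{\mathrm{pen}}(\theta) = \sum_{i=1}^{n} \log f(y_i; \theta)
  - \lambda \sum_{j=1}^{p} \sum_{h=1}^{q} \lvert \lambda_{jh} \rvert
```

Under this reading, a variable $j$ whose loadings $\lambda_{j1}, \dots, \lambda_{jq}$ are all shrunk to zero no longer depends on the latent mixture and so carries no clustering information, which is how the penalty performs variable selection. The penalization parameter $\lambda \ge 0$ is the quantity that the modified BIC criterion is used to choose.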
G. Galimberti, A. Montanari, C. Viroli (2009). Penalized factor mixture analysis for variable selection in clustered data. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 53, 4301-4310 [10.1016/j.csda.2009.05.025].
Penalized factor mixture analysis for variable selection in clustered data
GALIMBERTI, GIULIANO;MONTANARI, ANGELA;VIROLI, CINZIA
2009