This work is motivated by the following question: given a sample of compositional data trajectories (i.e. sequences of composition measurements along a domain), how can one devise a segmentation procedure that yields homogeneous classes? In other words, we study statistical methods for clustering compositional data when each observation is itself a trajectory of compositions. Observed trajectories are known as “functional data”, and several methods have been proposed for their analysis. In particular, methods for clustering trajectories are known as Functional Cluster Analysis (FCA) (Ramsay and Silverman, 2005). However, FCA techniques have not yet been extended to compositional data trajectories. To do so, they have to be adapted by using a suitable algebra for compositions (Aitchison, 1986). We propose a methodology consisting of a preliminary smoothing of the compositional trajectories, followed by the construction of the metrics needed for both partitional and hierarchical clustering. A simulation study is performed to check the proposed methodology, and the quality of the results is assessed by means of several indices (Halkidi et al., 2001). Finally, the methodology is applied to a real environmental dataset containing measurements of the composition of particulate matter along vertical profiles on different days. The aim of the application is to detect typical behaviours (clusters) characterizing the vertical profiles of particulate matter compositions.
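The idea of building a metric for compositional trajectories and passing it to a hierarchical clustering routine can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the centred log-ratio (clr) transform is used as the representation of Aitchison geometry, the trajectory distance is an L2-type integral of the pointwise squared Aitchison distance approximated by the trapezoidal rule, the smoothing step is omitted, and the toy data, the function names (`clr`, `trajectory_distance`, `pairwise_distances`, `make_toy_sample`) and the average-linkage choice are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def clr(x):
    """Centred log-ratio transform along the last axis (all parts must be > 0)."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)


def trajectory_distance(a, b, t):
    """L2-type distance between two compositional trajectories.

    a, b: arrays of shape (n_points, n_parts); t: grid of shape (n_points,).
    The pointwise squared Aitchison distance (Euclidean distance between
    clr vectors) is integrated over t with the trapezoidal rule.
    This particular metric is an assumption of the sketch.
    """
    sq = np.sum((clr(a) - clr(b)) ** 2, axis=-1)
    integral = np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t))
    return np.sqrt(integral)


def pairwise_distances(trajs, t):
    """Symmetric matrix of trajectory distances."""
    n = len(trajs)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = trajectory_distance(trajs[i], trajs[j], t)
    return d


def make_toy_sample(seed=0):
    """Hypothetical data: two groups of five 3-part trajectories on [0, 1]."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 25)
    trajs = []
    for sign in (1.0, -1.0):  # one sign per group
        for _ in range(5):
            # Group-specific mean in log coordinates plus small noise
            z = np.stack([sign * 2.0 * t, np.zeros_like(t), -sign * 2.0 * t], axis=1)
            z = z + rng.normal(scale=0.1, size=z.shape)
            x = np.exp(z)
            trajs.append(x / x.sum(axis=1, keepdims=True))  # closure to unit sum
    return np.array(trajs), t


trajs, t = make_toy_sample()
dist = pairwise_distances(trajs, t)
# Hierarchical clustering on the condensed distance matrix, cut into 2 clusters
tree = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
```

The same distance matrix could in principle feed a partitional method that accepts precomputed dissimilarities; the hierarchical route is shown here only because it needs nothing beyond the matrix itself.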
Bruno F., Greco F. (2008). Clustering compositional data trajectories. PADOVA : CLEUP.
Clustering compositional data trajectories
BRUNO, FRANCESCA;GRECO, FEDELE PASQUALE
2008