The ever-increasing spread of mobile devices able to continuously gather sensing data creates favorable conditions for the development of smart city infrastructures. In this field, the analysis of spatial data plays a pivotal role because of its relevance in urban scenarios. Meeting this need calls for large distributed computing infrastructures supported by efficient frameworks such as Apache Spark, one of the most relevant platforms to date. However, to take better advantage of data and computing resources, flexible and easy-to-use specialized tools that provide domain-specific capabilities for spatial data analysis are also needed. This paper focuses on STARK, a novel framework for processing spatial data: it gives an overview of STARK's functionality and presents an in-depth assessment of its performance when running a spatial data clustering algorithm, namely DBSCAN. In particular, we focus on two DBSCAN implementations, MR-DBSCAN and NG-DBSCAN. We implemented the latter in STARK ourselves, both to enrich the framework and to test its capabilities.
Bellavista P., Campestri M., Foschini L., Montanari R. (2019). Clustering of Spatial Data with DBSCAN: An Assessment of STARK. Institute of Electrical and Electronics Engineers Inc. [10.1109/ISCC47284.2019.8969654].
Clustering of Spatial Data with DBSCAN: An Assessment of STARK
Bellavista P.;Foschini L.;Montanari R.
2019