The widespread adoption of Internet of Things (IoT) motivated the emergence of mixed workloads in smart cities, where fast arriving geo-referenced big data streams are joined with archive tables, aiming at enriching streams with descriptive attributes that enable insightful analytics. Applications are now relying on finding, in real-time, to which geographical regions data streaming tuples belong. This problem requires a computationally intensive stream-static join for joining a dynamic stream with a disk-resident static table. In addition, the time-varying nature of fluctuation in geospatial data arriving online calls for an approximate solution that can trade-off QoS constraints while ensuring that the system survives sudden spikes in data loads. In this paper, we present SpatialSSJP, an adaptive spatial-aware approximate query processing system that specifically focuses on stream-static joins in a way that guarantees achieving an agreed set of Quality-of-Service goals and maintains geo-statistics of stateful online aggregations over stream-static join results. SpatialSSJP employs a state-of-art stratified-like sampling design to select well-balanced representative geospatial data stream samples and serve them to a stream-static geospatial join operator downstream. We implemented a prototype atop Spark Structured Streaming. Our extensive evaluations on big real datasets show that our system can survive and mitigate harsh join workloads and outperform state-of-art baselines by significant magnitudes, without risking rigorous error bounds in terms of the accuracy of the output results. SpatialSSJP achieves a relative accuracy gain against plain Spark joins of approximately 10% in worst cases but reaching up to 50% in best case scenarios.

SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor / Jawarneh, Isam Mashhour Al; Bellavista, Paolo; Corradi, Antonio; Foschini, Luca; Montanari, Rebecca. - In: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS. - ISSN 1045-9219. - STAMPA. - 35:1(2024), pp. 73-88. [10.1109/tpds.2023.3330669]

SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor

Jawarneh, Isam Mashhour Al;Bellavista, Paolo;Corradi, Antonio;Foschini, Luca;Montanari, Rebecca
2024

Abstract

The widespread adoption of Internet of Things (IoT) motivated the emergence of mixed workloads in smart cities, where fast arriving geo-referenced big data streams are joined with archive tables, aiming at enriching streams with descriptive attributes that enable insightful analytics. Applications are now relying on finding, in real-time, to which geographical regions data streaming tuples belong. This problem requires a computationally intensive stream-static join for joining a dynamic stream with a disk-resident static table. In addition, the time-varying nature of fluctuation in geospatial data arriving online calls for an approximate solution that can trade-off QoS constraints while ensuring that the system survives sudden spikes in data loads. In this paper, we present SpatialSSJP, an adaptive spatial-aware approximate query processing system that specifically focuses on stream-static joins in a way that guarantees achieving an agreed set of Quality-of-Service goals and maintains geo-statistics of stateful online aggregations over stream-static join results. SpatialSSJP employs a state-of-art stratified-like sampling design to select well-balanced representative geospatial data stream samples and serve them to a stream-static geospatial join operator downstream. We implemented a prototype atop Spark Structured Streaming. Our extensive evaluations on big real datasets show that our system can survive and mitigate harsh join workloads and outperform state-of-art baselines by significant magnitudes, without risking rigorous error bounds in terms of the accuracy of the output results. SpatialSSJP achieves a relative accuracy gain against plain Spark joins of approximately 10% in worst cases but reaching up to 50% in best case scenarios.
2024
SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor / Jawarneh, Isam Mashhour Al; Bellavista, Paolo; Corradi, Antonio; Foschini, Luca; Montanari, Rebecca. - In: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS. - ISSN 1045-9219. - STAMPA. - 35:1(2024), pp. 73-88. [10.1109/tpds.2023.3330669]
Jawarneh, Isam Mashhour Al; Bellavista, Paolo; Corradi, Antonio; Foschini, Luca; Montanari, Rebecca
File in questo prodotto:
File Dimensione Formato  
SpatialSSJP_QoS-Aware_Adaptive_Approximate_Stream-Static_Spatial_Join_Processor.pdf

accesso aperto

Tipo: Versione (PDF) editoriale
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione 1.85 MB
Formato Adobe PDF
1.85 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/962266
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact