Spatial-aware approximate big data stream processing

Al Jawarneh Isam Mashhour Hasan; Bellavista P.; Foschini L.; Montanari R.
2019

Abstract

Ubiquitous IoT edge devices and modern telemetry are producing unprecedented volumes of spatially tagged data. If these data could be explored interactively, they would offer deep insights into natural phenomena that might otherwise remain elusive. Applying spatial queries online is expensive, and the problem is aggravated by the fact that, more often than not, the full dataset population is not available in non-stationary settings. Sampling is a natural way to cope, since it allows estimators such as averages and totals of correlated parameters of interest to be approximated. In any sampling design, representativeness is the main criterion by which a method is judged. In a spatial context this loosely means sampling fairly, in a way that preserves spatial characteristics, so that responses to spatial queries are approximated more accurately. Current big data management systems either offer no out-of-the-box spatial-aware online sampling or, at best, rely on random sampling, which introduces too much uncertainty into the overall estimate. We have designed a QoS-aware, spatial-aware online sampling method that outperforms vanilla baselines by statistically significant margins. Our method sits atop the Apache Spark Structured Streaming codebase and has been tested against a benchmark consisting of a spatially augmented dataset with millions of records.
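The abstract does not specify how the sampling is designed, so the following is only a generic sketch of what "preserving spatial characteristics" can mean, not the authors' method. It assumes, purely for illustration, that strata are square grid cells, that records are hypothetical `(lon, lat, value)` tuples, and that a bounded batch is available (as in one micro-batch of a stream). The names `grid_cell`, `stratified_sample`, `estimate_mean` and the parameter `cell_deg` are all made up for this example.

```python
import math
import random
from collections import defaultdict


def grid_cell(lon, lat, cell_deg=0.1):
    """Map a point to a square grid cell; the cell acts as the stratum key (assumption)."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def stratified_sample(records, fraction, rng, cell_deg=0.1):
    """Draw the same fraction from every grid cell, keeping at least one record per cell.

    Returns {cell: (cell_population_size, sampled_records)}.
    """
    strata = defaultdict(list)
    for rec in records:
        strata[grid_cell(rec[0], rec[1], cell_deg)].append(rec)

    sample = {}
    for cell, members in strata.items():
        k = max(1, round(fraction * len(members)))
        sample[cell] = (len(members), rng.sample(members, k))
    return sample


def estimate_mean(sample):
    """Estimate the overall mean by weighting each cell's sample mean by its population size."""
    population = sum(size for size, _ in sample.values())
    weighted = sum(
        size * (sum(rec[2] for rec in picked) / len(picked))
        for size, picked in sample.values()
    )
    return weighted / population


# Hypothetical batch: one dense cell and one sparse cell with different values.
records = [(11.34, 44.49, 10.0)] * 900 + [(12.56, 41.89, 100.0)] * 100
sample = stratified_sample(records, fraction=0.1, rng=random.Random(42))
print(len(sample), estimate_mean(sample))
```

Because every cell contributes in proportion to its size, the sparse cell cannot be missed by chance, which is the kind of risk that plain random sampling carries for spatially skewed data.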
2019 IEEE Global Communications Conference, GLOBECOM 2019 - Proceedings, pp. 1-6
Files in this record:
9014291.pdf (Adobe PDF, 2.23 MB). Type: Postprint. License: free open access license. Open Access since 28/08/2021.


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/764219