SparkTune: tuning Spark SQL through query cost modeling / Enrico Gallinucci; Matteo Golfarelli. - Electronic. - (2019), pp. 546-549. (Paper presented at the 22nd International Conference on Extending Database Technology (EDBT), held in Lisbon, Portugal, March 26-29, 2019) [10.5441/002/edbt.2019.52].

SparkTune: tuning Spark SQL through query cost modeling

Enrico Gallinucci; Matteo Golfarelli
2019

Abstract

We demonstrate SparkTune, a tool that supports the evaluation and tuning of Spark SQL workloads from multiple perspectives. Unlike Spark SQL's optimizer, which mainly relies on a rule-based model, SparkTune adopts a cost-based model for SQL queries; this enables the accurate estimation of execution times and the identification of cost and complexity factors in a user-defined workload. The estimate is based on the cluster configuration, the database statistics (both automatically retrieved by the tool) and the resources allocated to the workload. Thus, for any given cluster, database and workload, SparkTune is able to identify the best cluster configuration to run the workload, to estimate the price to run it on a cloud platform while evaluating the performance/price trade-off, and more. SparkTune turns the cluster tuning efforts from manual and qualitative to automatic, optimized and quantitative.
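The performance/price trade-off mentioned above can be illustrated with a minimal sketch. It does not reproduce SparkTune's cost model: the Amdahl-style time estimate, the per-core hourly price, and all names (`Config`, `estimate_seconds`, `price`, `evaluate`, `cheapest_within_deadline`) are hypothetical, chosen only to show how candidate cluster configurations might be compared once a time estimate is available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical cluster configuration: executors and cores per executor."""
    executors: int
    cores_per_executor: int

    @property
    def cores(self) -> int:
        return self.executors * self.cores_per_executor


def estimate_seconds(cfg: Config, serial_s: float, parallel_s: float) -> float:
    # Toy assumption (NOT SparkTune's model): a fixed serial part plus a
    # perfectly parallel part divided evenly across all allocated cores.
    return serial_s + parallel_s / cfg.cores


def price(cfg: Config, seconds: float, price_per_core_hour: float) -> float:
    # Toy assumption: the platform bills per core, proportionally to run time.
    return cfg.cores * price_per_core_hour * seconds / 3600.0


def evaluate(configs, serial_s, parallel_s, price_per_core_hour):
    """Return one (config, estimated seconds, estimated price) row per config."""
    rows = []
    for cfg in configs:
        t = estimate_seconds(cfg, serial_s, parallel_s)
        rows.append((cfg, t, price(cfg, t, price_per_core_hour)))
    return rows


def cheapest_within_deadline(rows, deadline_s):
    """Pick the cheapest row whose estimated time meets the deadline, if any."""
    feasible = [row for row in rows if row[1] <= deadline_s]
    return min(feasible, key=lambda row: row[2]) if feasible else None


if __name__ == "__main__":
    candidates = [Config(1, 4), Config(2, 4), Config(4, 4)]
    for cfg, seconds, cost in evaluate(candidates, 60.0, 3600.0, 0.05):
        print(f"{cfg}: {seconds:.0f} s, {cost:.4f} per run")
```

Under these assumptions, adding cores shortens the run but raises its price, so the choice depends on the constraint applied; here a deadline is used, but a budget cap would work the same way.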
Year: 2019
Published in: Advances in Database Technology - EDBT 2019, 22nd International Conference on Extending Database Technology, Proceedings
Pages: 546-549
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/733372