Over the last few years, big data has gained significant traction due to many factors. While the public cloud model has been studied in depth to face the increasing demand for large-scale data processing capabilities, many organizations are now focusing on the hybrid cloud model, where the classic scenario is enriched with a private (company-owned) cloud, e.g., for the management of sensitive data. In this work, we propose HyMR, a policy to enable autonomic cloud bursting for clusters of virtual machines running MapReduce jobs over a hybrid cloud. This policy, together with an infrastructure-level system for resource provisioning in hybrid clouds, can be used to face the temporary (or permanent) lack of computational resources on the private cloud, allowing cloud bursting in the context of big data applications. By means of an empirical evaluation of the system's scale-up and scale-down performance, we show that the HyMR policy allows the user to significantly reduce the data-processing time, although this time is inevitably influenced by the inter-cloud bandwidth.
Loreti, D., Ciampolini, A. (2015). MapReduce over the Hybrid Cloud: a novel Infrastructure Management Policy. Institute of Electrical and Electronics Engineers Inc. [10.1109/UCC.2015.33].
MapReduce over the Hybrid Cloud: a novel Infrastructure Management Policy
Loreti, Daniela; Ciampolini, Anna
2015