Scheduling and dispatching tools for high-performance computing (HPC) machines have the key role of mapping jobs to the available resources, trying to maximize performance and quality-of-service (QoS). Allocation and Scheduling in the general case are well-known NP-hard problems, forcing commercial schedulers to adopt greedy approaches to improve performance and QoS. Search-based approaches featuring the exploration of the solution space have seldom been employed in this setting, but mostly applied in off-line scenarios. In this paper, we present the first search-based approach to job allocation and scheduling for HPC machines, working in a production environment. The scheduler is based on Constraint Programming, an effective programming technique for optimization problems. The resulting scheduler is flexible, as it can be easily customized for dealing with heterogeneous resources, user-defined constraints and different metrics. We evaluate our solution both on virtual machines using synthetic workloads, and on the Eurora HPC with production workloads. Tests on a wide range of operating conditions show significant improvements in waitings and QoS in mid-tier HPC machines w.r.t state-of-the-art commercial rule-based dispatchers. Furthermore, we analyze the conditions under which our approach outperforms commercial approaches, to create a portfolio of scheduling algorithms that ensures robustness, flexibility and scalability.

Bridi, T., Bartolini, A., Lombardi, M., Milano, M., Benini, L. (2016). A Constraint Programming Scheduler for Heterogeneous High-Performance Computing Machines. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 27(10), 2781-2794 [10.1109/TPDS.2016.2516997].

A Constraint Programming Scheduler for Heterogeneous High-Performance Computing Machines

BRIDI, THOMAS;BARTOLINI, ANDREA;LOMBARDI, MICHELE;MILANO, MICHELA;BENINI, LUCA
2016

Abstract

Scheduling and dispatching tools for high-performance computing (HPC) machines have the key role of mapping jobs to the available resources, trying to maximize performance and quality-of-service (QoS). Allocation and Scheduling in the general case are well-known NP-hard problems, forcing commercial schedulers to adopt greedy approaches to improve performance and QoS. Search-based approaches featuring the exploration of the solution space have seldom been employed in this setting, but mostly applied in off-line scenarios. In this paper, we present the first search-based approach to job allocation and scheduling for HPC machines, working in a production environment. The scheduler is based on Constraint Programming, an effective programming technique for optimization problems. The resulting scheduler is flexible, as it can be easily customized for dealing with heterogeneous resources, user-defined constraints and different metrics. We evaluate our solution both on virtual machines using synthetic workloads, and on the Eurora HPC with production workloads. Tests on a wide range of operating conditions show significant improvements in waitings and QoS in mid-tier HPC machines w.r.t state-of-the-art commercial rule-based dispatchers. Furthermore, we analyze the conditions under which our approach outperforms commercial approaches, to create a portfolio of scheduling algorithms that ensures robustness, flexibility and scalability.
2016
Bridi, T., Bartolini, A., Lombardi, M., Milano, M., Benini, L. (2016). A Constraint Programming Scheduler for Heterogeneous High-Performance Computing Machines. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 27(10), 2781-2794 [10.1109/TPDS.2016.2516997].
Bridi, Thomas; Bartolini, Andrea; Lombardi, Michele; Milano, Michela; Benini, Luca
File in questo prodotto:
File Dimensione Formato  
paper3.pdf

accesso aperto

Tipo: Postprint
Licenza: Licenza per accesso libero gratuito
Dimensione 1.97 MB
Formato Adobe PDF
1.97 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/571271
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 24
  • ???jsp.display-item.citation.isi??? 17
social impact