ASA -- The Adaptive Scheduling Algorithm

Abel Souza; Kristiaan Pelckmans; Devarshi Ghoshal; Lavanya Ramakrishnan; Johan Tordsson

TL;DR

The paper tackles prolonged queue waits in HPC batch systems for data-intensive scientific workflows by introducing ASA, an adaptive scheduling algorithm that learns queue waiting times online and proactively submits resource change requests to reduce inter-stage waiting. ASA uses a reinforcement-learning-inspired framework with a convergence proof; it maintains a distribution over a fixed set of waiting-time alternatives and updates it as workflow stages execute. Real-world experiments across two supercomputers and three representative workflows show ASA achieving near-optimal resource utilization while reducing average workflow queue waiting times by up to about 10% and makespan by around 2%, with robust performance under queue workload variability. The proposed Mesos-based Unified View and proactive scheduling library let workflow management systems (WMS) operate over a global resource pool, offering fault tolerance and elasticity while maintaining workflow ordering and QoS constraints.

Abstract

In High Performance Computing (HPC) infrastructures, the control of resources by batch systems can lead to prolonged queue waiting times and adverse effects on the overall execution times of applications, particularly in data-intensive and low-latency workflows where efficient processing hinges on resource planning and timely allocation. Allocating the maximum capacity upfront ensures the fastest execution but results in spare and idle resources, extended queue waits, and costly usage. Conversely, dynamic allocation based on workflow stage requirements optimizes resource usage but may negatively impact the total workflow makespan. To address these issues, we introduce ASA, the Adaptive Scheduling Algorithm. ASA is a novel, convergence-proven scheduling technique that minimizes jobs' inter-stage waiting times by estimating queue waiting times and proactively submitting resource change requests ahead of time. It strikes a balance between exploration (learning waiting times) and exploitation (applying what has been learnt). Real-world experiments over two supercomputer centers with scientific workflows demonstrate ASA's effectiveness, achieving near-optimal resource utilization and accuracy, with up to 10% and 2% reductions in average workflow queue waiting times and makespan, respectively.
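To make the idea of learning queue waiting times over a fixed set of alternatives more concrete, here is a minimal sketch. It is not the paper's algorithm: it uses a generic multiplicative-weights (exponential-weights) update, which is one common way to maintain such a distribution. The function names, the learning rate, the candidate waiting times, and the synthetic Gaussian queue waits are all assumptions made for illustration.

```python
import math
import random


def update_weights(weights, alternatives, observed_wait, learning_rate=0.5):
    """Multiplicative-weights update (illustrative, not ASA's exact rule).

    Each alternative is penalized by its absolute error against the
    observed wait, normalized by the spread of the alternatives.
    """
    scale = (max(alternatives) - min(alternatives)) or 1.0
    penalized = [
        w * math.exp(-learning_rate * abs(a - observed_wait) / scale)
        for w, a in zip(weights, alternatives)
    ]
    total = sum(penalized)
    return [w / total for w in penalized]


def sample_alternative(weights, alternatives, rng):
    """Sample an estimate from the current distribution.

    Sampling (instead of always taking the most likely value) keeps some
    exploration alive while the distribution is still flat.
    """
    return rng.choices(alternatives, weights=weights, k=1)[0]


def run_demo(seed=0, rounds=200):
    """Learn over hypothetical alternatives against synthetic queue waits."""
    rng = random.Random(seed)
    alternatives = [60.0, 300.0, 900.0, 1800.0]  # seconds, hypothetical
    weights = [1.0 / len(alternatives)] * len(alternatives)
    for _ in range(rounds):
        _estimate = sample_alternative(weights, alternatives, rng)
        # Synthetic "true" queue wait; a real system would observe this.
        observed = max(0.0, rng.gauss(900.0, 120.0))
        weights = update_weights(weights, alternatives, observed)
    return alternatives, weights


if __name__ == "__main__":
    alts, final_weights = run_demo()
    for alt, weight in zip(alts, final_weights):
        print(f"{alt:7.0f} s -> probability {weight:.4f}")
```

In this sketch the distribution concentrates on the alternative closest to the typical observed wait. How ASA batches its observations and which loss it uses are defined in the paper and are not reproduced here.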

Paper Structure (20 sections, 1 theorem, 4 equations, 9 figures, 2 tables, 1 algorithm)

INTRODUCTION
Background and Related Work
Related Work
Scheduling Tradeoffs for Scientific Workflows
Challenge: Waiting Time Estimation
ASA: The Adaptive Scheduling Architecture
Architecture
Algorithm
EVALUATION
Metrics
Computing Systems
Applications
Convergence Results
Sensitivity Analysis
Makespan Results
...and 5 more sections

Key Result

Theorem 1

Let $\theta=(\theta_1, \dots, \theta_m)\in\mathbb{R}^m$ be a fixed, given collection of waiting time alternatives amongst which to choose. Let the ASA algorithm run on a sequence of $t$ processes, and let $\eta(t)$ denote the number of mini-batches created by the algorithm as of time $t$. Then for a

Figures (9)

Figure 1: Excerpt from the Montage scientific workflow, an image mosaic software employed by NASA [berriman2004montage]. Different colors in the graph represent distinct sets of tasks within a stage. Outputs generated in each stage serve as inputs for subsequent stages, ultimately culminating in the final result.
Figure 2: (a) Big Job vs. (b) Per-Stage managed resource allocation strategies in HPC. (a): a single allocation for the entire workflow duration, with a single queue waiting time. (b): per-stage allocations with only as many resources as a particular stage requires, with extra inter-stage queue waiting times. Note the differences in makespan and resource charging in each case (the sum of the areas under the dashed red lines).
Figure 3: ASA - Architecture managing the physical resources. Tasks (the different shapes in the partitions) from different jobs can access resources from multiple jobs. The unified view layer enables users to apply different scheduling strategies, such as pro-active job submissions.
Figure 4: ASA - Algorithm workflow illustrating two concurrent pro-active submissions (2 and 3) within ongoing stages. Note the per-staged charging and lower workflow makespan.
Figure 5: ASA's estimation convergence over time regarding queue waiting time (dark dashed blue line) with three different sampling policies: Greedy (red dotted line), ASA's default (black line), and ASA tuned (light pink line).
...and 4 more figures
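
The tradeoff shown in Figures 2 and 4 can be illustrated with a small calculation. The sketch below is a simplified model, not the paper's evaluation: the `Stage` type, the three strategy functions, and all numbers are hypothetical. It assumes that a proactive request is submitted `lead` time units before the previous stage ends, that any remaining wait appears as an inter-stage gap, and that an allocation arriving early is charged while idle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    nodes: int    # nodes the stage needs
    runtime: int  # execution time, arbitrary time units
    wait: int     # actual queue wait when the stage is requested on its own


def big_job(stages, queue_wait):
    """One allocation sized for the widest stage, held for the whole workflow."""
    total_runtime = sum(s.runtime for s in stages)
    peak_nodes = max(s.nodes for s in stages)
    return queue_wait + total_runtime, peak_nodes * total_runtime


def per_stage(stages):
    """One allocation per stage, each requested only when the previous stage ends."""
    makespan = sum(s.wait + s.runtime for s in stages)
    charge = sum(s.nodes * s.runtime for s in stages)
    return makespan, charge


def proactive(stages, estimates):
    """Per-stage allocations requested ahead of time using wait estimates.

    `estimates` holds one estimated wait for every stage after the first.
    """
    first = stages[0]
    makespan = first.wait + first.runtime
    charge = first.nodes * first.runtime
    for prev, stage, estimate in zip(stages, stages[1:], estimates):
        # Simplification: never submit earlier than the previous stage's start.
        lead = min(estimate, prev.runtime)
        gap = max(0, stage.wait - lead)   # underestimate: workflow still waits
        idle = max(0, lead - stage.wait)  # overestimate: nodes charged while idle
        makespan += gap + stage.runtime
        charge += stage.nodes * (idle + stage.runtime)
    return makespan, charge


if __name__ == "__main__":
    workflow = [Stage(4, 100, 30), Stage(16, 50, 40), Stage(2, 200, 20)]
    print("big job   (makespan, charge):", big_job(workflow, queue_wait=30))
    print("per stage (makespan, charge):", per_stage(workflow))
    print("proactive (makespan, charge):", proactive(workflow, [35, 25]))
```

With these hypothetical numbers, the proactive variant keeps the charge close to the per-stage strategy while recovering most of the inter-stage waiting. The big-job queue wait is an input here; in practice a larger request may wait longer, which this model does not capture.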

Theorems & Definitions (1)

Theorem 1
