HPC Application Parameter Autotuning on Edge Devices: A Bandit Learning Approach

Abrar Hossain, Abdel-Hameed A. Badawy, Mohammad A. Islam, Tapasya Patki, Kishwar Ahmed

TL;DR

This work tackles autotuning HPC configurations on resource-constrained edge devices by introducing LASP, a lightweight online autotuner built on a multi-armed bandit framework. LASP uses low-fidelity edge evaluations to guide high-fidelity HPC parameter choices via a UCB-based exploration-exploitation strategy and a reward that jointly considers execution time and power with user-defined weights $\alpha$ and $\beta$. Across four HPC applications (Lulesh, Kripke, Clomp, Hypre), LASP achieves meaningful performance gains with low regret and is robust to measurement noise, illustrating the viability of edge-proxy autotuning for efficient HPC deployment. The approach adapts dynamically to changing workloads and environments, offering a practical route to reducing autotuning costs while maintaining performance on edge-assisted HPC systems.
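
The UCB-based selection and the weighted time/power reward described above can be sketched roughly as follows. This is an illustration under stated assumptions, not LASP's implementation: the summary does not give the paper's exact reward or UCB variant, so the reward form (a weighted sum of normalized time and power savings), the normalization bounds `t_max` and `p_max`, the standard UCB1 bonus, and every function name here are hypothetical.

```python
import math


def weighted_reward(time_s, power_w, alpha, beta, t_max, p_max):
    """Assumed reward: higher is better, in [0, 1] when alpha + beta == 1.

    Time and power are normalized by user-supplied upper bounds (t_max, p_max),
    which are an assumption of this sketch, not values from the paper.
    """
    time_term = 1.0 - min(time_s / t_max, 1.0)
    power_term = 1.0 - min(power_w / p_max, 1.0)
    return alpha * time_term + beta * power_term


def ucb1_select(counts, means, t):
    """Pick an arm with the standard UCB1 rule; untried arms are pulled first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def tune(configs, measure, rounds, alpha=0.5, beta=0.5, t_max=1.0, p_max=1.0):
    """Run the bandit loop over candidate configurations.

    `measure(config)` is a hypothetical callback returning (time_s, power_w)
    from a low-fidelity run of the application with that configuration.
    Returns the most frequently selected configuration plus per-arm statistics.
    """
    counts = [0] * len(configs)
    means = [0.0] * len(configs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        time_s, power_w = measure(configs[arm])
        r = weighted_reward(time_s, power_w, alpha, beta, t_max, p_max)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean update
    best = max(range(len(configs)), key=counts.__getitem__)
    return configs[best], counts, means
```

Raising `alpha` relative to `beta` favors faster configurations, and the reverse favors lower power. Returning the most-pulled arm rather than the arm with the highest empirical mean is a design choice of this sketch, since pull counts tend to be less sensitive to a few noisy measurements.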

Abstract

The growing necessity for enhanced processing capabilities in edge devices with limited resources has led us to develop effective methods for improving high-performance computing (HPC) applications. In this paper, we introduce LASP (Lightweight Autotuning of Scientific Application Parameters), a novel strategy designed to address the parameter search space challenge in edge devices. Our strategy employs a multi-armed bandit (MAB) technique focused on online exploration and exploitation. Notably, LASP takes a dynamic approach, adapting seamlessly to changing environments. We tested LASP with four HPC applications: Lulesh, Kripke, Clomp, and Hypre. Its lightweight nature makes it particularly well-suited for resource-constrained edge devices. By employing the MAB framework to efficiently navigate the search space, we achieved significant performance improvements while adhering to the stringent computational limits of edge devices. Our experimental results demonstrate the effectiveness of LASP in optimizing parameter search on edge devices.

Paper Structure

This paper contains 19 sections, 10 equations, 12 figures, 2 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Framework to leverage edge devices to find the optimal parameters to execute applications on HPC clusters.
  • Figure 2: Overlap of optimal configurations in the low- and high-fidelity settings. (a) The top 20 configurations identified in the low-fidelity setting are compared to the optimal configuration when run in the high-fidelity setting of the target device, and the average distance between them is measured. (b) The number of configurations common to the top 20 of both the low-fidelity and high-fidelity settings.
  • Figure 3: Distribution of execution time for Kripke for all sets of configurations. (a) Tuning only two sets of parameters gives wide variance in the execution time. (b) Distribution of execution time for Kripke for all sets of configurations.
  • Figure 4: Runtime variability of Kripke for different parameters considered independently.
  • Figure 5: Block diagram of $\mathsf{LASP}$.
  • ...and 7 more figures