Optimizing Hardware Resource Partitioning and Job Allocations on Modern GPUs under Power Caps

Eishi Arima; Minjoon Kang; Issa Saba; Josef Weidendorfer; Carsten Trinitis; Martin Schulz

TL;DR

This work tackles underutilization and high power consumption in CPU-GPU HPC systems by proposing a co-scheduling framework that jointly optimizes hardware-level GPU partitioning (via MIG), job allocations, and power budgets under power caps. The authors introduce an offline/online workflow that trains a linear-regression performance model on hardware counters and then solves two optimization problems: maximizing throughput subject to fairness under a fixed power cap, and maximizing throughput per unit power subject to fairness. Their model captures both scalability and interference effects through $RPerf_{App_i}(S,P) = C(S,P)\cdot H(F_{App_i}) + \sum_{j\neq i} D(S,P)\cdot J(F_{App_j})$, with coefficients learned for each $(S,P)$ configuration. Evaluation on an NVIDIA A100 with MIG shows accurate predictions (average errors of about $9.7\%$ for throughput and $14.5\%$ for fairness) and near-optimal throughput and energy efficiency across diverse workloads, supporting the method’s practical potential. The work advances power-aware, hardware-partitioned co-scheduling and paves the way for integration with cluster schedulers such as SLURM to optimize resource use in real HPC deployments.
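The model formula above can be made concrete with a small sketch. It is not the authors' implementation: it assumes that $C(S,P)$ and $D(S,P)$ are coefficient vectors, that $H$ and $J$ are identity maps over a per-application counter feature vector, and that $S$ names a partitioning/allocation and $P$ a power cap. The names `predict_rperf`, `MODEL`, and `FEATURES`, the `"4g+3g"` label, and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (S, P): a partitioning/allocation label and a power cap in watts.
# Labels and numbers here are illustrative only, not taken from the paper.
Config = Tuple[str, int]
Coeffs = Tuple[List[float], List[float]]  # (C(S,P), D(S,P)) as coefficient vectors


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product with a length check."""
    if len(a) != len(b):
        raise ValueError("coefficient and feature vectors must have equal length")
    return sum(x * y for x, y in zip(a, b))


def predict_rperf(i: int, features: Sequence[Sequence[float]], coeffs: Coeffs) -> float:
    """RPerf of application i: own-scalability term plus interference terms.

    Assumption: H and J are identity maps, so each term is a dot product
    between learned coefficients and an application's counter features.
    """
    c, d = coeffs
    own = dot(c, features[i])  # C(S,P) . H(F_App_i)
    interference = sum(        # sum over j != i of D(S,P) . J(F_App_j)
        dot(d, f) for j, f in enumerate(features) if j != i
    )
    return own + interference


# Hypothetical learned coefficients for one (S, P) configuration.
MODEL: Dict[Config, Coeffs] = {
    ("4g+3g", 250): ([0.5, 0.2], [-0.1, -0.05]),
}

# Hypothetical normalized counter features for two co-located applications.
FEATURES = [[1.0, 0.5], [0.4, 1.0]]

if __name__ == "__main__":
    for i in range(len(FEATURES)):
        print(i, round(predict_rperf(i, FEATURES, MODEL[("4g+3g", 250)]), 3))
```

Keeping one coefficient pair per $(S,P)$ key mirrors the statement that coefficients are learned for each configuration; an online step would then only need a table lookup and a few dot products.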

Abstract

CPU-GPU heterogeneous systems are now commonly used in HPC (High-Performance Computing). However, improving the utilization and energy efficiency of such systems remains one of the most critical issues. Because a single program typically cannot fully utilize all resources within a node/chip, co-scheduling (or co-locating) multiple programs with complementary resource requirements is a promising solution. Meanwhile, as power consumption has become a first-class design constraint for HPC systems, such co-scheduling techniques should be tailored to power-constrained environments. To this end, the industry has recently started supporting hardware-level resource partitioning features on modern GPUs to enable efficient co-scheduling, and these features can operate together with existing power capping features. For example, NVIDIA's MIG (Multi-Instance GPU) partitions a single GPU into multiple instances at the granularity of a GPC (Graphics Processing Cluster). In this paper, we explicitly target the combination of hardware-level GPU partitioning features and power capping for power-constrained HPC systems. We provide a systematic methodology to optimize the combination of chip partitioning, job allocations, and power capping based on our scalability/interference modeling, taking into account a variety of aspects such as compute/memory intensity and the utilization of heterogeneous computational resources (e.g., Tensor Cores). The experimental results indicate that our approach succeeds in selecting a near-optimal combination across multiple different workloads.
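The selection step described in the abstract can be pictured as a search over candidate combinations of partitioning, allocation, and power cap. The sketch below is only an illustration under stated assumptions: throughput is taken as the sum of predicted relative performances, fairness as their minimum, and the candidate set is small enough to enumerate. The function names, the partition labels, and the values in `PREDICTIONS` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Optional, Tuple

# (partitioning/allocation label, power cap in watts); labels are illustrative.
Config = Tuple[str, int]


def throughput(rperfs: List[float]) -> float:
    """Assumed throughput metric: sum of per-job relative performance."""
    return sum(rperfs)


def fairness(rperfs: List[float]) -> float:
    """Assumed fairness metric: the worst-off job's relative performance."""
    return min(rperfs)


def best_under_power_cap(
    predictions: Dict[Config, List[float]], power_cap: int, min_fairness: float
) -> Optional[Config]:
    """Maximize throughput over configs within the cap that meet the fairness floor."""
    feasible = [
        cfg for cfg, r in predictions.items()
        if cfg[1] <= power_cap and fairness(r) >= min_fairness
    ]
    return max(feasible, key=lambda cfg: throughput(predictions[cfg]), default=None)


def best_energy_efficiency(
    predictions: Dict[Config, List[float]], min_fairness: float
) -> Optional[Config]:
    """Maximize throughput per watt of power cap, subject to the fairness floor."""
    feasible = [cfg for cfg, r in predictions.items() if fairness(r) >= min_fairness]
    return max(
        feasible, key=lambda cfg: throughput(predictions[cfg]) / cfg[1], default=None
    )


# Hypothetical model outputs: predicted relative performance of two co-located jobs.
PREDICTIONS: Dict[Config, List[float]] = {
    ("4g+3g", 250): [0.70, 0.55],
    ("3g+4g", 250): [0.50, 0.80],
    ("4g+3g", 200): [0.60, 0.50],
    ("3g+4g", 150): [0.40, 0.55],
}

if __name__ == "__main__":
    print(best_under_power_cap(PREDICTIONS, power_cap=250, min_fairness=0.5))
    print(best_energy_efficiency(PREDICTIONS, min_fairness=0.5))
```

Exhaustive enumeration is chosen here only because the number of partitioning options on a single GPU is small; a larger search space would call for a different strategy.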

Paper Structure (26 sections, 5 equations, 11 figures, 6 tables)

Introduction
Background
Co-scheduling and Power Management on HPC Systems
MIG: Hardware-Level Concurrency Support on Modern GPUs
Observations
Scalability Observations
Co-scheduling Throughput
Optimizations
Workflow Overview
Problem Setups and Formulations
Linear Regression Performance Modeling
Evaluation
Evaluation Setup
Evaluation Environment
Workloads
...and 11 more sections

Figures (11)

Figure 1: Our Assumed HPC System and Our Scope
Figure 2: MIG with Private LLC/HBM Option
Figure 3: MIG with Shared LLC/HBM Option
Figure 4: Scalability Observations for Different Partitioning Options across Different Benchmarks (Power Cap: 250 W)
Figure 6: Impact of Resource Partitioning/Allocations on Co-scheduling Throughput (Power Cap: 250 W)
...and 6 more figures
