GauS: Differentiable Scheduling Optimization via Gaussian Reparameterization

Yaohui Cai, Vesal Bakhtazad, Cunxi Yu, Zhiru Zhang

TL;DR

GauS is a differentiable framework that models operator scheduling as a stochastic relaxation over Gaussian distributions. It fully utilizes modern parallel computing devices such as GPUs and achieves Pareto-optimal results.

Abstract

Efficient operator scheduling is a fundamental challenge in software compilation and hardware synthesis. Recent differentiable approaches have sought to replace traditional methods, such as exact solvers and heuristics, with gradient-based search, but they typically rely on categorical distributions that fail to capture the ordinal nature of time and suffer from a parameter space that scales poorly. In this paper, we propose a novel differentiable framework, GauS, that models operator scheduling as a stochastic relaxation using Gaussian distributions and fully utilizes modern parallel computing devices such as GPUs. By representing schedules as continuous Gaussian variables, we capture the ordinal nature of time and reduce the optimization space by orders of magnitude. Our method flexibly represents various objectives and constraints, and it provides the first differentiable formulation of the complex pipelined scheduling problem. We evaluate our method on a range of benchmarks, demonstrating that GauS achieves Pareto-optimal results.

Paper Structure (32 sections, 25 equations, 14 figures, 1 table, 3 algorithms)

Figures (14)

  • Figure 1: Impact of optimization objectives on scheduling --- Nodes $v_i$ (operators) show resource requirements $r_i$ (number of dots) and bitwidths of the required storage units $b_i$ (number of arrows). The maximum depth ($D$) is set to $3$. The left schedule minimizes resources ($\mathcal{L}_{res}=4, \mathcal{L}_{mem}=3$); the right schedule minimizes memory footprint ($\mathcal{L}_{res}=5, \mathcal{L}_{mem}=2$).
  • Figure 2: Illustration of Gaussian reparameterization --- Each operator $v_i$ is parameterized as an independent continuous random variable $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ over the scheduling timeline. The mean $\mu_i$ represents the expected execution step, while the standard deviation $\sigma_i$ reflects the uncertainty level.
  • Figure 3: Latency constrained, resource and communication optimization --- Problem instance size ($|V|$) increases from left to right. A 15-minute time limit is applied to all methods. Relative solution quality is the ratio of each method's solution quality to that of GauS, where lower values correspond to higher solution quality. Light crosses (labeled as inf) denote no feasible solution or quality exceeding $4\times$ GauS performance; filled crosses indicate that GS-Schedule exceeded available CUDA memory (OOM). On shared feasible instances, GauS achieves a $\textbf{71.8\%}$ geometric mean improvement over GS-Schedule.
  • Figure 4: Overview of benchmark statistics.
  • Figure 5: Latency constrained, memory footprint optimization --- A 15-minute time limit is applied to all methods. Light crosses denote no feasible solution or quality exceeding $4\times$ GauS performance.
  • ...and 9 more figures
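
The Gaussian parameterization described in the Figure 2 caption can be illustrated with a small sketch. The code below is not the paper's implementation: the function names, the unit-wide step bins, and the tail handling are all assumptions made for illustration. It shows how an operator with $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ could induce a probability over discrete steps $0, \dots, D-1$, and how those probabilities could give an expected per-step resource usage. Each operator carries only two parameters ($\mu_i$, $\sigma_i$) regardless of $D$, whereas a categorical distribution over steps needs $D$ entries per operator.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x (x may be +/- infinity)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def step_probabilities(mu, sigma, depth):
    """Probability that X ~ N(mu, sigma^2) falls into each of `depth` steps.

    Assumption (illustrative, not from the paper): step t covers
    [t - 0.5, t + 0.5), and the first and last steps absorb the tails so
    that the probabilities sum to 1.
    """
    probs = []
    for t in range(depth):
        lo = -math.inf if t == 0 else t - 0.5
        hi = math.inf if t == depth - 1 else t + 0.5
        probs.append(normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma))
    return probs


def expected_usage(operators, depth):
    """Expected resource usage per step.

    `operators` is a list of (mu, sigma, r) tuples, where r is the resource
    requirement of the operator (hypothetical input format).
    """
    usage = [0.0] * depth
    for mu, sigma, r in operators:
        for t, p in enumerate(step_probabilities(mu, sigma, depth)):
            usage[t] += r * p
    return usage
```

As $\sigma_i$ shrinks, the probability concentrates on the step nearest $\mu_i$, so the relaxed schedule approaches a discrete one. In a gradient-based setting, these quantities would be computed with an autodiff library so that a loss on the expected usage can be differentiated with respect to $\mu_i$ and $\sigma_i$; that part is omitted here.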