Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance

Arya Fayyazi; Mehdi Kamal; Massoud Pedram

TL;DR

DCOC addresses the problem of efficiently mapping DNN workloads onto heterogeneous accelerators. It is a Dynamic Co-Optimization Compiler that combines three specialized agents, trained in a Centralized Training with Decentralized Execution (CTDE) multi-agent reinforcement learning framework, with a Confidence Sampling mechanism. The approach jointly optimizes the hardware architecture and the software configuration. It is guided by a cost model that serves as a surrogate for runtime and is updated by a central critic, and it enforces hardware and software constraints through penalties. Empirically, DCOC improves throughput by up to 37.95% (the figure reported in the abstract; about 1.17× on average in the experiments) and reduces optimization time by up to 42.2% across a range of models on a VTA++-like platform, outperforming AutoTVM and CHAMELEON. By navigating the hardware/software co-design space efficiently, the method shortens compilation without sacrificing peak performance, which makes DNN accelerator deployment more practical.

Abstract

This paper introduces a novel Dynamic Co-Optimization Compiler (DCOC), which employs an adaptive Multi-Agent Reinforcement Learning (MARL) framework to enhance the efficiency of mapping machine learning (ML) models, particularly Deep Neural Networks (DNNs), onto diverse hardware platforms. DCOC incorporates three specialized actor-critic agents within MARL, each dedicated to different optimization facets: one for hardware and two for software. This cooperative strategy results in an integrated hardware/software co-optimization approach, improving the precision and speed of DNN deployments. By focusing on high-confidence configurations, DCOC effectively reduces the search space, achieving remarkable performance over existing methods. Our results demonstrate that DCOC enhances throughput by up to 37.95% while reducing optimization time by up to 42.2% across various DNN models, outperforming current state-of-the-art frameworks.
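The idea of narrowing the search to high-confidence configurations while penalizing constraint violations can be illustrated with a small sketch. This is not the authors' implementation: the summary does not say how confidence is computed or how penalties are weighted, so the `Candidate` fields, the `penalty` weight, and the `threshold` value below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical hardware/software configuration proposed by the agents."""
    name: str
    confidence: float  # assumed: probability-like score in [0, 1]
    violations: int    # assumed: number of violated hardware/software constraints


def penalized_score(c: Candidate, penalty: float = 0.5) -> float:
    """Subtract a fixed penalty per violated constraint (illustrative weighting)."""
    return c.confidence - penalty * c.violations


def confidence_sample(cands: List[Candidate], threshold: float = 0.6) -> List[Candidate]:
    """Keep candidates whose penalized score clears the threshold, best first."""
    kept = [c for c in cands if penalized_score(c) >= threshold]
    return sorted(kept, key=penalized_score, reverse=True)


candidates = [
    Candidate("cfg_a", confidence=0.9, violations=0),
    Candidate("cfg_b", confidence=0.95, violations=1),  # confident but infeasible
    Candidate("cfg_c", confidence=0.4, violations=0),   # feasible but low confidence
    Candidate("cfg_d", confidence=0.7, violations=0),
]
selected = confidence_sample(candidates)
print([c.name for c in selected])
```

In this toy setup only `cfg_a` and `cfg_d` survive, so only two of the four candidates would need to be measured on hardware. A real system would presumably derive the confidence from the agents' policies or the cost model rather than from hand-set numbers.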

Paper Structure (15 sections, 3 equations, 6 figures, 5 tables, 2 algorithms)

Introduction
Background
  CTDE in MARL
  Workflow for DNN Compilers
Related Work
Proposed DCOC
  Overview of DCOC
  MARL Exploration
  Cost Model and Central Critic Update
  Incorporating Hardware and Software Constraints
  Confidence Sampling
Experimental Results
  Experimental Setup
  End-to-end Evaluation
Conclusion

Figures (6)

Figure 1: Overall search flow of DCOC.
Figure 2: High-level view of the MARL Exploration Module. Each agent has a policy network and, based on feedback from the centralized critic, takes an action in its own environment.
Figure 3: Configurations over time for the ResNet-18 model (a) before and (b) after applying the CS method.
Figure 4: Comparing the achieved throughput of different frameworks over AutoTVM on VTA++.
Figure 5: Comparing the compilation time of different frameworks (The percentages show the speedup of DCOC compared to AutoTVM).
