Dynamic Co-Optimization Compiler: Leveraging Multi-Agent Reinforcement Learning for Enhanced DNN Accelerator Performance
Arya Fayyazi, Mehdi Kamal, Massoud Pedram
TL;DR
DCOC addresses the problem of efficiently mapping DNN workloads onto heterogeneous accelerators. It is a Dynamic Co-Optimization Compiler that combines three specialized agents in a Centralized Training with Decentralized Execution (CTDE) multi-agent reinforcement learning framework with a Confidence Sampling mechanism. The approach jointly optimizes the hardware architecture and the software configuration, guided by a cost model that serves as a surrogate for runtime and is updated by a central critic, and it enforces hardware/software constraints through penalties. Empirically, DCOC improves throughput by up to 37.95% (about 1.17× on average in the experiments) and reduces optimization time by up to 42.2% across a range of models on a VTA++-like platform, outperforming AutoTVM and CHAMELEON. The method makes DNN accelerator deployment more practical by navigating the hardware/software co-design space efficiently and shortening compilation without sacrificing peak performance.
Abstract
This paper introduces a novel Dynamic Co-Optimization Compiler (DCOC), which employs an adaptive Multi-Agent Reinforcement Learning (MARL) framework to map machine learning (ML) models, particularly Deep Neural Networks (DNNs), onto diverse hardware platforms more efficiently. DCOC incorporates three specialized actor-critic agents within the MARL framework, each dedicated to a different facet of the optimization: one for hardware and two for software. This cooperative strategy yields an integrated hardware/software co-optimization approach that improves the precision and speed of DNN deployments. By focusing on high-confidence configurations, DCOC effectively reduces the search space. Our results demonstrate that DCOC enhances throughput by up to 37.95% while reducing optimization time by up to 42.2% across various DNN models, outperforming current state-of-the-art frameworks.
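The idea of restricting the search to high-confidence configurations can be illustrated with a minimal Python sketch. It is not the DCOC implementation: the function names `penalized_score` and `confidence_sample`, the softmax-based confidence, the fixed penalty weight, and the `tile` parameter are all assumptions made for illustration. The sketch scores each candidate with a surrogate estimate, subtracts a penalty for constraint violations, and keeps only the most confident fraction for real measurement.

```python
import math


def penalized_score(predicted_throughput, violations, penalty=10.0):
    """Surrogate score minus a penalty per violated constraint.

    The linear penalty form and its weight are illustrative assumptions.
    """
    return predicted_throughput - penalty * violations


def confidence_sample(candidates, scores, keep_fraction=0.25, temperature=1.0):
    """Keep the highest-confidence fraction of candidate configurations.

    Confidence is modelled here as a softmax over the scores, which is an
    assumption of this sketch rather than the paper's definition.
    Returns a list of (candidate, confidence) pairs, most confident first.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    if not candidates:
        return []

    # Numerically stable softmax: shift by the maximum score before exp().
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    confidences = [w / total for w in weights]

    # Always keep at least one candidate so the search can proceed.
    keep = max(1, int(len(candidates) * keep_fraction))
    ranked = sorted(range(len(candidates)),
                    key=lambda i: confidences[i], reverse=True)
    return [(candidates[i], confidences[i]) for i in ranked[:keep]]


if __name__ == "__main__":
    # Hypothetical tiling configurations with surrogate throughput estimates.
    configs = [{"tile": 8}, {"tile": 16}, {"tile": 32}, {"tile": 64}]
    estimates = [1.0, 3.0, 2.5, 4.0]
    violations = [0, 0, 0, 1]  # the last config breaks one constraint

    scores = [penalized_score(t, v) for t, v in zip(estimates, violations)]
    for config, confidence in confidence_sample(configs, scores, keep_fraction=0.5):
        print(config, round(confidence, 3))
```

In this toy setup the configuration with the highest raw estimate is discarded because its constraint penalty outweighs its predicted throughput, so only the remaining candidates would be sent for expensive hardware measurement.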
