RL-MUL 2.0: Multiplier Design Optimization with Parallel Deep Reinforcement Learning and Space Reduction

Dongsheng Zuo; Jiadong Zhu; Yikang Ouyang; Yuzhe Ma

TL;DR

RL-MUL 2.0 is a reinforcement-learning framework that automatically optimizes multiplier designs across a very large design space. It encodes the compressor-tree structure with matrix and tensor representations and trains the agent with a Pareto-driven reward that balances area and delay. Parallel RL and search-space pruning improve on the original framework, and the method is extended to fused multiply-accumulate (MAC) designs. The resulting multipliers and MACs Pareto-dominate legacy and heuristic baselines across bit widths and synthesis environments, and the gains carry over when the designs are integrated into processing-element arrays. The results suggest that the approach could extend to larger datapath components.

Abstract

Multiplication is a fundamental operation in many applications, and multipliers are widely adopted in various circuits. However, optimizing multipliers is challenging due to the extensive design space. In this paper, we propose a multiplier design optimization framework based on reinforcement learning. We utilize matrix and tensor representations for the compressor tree of a multiplier, enabling seamless integration of convolutional neural networks as the agent network. The agent optimizes the multiplier structure using a Pareto-driven reward customized to balance area and delay. Furthermore, we enhance the original framework with parallel reinforcement learning and design space pruning techniques and extend its capability to optimize fused multiply-accumulate (MAC) designs. Experiments on multipliers of different bit widths demonstrate that the multipliers produced by our approach outperform all baseline designs in terms of area, power, and delay. The performance gain is further validated by comparing the area, power, and delay of processing element arrays built with multipliers from our approach and from the baseline approaches.
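
To make the notion of Pareto dominance between area and delay concrete, the following is a minimal Python sketch. It is not the paper's reward function, whose exact form is not given here; the names `Design`, `dominates`, and `pareto_front` and the sample numbers are illustrative assumptions. It only shows how a set of (area, delay) results can be filtered down to the non-dominated designs.

```python
from typing import List, Tuple

# Hypothetical representation: one design is an (area, delay) pair,
# and both objectives are to be minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep the designs that no other design dominates (input order preserved)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Illustrative numbers only, not results from the paper.
candidates = [(100.0, 2.0), (90.0, 2.5), (110.0, 1.8), (120.0, 2.2)]
front = pareto_front(candidates)
```

In this example the design `(120.0, 2.2)` is dropped because `(100.0, 2.0)` is better in both area and delay, while the other three remain as mutually non-dominated trade-offs.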
