Exploring the Potential of Wireless-enabled Multi-Chip AI Accelerators
Emmanuel Irabor, Mariam Musavi, Abhijit Das, Sergi Abadal
TL;DR
The paper examines data-movement bottlenecks in the Network-on-Package (NoP) that connects the chiplets of scalable multi-chip AI accelerators, and proposes wireless interconnects as a flexible complement to the wired NoP. It extends the GEMINI workload-mapping framework with a wireless channel to evaluate how wireless links can alleviate these bottlenecks for optimally mapped workloads. Using this wireless-enabled extension, a set of decision criteria, and the corresponding simulator modifications, the study shows that wireless interconnects can deliver speedups of 10% on average and up to 20%, with gains that are sensitive to load balancing between the wired and wireless planes. The work points to a viable path toward higher throughput and versatility in chiplet-based AI accelerators and informs future design choices on wireless versus wired interconnect trade-offs.
Abstract
The insatiable appetite of Artificial Intelligence (AI) workloads for computing power is pushing the industry to develop faster and more efficient accelerators. The rigidity of custom hardware, however, conflicts with the need for scalable and versatile architectures that can serve the evolving and heterogeneous pool of Machine Learning (ML) models in the literature. In this context, multi-chiplet architectures that assemble multiple (perhaps heterogeneous) accelerators are an appealing option, but one that is hindered by chip-to-chip interconnects that remain rigid and inefficient. In this paper, we explore the potential of wireless technology as a complement to existing wired interconnects in this multi-chiplet approach. Using a state-of-the-art evaluation framework, we show that wireless interconnects can lead to speedups of 10% on average and up to 20%. We also highlight the importance of load balancing between the wired and wireless interconnects, which will be further explored in future work.
