Mapping Spiking Neural Networks to Heterogeneous Crossbar Architectures using Integer Linear Programming
Devin Pohl, Aaron Young, Kazi Asifuzzaman, Narasinga Miniskar, Jeffrey Vetter
TL;DR
This paper tackles the challenge of efficiently mapping sparse Spiking Neural Networks (SNNs) onto Memristor Crossbar Architectures (MCAs) built from multiple crossbars of heterogeneous sizes. It introduces Integer Linear Programming (ILP) formulations that support arbitrary crossbar heterogeneity, explicitly model axon sharing, and optimize area, inter-crossbar routing, and runtime inter-crossbar spike traffic, the last via Profile-Guided Optimization (PGO). Key results show 16.7–27.6% area reductions for homogeneous MCAs and a further 66.9–72.7% area reduction when employing heterogeneous crossbars, plus 11.9–26.4% routing reductions without impacting area. PGO yields a further 0.5–14.8% reduction in inter-crossbar spikes while requiring 1–3 orders of magnitude less solver time. The work demonstrates that combining heterogeneous MCAs with ILP-driven mapping and PGO yields substantial efficiency gains, supporting scalable, energy-efficient neuromorphic hardware for large-scale SNN inference.
Abstract
Advances in novel hardware devices and architectures allow Spiking Neural Network (SNN) evaluation using ultra-low power, mixed-signal, memristor crossbar arrays. As individual network sizes quickly scale beyond the dimensional capabilities of single crossbars, networks must be mapped onto multiple crossbars. Crossbar sizes within modern Memristor Crossbar Architectures (MCAs) are determined predominantly not by device technology but by network topology; a larger number of smaller crossbars consumes less area thanks to the high structural sparsity found in larger, brain-inspired SNNs. Motivated by continuing increases in SNN sparsity due to improvements in training methods, we propose utilizing heterogeneous crossbar sizes to further reduce area consumption. This approach was previously unachievable, as prior compiler studies only explored solutions targeting homogeneous MCAs. Our work improves on the state of the art by providing Integer Linear Programming formulations supporting arbitrarily heterogeneous architectures. By modeling axonal interactions between neurons, our methods produce better mappings while removing inhibitive a priori knowledge requirements. We first show a 16.7–27.6% reduction in area consumption for square-crossbar homogeneous architectures. Then, we demonstrate a 66.9–72.7% further reduction when using a reasonable configuration of heterogeneous crossbar dimensions. Next, we present a new optimization formulation capable of minimizing the number of inter-crossbar routes. When applied to solutions already near-optimal in area, an 11.9–26.4% routing reduction is observed without impacting area consumption. Finally, we present a profile-guided optimization capable of minimizing the number of runtime spikes between crossbars. Compared to the best-area-then-route optimized solutions, we observe a further 0.5–14.8% inter-crossbar spike reduction while requiring 1–3 orders of magnitude less solver time.
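To make the area argument concrete, the following is a minimal toy sketch of the mapping problem, not the paper's ILP formulation. It assumes a simplified cost model in which a crossbar of shape (rows, cols) hosts a group of neurons if the group has at most `cols` neurons and its distinct presynaptic sources (shared axons counted once) number at most `rows`, with area `rows * cols` and an unlimited supply of each shape. Exhaustive enumeration of neuron partitions stands in for an ILP solver, so it is only practical for a handful of neurons. All names (`fan_in`, `shapes`, `best_mapping`) and the four-neuron network are hypothetical.

```python
def partitions(items):
    """Yield every way to split a list into non-empty groups."""
    if not items:
        yield []
        return
    head, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place head into each existing group in turn ...
        for i in range(len(part)):
            yield part[:i] + [[head] + part[i]] + part[i + 1:]
        # ... or give it a group (crossbar) of its own.
        yield [[head]] + part


def cheapest_fit(group, fan_in, shapes):
    """Smallest-area shape that can host the group, or None if none fits."""
    # Axon sharing: a source feeding several neurons in the group
    # occupies a single input row, so take the union of sources.
    axons = set().union(*(fan_in[n] for n in group))
    fits = [r * c for r, c in shapes if len(axons) <= r and len(group) <= c]
    return min(fits) if fits else None


def best_mapping(fan_in, shapes):
    """Return (minimum total area, partition achieving it) by brute force."""
    best_area, best_part = None, None
    for part in partitions(sorted(fan_in)):
        areas = [cheapest_fit(g, fan_in, shapes) for g in part]
        if None in areas:
            continue  # some group fits no available crossbar shape
        total = sum(areas)
        if best_area is None or total < best_area:
            best_area, best_part = total, part
    return best_area, best_part


# Hypothetical sparse network: neuron -> set of presynaptic sources.
fan_in = {
    "n0": {"a", "b"},
    "n1": {"a", "b"},
    "n2": {"c"},
    "n3": {"c", "d"},
}

homogeneous = best_mapping(fan_in, [(4, 4)])
heterogeneous = best_mapping(fan_in, [(4, 4), (2, 2)])
print(homogeneous[0], heterogeneous[0])
```

In this toy instance, the homogeneous pool needs one 4x4 crossbar (area 16), whereas adding a 2x2 shape lets the two neuron pairs that share axons each occupy a small crossbar (total area 8). The numbers illustrate the mechanism only and are unrelated to the reductions reported above.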
