FPIA: Field-Programmable Ising Arrays with In-Memory Computing
George Higgins Hutchinson, Ethan Sifferman, Tinish Bhattacharya, Dmitri B. Strukov
TL;DR
The paper tackles scaling Ising Machines to practical, sparse quadratic unconstrained binary optimization (QUBO) problems by introducing Field-Programmable Ising Arrays (FPIA), an island-type in-memory computing (IMC) architecture inspired by FPGA designs. It combines a sparsity-aware weight-partitioning strategy with problem embedding and routing optimization using VPR-based tooling, and employs Bayesian optimization to tune tile parameters. The results show up to $60\times$ area reduction and faster operation relative to a naive baseline, with detailed analysis of routing, ADC overhead, and memory technologies. The work also discusses algorithm/circuit co-design and potential extensions such as auxiliary variables and heterogeneous IMC blocks. The findings indicate that FPIA can significantly improve area efficiency for sparse, high-dimensional QUBOs and point to practical pathways for further performance gains in IMC-based Ising accelerators.
Abstract
The Ising Machine is a promising computing approach for solving combinatorial optimization problems. It is naturally suited for energy-saving and compact in-memory computing implementations with emerging memories. A naïve in-memory computing implementation of a quadratic Ising Machine requires an array of coupling weights that grows quadratically with problem size. However, the resources in such an approach are used inefficiently due to sparsity in practical optimization problems. We first show that this issue can be addressed by partitioning a coupling array into smaller sub-arrays. This technique, however, requires interconnecting the sub-arrays; hence, we developed an in-memory computing architecture for quadratic Ising Machines inspired by island-type field-programmable gate arrays, which is the main contribution of our paper. We adapt open-source tools to optimize problem embedding and model routing overhead. Modeling results on benchmark problems for the developed architecture show up to 60x area improvement and faster operation than the baseline approach. Finally, we discuss algorithm/circuit co-design techniques for further improvements.
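
To make the sparsity argument concrete, the following is a minimal sketch, not the authors' partitioning method. It counts weight cells for a single monolithic coupling array versus a partitioned layout that stores only the sub-arrays containing at least one nonzero coupling. The ring-graph problem, the tile size, and the function names `occupied_tiles` and `weight_cells` are all assumptions chosen for illustration, and the sketch ignores the interconnect, routing, and ADC overhead that the paper models.

```python
def occupied_tiles(edges, tile):
    """Return the set of (row_block, col_block) tiles that hold a nonzero coupling.

    `edges` is an iterable of (i, j) index pairs with nonzero coupling weight.
    `tile` is the side length of one square sub-array (hypothetical parameter).
    """
    tiles = set()
    for i, j in edges:
        # The coupling matrix is symmetric, so record both (i, j) and (j, i).
        tiles.add((i // tile, j // tile))
        tiles.add((j // tile, i // tile))
    return tiles


def weight_cells(n, edges, tile):
    """Compare weight-cell counts: one n-by-n array vs. only the occupied tiles."""
    naive = n * n
    partitioned = len(occupied_tiles(edges, tile)) * tile * tile
    return naive, partitioned


# Illustrative sparse problem (an assumption): a ring of 64 spins,
# each coupled only to its next neighbor.
n = 64
tile = 8
ring = [(i, (i + 1) % n) for i in range(n)]

naive, partitioned = weight_cells(n, ring, tile)
print(f"naive cells: {naive}, partitioned cells: {partitioned}")
```

For this ring, 24 of the 64 possible tiles are occupied, so the partitioned layout needs 1536 weight cells instead of 4096. Sparser or better-ordered problems would shrink the occupied set further, while a real design must also pay for the connections between sub-arrays.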
