All-to-all reconfigurability with sparse and higher-order Ising machines

Srijan Nikhar; Sidharth Kannan; Navid Anjum Aadit; Shuvro Chowdhury; Kerem Y. Camsari

TL;DR

A sparse, multiplexed, and reconfigurable p-bit Ising Machine on Field-Programmable Gate Arrays, using adaptive parallel tempering and higher-order interactions to achieve competitive performance on the 3-Regular 3-XORSAT problem.

Abstract

Domain-specific hardware for solving computationally hard optimization problems has generated tremendous excitement. Here, we evaluate probabilistic bit (p-bit) based Ising Machines (IMs) on the 3-regular 3-Exclusive OR Satisfiability (3R3X) problem, a representative hard optimization problem. We first introduce a multiplexed architecture that emulates all-to-all network functionality while maintaining highly parallelized chromatic Gibbs sampling. We implement this architecture on a single Field-Programmable Gate Array (FPGA) and show that running the adaptive parallel tempering algorithm yields competitive algorithmic and prefactor advantages over alternative IMs by D-Wave, Toshiba, and Fujitsu. We also implement higher-order interactions that lead to better prefactors without changing the algorithmic scaling for the XORSAT problem. Even though FPGA implementations of p-bits are still not quite as fast as the best possible greedy algorithms accelerated on Graphics Processing Units (GPUs), scaled magnetic versions of p-bit IMs could lead to orders-of-magnitude improvements over the state of the art for generic optimization.

Paper Structure (14 sections, 9 equations, 8 figures, 1 table, 1 algorithm)

This paper contains 14 sections, 9 equations, 8 figures, 1 table, 1 algorithm.

Introduction
Results
Background on p-bits and XORSAT
Adaptive Parallel Tempering
All-to-All Reconfigurability via Sparse Network Multiplexing
Architecture-Enabled Scaling Advantage of p-computers
Higher-Order Interactions
p-computer Results on the XORSAT Challenge
Assumptions and Qualifications
Discussion
Methods
Third-order instance generation
Bipolar to Binary Weight Conversion
Calculation of Optimal Median Time to Solution

Figures (8)

Figure 1: Physics-inspired probabilistic computing: a) The many-body physics of interacting particles has local connectivity (sparse), asynchronous dynamics (clockless), and massive parallelism. b) Asynchronous p-computers take inspiration from the physics of (a). Each p-bit asynchronously (shown with phase-shifted clocks) receives input from local neighbors followed by probabilistic activation. c) The 3-regular 3-XORSAT (3R3X) problem [hen2019equation, Kowalsky2022] under study. d) We use the adaptive parallel tempering (APT) algorithm [aadit2023accelerating, mohseni2021nonequilibrium] on FPGA-based p-computers to solve the 3R3X problem. APT uses replicas of the original network, operated at different computational temperatures, where neighboring replicas swap their states based on a Metropolis criterion at regular intervals.
Figure 2: Multiplexed all-to-all reconfigurable master graph approach: a) For each problem size ($n$), multiple graph-colored sparse instances of the 3R3X problem are combined to form a dense master graph. b) In our architecture, neighbors and clocks for each p-bit are multiplexed using an instance selector. c) Pairwise swap acceptance rates show roughly equal probability in both master graph (FPGA) and all-to-all graph (CPU), obtained from APT across 8 replicas for $n = 80$. d) 20 instances with the highest success probabilities ($p_i$) are shown for $n = 80$. All $p_i$ values are computed from 1000 independent runs. e) Mean $p_i$ as a function of swap attempts at varying $n$. (c-e) establish equivalence between our master graph approach (FPGA) and all-to-all graph (CPU).
Figure 3: Performance comparison between CPU and FPGA implementations of second-order p-computers: a) The algorithmic complexity of the 3R3X problem as a function of Monte Carlo (MC) sweeps to the solution, independent of all-to-all (CPU) or master graph (FPGA) implementation. b) The average time to complete one MC sweep is shown for both CPU and FPGA. For CPU, we observe an $\mathcal{O}(n)$ dependence as in [Kowalsky2022]. For the master graph, we observe an $\mathcal{O}(1)$ dependence. c) Multiplying (a) with (b) yields time to solution (TTS), preserving the $\mathcal{O}(n)$ improvement over the CPU (see [chowdhury2023accelerated] for a similar analysis).
Figure 4: Validating hypergraph coloring of a fully connected XORSAT clause: Strong hypergraph coloring, weak hypergraph coloring, and the Boltzmann law are compared using $10^5$ samples. In a weak coloring, where two p-bits in the same clause can share a color, the network does not reach the Boltzmann distribution.
Figure 5: Hardware architecture for the higher-order master graph: Weights are selected using two multiplexers, one identical to the weight selector in the 2-body design and the other controlled by two neighboring spins passed through an AND gate. The binary nature of p-bits greatly simplifies the higher-order interactions, avoiding multiplications.
...and 3 more figures
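
The replica swap step described in the Figure 1d caption can be sketched in a few lines of Python. This is a minimal illustration of the standard parallel tempering exchange rule, under which neighboring replicas at inverse temperatures $\beta_a, \beta_b$ with energies $E_a, E_b$ swap with probability $\min(1, e^{(\beta_a - \beta_b)(E_a - E_b)})$. It is not the authors' FPGA implementation: the function names, the `betas` ladder, and the energy bookkeeping are assumptions made for illustration, and the adaptive selection of temperatures in APT is not shown.

```python
import math
import random


def swap_probability(beta_a, beta_b, energy_a, energy_b):
    """Metropolis acceptance probability for exchanging two replica states.

    Standard parallel tempering rule: min(1, exp((beta_a - beta_b) * (E_a - E_b))).
    """
    exponent = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def attempt_swaps(states, energies, betas, rng=random):
    """Attempt one swap per neighboring replica pair, modifying the lists in place.

    states, energies, betas are parallel lists (hypothetical layout, one entry
    per replica). Returns the number of accepted swaps.
    """
    accepted = 0
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            # Exchange the configurations (and their energies); temperatures stay put.
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted += 1
    return accepted
```

A swap is always accepted when it moves the lower-energy configuration to the colder replica (larger $\beta$), and is accepted with exponentially decreasing probability otherwise. Tracking the fraction of accepted swaps per neighboring pair gives the kind of pairwise acceptance rate reported in Figure 2c.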
