SparseLUT: Sparse Connectivity Optimization for Lookup Table-based Deep Neural Networks

Binglei Lou, Ruilin Wu, Philip Leong

TL;DR

SparseLUT tackles the core challenge of LUT-based DNNs on FPGAs: achieving higher accuracy under fixed per-neuron fan-in constraints. It introduces a non-greedy, connectivity-centric training framework that dynamically prunes and regrows connections to meet a target fan-in $F$, while representing each connection with a trainable magnitude $\theta_k$ and a fixed sign $s_k$. Through a two-phase training regime and a relaxation of the requirement that drop and regrow counts match, SparseLUT consistently improves accuracy over random sparsity across multiple LUT-DNN baselines without adding hardware or routing overhead. The method achieves up to a 2.13% improvement on MNIST and 0.94% on Jet Substructure Classification, and demonstrates favorable qualitative connectivity patterns and hardware parity with existing designs, highlighting its practical impact for FPGA inference.
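The parameterization and the fan-in target described above can be illustrated with a minimal sketch. It assumes a connection is active only while its magnitude $\theta_k$ is positive, which is a common convention for sign/magnitude sparse training and is not stated in this summary. The helper names (`effective_weight`, `adjust_fan_in`) and the `regrow_value` parameter are hypothetical. Note that this sketch snaps a neuron to the target fan-in in a single hard step; it is a simplified stand-in, not the paper's non-greedy update, which deactivates connections gradually over several iterations.

```python
import random


def effective_weight(theta, sign):
    """Weight seen by the network: sign * magnitude while active, else 0 (assumed rule)."""
    return sign * theta if theta > 0 else 0.0


def adjust_fan_in(thetas, target_fan_in, regrow_value=1e-3, rng=None):
    """Return a copy of one neuron's magnitudes with exactly target_fan_in active entries.

    Simplified one-shot projection for illustration only.
    """
    rng = rng or random.Random(0)
    thetas = list(thetas)
    active = [k for k, t in enumerate(thetas) if t > 0]
    inactive = [k for k, t in enumerate(thetas) if t <= 0]

    if len(active) > target_fan_in:
        # Too many inputs: deactivate the connections with the smallest magnitudes.
        active.sort(key=lambda k: thetas[k])
        for k in active[: len(active) - target_fan_in]:
            thetas[k] = 0.0
    elif len(active) < target_fan_in:
        # Too few inputs: regrow randomly chosen inactive connections at a small magnitude.
        for k in rng.sample(inactive, target_fan_in - len(active)):
            thetas[k] = regrow_value
    return thetas


# A neuron with fan-in 3 is reduced to 2, and one with fan-in 1 is raised to 2.
too_many = adjust_fan_in([0.5, 0.1, 0.3, 0.0, 0.0], target_fan_in=2)
too_few = adjust_fan_in([0.5, 0.0, 0.0, 0.0, 0.0], target_fan_in=2)
```

Because the sign is fixed, only the magnitudes change during such a step; a regrown connection reuses the sign it was assigned at initialization.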

Abstract

The deployment of deep neural networks (DNNs) on resource-constrained edge devices such as field-programmable gate arrays (FPGAs) requires a careful balance of latency, power, and resource usage while maintaining high accuracy. Existing Lookup Table (LUT)-based DNNs, including LogicNets, PolyLUT, PolyLUT-Add, and NeuraLUT, exploit native FPGA resources with random sparse connectivity. This paper introduces SparseLUT, a connectivity-centric training technique tailored for LUT-based DNNs. SparseLUT leverages a non-greedy training strategy that prioritizes the pruning of less significant connections and strategically regrows alternative ones, resulting in efficient convergence to the target sparsity. Experimental results show consistent accuracy improvements across benchmarks, including up to a 2.13% increase on MNIST and a 0.94% improvement for Jet Substructure Classification compared to random sparsity. This is done without any hardware overhead and achieves state-of-the-art results for LUT-based DNNs.

Paper Structure

This paper contains 16 sections, 1 equation, 7 figures, 4 tables, 2 algorithms.

Figures (7)

  • Figure 1: Neuron computation with fan-in $N$. LUT-DNNs typically select a random subset (size $F \ll N$) of the inputs. SparseLUT is a training scheme to select the inputs while maximizing accuracy.
  • Figure 2: Example iteration showing SparseLUT adjusting neuron connections to reach a target fan-in of 2, adding connections to raise the fan-in from 1 to 2 and removing connections to reduce it from 3 to 2. Note the $\epsilon_2$ parameter is typically set to a much smaller value, requiring multiple iterations to deactivate a connection.
  • Figure 3: Illustration of the generalized LUT-DNN architectures with sparse connectivity.
  • Figure 4: Workflow of SparseLUT.
  • Figure 5: Heatmaps of the average weight matrix for the first layer in three sparse modes: Random Sparsity, DeepR$^*$, and SparseLUT, with a Fully Connected mode as a baseline.
  • ...and 2 more figures
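
The random-sparsity baseline mentioned in the Figure 1 caption, where each neuron reads a random subset of $F \ll N$ inputs, can be sketched as follows. This is a rough illustration under assumptions: the function names are hypothetical, and the neuron is modeled as a plain weighted sum, whereas the LUT-DNN variants named in the abstract differ in what each neuron computes.

```python
import random


def random_fan_in_indices(num_inputs, num_neurons, fan_in, seed=0):
    """Pick, for every neuron, fan_in distinct input indices uniformly at random."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(num_inputs), fan_in))
        for _ in range(num_neurons)
    ]


def neuron_pre_activation(x, weights, indices, bias=0.0):
    """Weighted sum over only the selected inputs (assumed neuron form)."""
    return sum(w * x[i] for w, i in zip(weights, indices)) + bias


# Four neurons, each wired to 3 of 16 inputs.
indices = random_fan_in_indices(num_inputs=16, num_neurons=4, fan_in=3)

# One neuron reading inputs 0 and 3 of a 4-element vector.
value = neuron_pre_activation([1.0, 2.0, 3.0, 4.0], [0.5, -1.0], [0, 3])
```

A learned connectivity scheme replaces only the index selection; the per-neuron fan-in, and therefore the size of the lookup table that implements the neuron, stays the same.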