ReducedLUT: Table Decomposition with "Don't Care" Conditions
Oliver Cassidy, Marta Andronic, Samuel Coward, George A. Constantinides
TL;DR
The paper tackles the challenge of compressing the irregular lookup tables found in LUT-based neural networks. It introduces ReducedLUT, which injects don't-care conditions into logical LUT (L-LUT) entries to increase self-similarity and enable more aggressive decomposition. Building on CompressedLUT, ReducedLUT uses a don't-care-based search, a similarity-driven L-LUT compression, and an exiguity constraint to minimize the number of unique sub-tables. Empirical results on NeuraLUT-based networks show reductions of up to 39% in physical LUT (P-LUT) utilization with minimal accuracy loss, outperforming baseline methods across multiple benchmarks. The work broadens the applicability of LUT compression to irregular DNN LUTs and offers an open-source toolflow for hardware-efficient NN inference on FPGAs.
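The 39% figure here and the $1.63\times$ figure in the abstract appear to describe the same saving, assuming both refer to the same best-case measurement. A $1.63\times$ reduction leaves $1/1.63$ of the original P-LUT count, so the relative saving is

$$1 - \frac{1}{1.63} \approx 0.387,$$

which rounds to 39%.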
Abstract
Lookup tables (LUTs) are frequently used to efficiently store arrays of precomputed values for complex mathematical computations. When used in the context of neural networks, these functions exhibit a lack of recognizable patterns, which presents an unusual challenge for conventional logic synthesis techniques. Several approaches are known to break down a single large lookup table into multiple smaller ones that can be recombined. Traditional methods, such as plain tabulation, piecewise linear approximation, and multipartite table methods, often yield inefficient hardware solutions when applied to LUT-based NNs. This paper introduces ReducedLUT, a novel method to reduce the footprint of the LUTs by injecting don't cares into the compression process. This additional freedom introduces more self-similarities, which can be exploited using known decomposition techniques. We then demonstrate a particular application to machine learning: by replacing unobserved patterns within the training data of neural network models with don't cares, we enable greater compression with minimal model accuracy degradation. In practice, we achieve up to $1.63\times$ reduction in physical LUT (P-LUT) utilization, with a test accuracy drop of no more than $0.01$ accuracy points.
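The idea that don't cares create extra self-similarity can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only tests exact equality between sub-tables, whereas the decomposition described in the paper builds on CompressedLUT. The names `DONT_CARE`, `mark_dont_cares`, `split`, `compatible`, `merge`, and `count_unique` are hypothetical, and the table values and the set of observed inputs are made up for illustration.

```python
from typing import List, Optional, Set

# Hypothetical marker for an entry whose value is unconstrained.
DONT_CARE = None

Entry = Optional[int]


def mark_dont_cares(table: List[int], observed: Set[int]) -> List[Entry]:
    """Keep entries whose input index was observed; mark the rest as don't care."""
    return [v if i in observed else DONT_CARE for i, v in enumerate(table)]


def split(table: List[Entry], size: int) -> List[List[Entry]]:
    """Cut the table into consecutive sub-tables of `size` entries."""
    return [table[i:i + size] for i in range(0, len(table), size)]


def compatible(a: List[Entry], b: List[Entry]) -> bool:
    """Two sub-tables can share storage if they agree wherever both are specified."""
    return all(x is DONT_CARE or y is DONT_CARE or x == y for x, y in zip(a, b))


def merge(a: List[Entry], b: List[Entry]) -> List[Entry]:
    """Combine two compatible sub-tables, filling don't cares from the other one."""
    return [y if x is DONT_CARE else x for x, y in zip(a, b)]


def count_unique(subtables: List[List[Entry]]) -> int:
    """Greedy heuristic: fold each sub-table into the first compatible representative."""
    reps: List[List[Entry]] = []
    for st in subtables:
        for k, rep in enumerate(reps):
            if compatible(rep, st):
                reps[k] = merge(rep, st)
                break
        else:
            reps.append(list(st))
    return len(reps)


# Illustrative 8-entry table; input index 6 is assumed never to occur in training data.
table = [3, 1, 4, 1, 3, 1, 5, 1]
observed = {0, 1, 2, 3, 4, 5, 7}

print(count_unique(split(table, 4)))                             # 2
print(count_unique(split(mark_dont_cares(table, observed), 4)))  # 1
```

With every entry specified, the two halves differ in one position and must be stored separately. Once the unobserved entry becomes a don't care, the halves are compatible and one stored sub-table serves both. Compatibility under don't cares is not transitive, so the greedy pass above is only a heuristic and does not guarantee the minimum number of unique sub-tables.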
