Implementation and Analysis of Thermometer Encoding in DWN FPGA Accelerators
Michael Mecik, Martin Kumm
TL;DR
The paper analyzes the hardware cost of integrating thermometer encoding into differentiable weightless neural network (DWN) accelerators on FPGAs. It introduces a complete hardware generator covering the thermometer encoder, the LUT layer, and the classification logic, which enables a full resource evaluation on the Jet Substructure Classification (JSC) task. The key finding is that encoding can substantially inflate LUT usage, although post-training quantization and fine-tuning can mitigate this overhead while preserving accuracy. The work highlights the need for encoding-aware co-design and provides guidance on how encoder size, LUT count, and popcount logic shape overall hardware efficiency across model scales.
Abstract
Fully parallel neural network accelerators on field-programmable gate arrays (FPGAs) offer high throughput for latency-critical applications but face hardware resource constraints. Weightless neural networks (WNNs) efficiently replace arithmetic with logic-based inference. Differentiable weightless neural networks (DWNs) further optimize resource usage by learning the connections between encoders and LUT layers via gradient-based training. However, DWNs rely on thermometer encoding, and the associated hardware cost has not been fully evaluated. We present a DWN hardware generator that includes thermometer encoding explicitly. Experiments on the Jet Substructure Classification (JSC) task show that encoding can increase LUT usage by up to 3.20$\times$, dominating costs in small networks and highlighting the need for encoding-aware hardware design in DWN accelerators.
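To make the encoding step concrete, the following minimal Python sketch illustrates thermometer encoding in general. It is not the authors' generator: the function names, the strict `>` comparison, and the evenly spaced thresholds are assumptions chosen for illustration, and a trained DWN may place its thresholds differently.

```python
from typing import List, Sequence


def uniform_thresholds(lo: float, hi: float, bits: int) -> List[float]:
    """Hypothetical helper: `bits` evenly spaced thresholds strictly inside (lo, hi).

    Even spacing is an assumption made for this sketch only.
    """
    step = (hi - lo) / (bits + 1)
    return [lo + step * (i + 1) for i in range(bits)]


def thermometer_encode(x: float, thresholds: Sequence[float]) -> List[int]:
    """Return one bit per threshold: bit i is 1 when x exceeds thresholds[i].

    With ascending thresholds the result is a run of ones followed by zeros,
    which is what gives the code its name.
    """
    return [1 if x > t else 0 for t in thresholds]


# Example: a 4-bit code for an input feature assumed to lie in [0, 1].
thresholds = uniform_thresholds(0.0, 1.0, 4)
code = thermometer_encode(0.5, thresholds)  # [1, 1, 0, 0]
```

Under these assumptions each output bit is one comparison of the input against a constant, so the number of comparisons grows with both the number of input features and the number of bits per feature. This is one plausible reason why the encoder can account for a large share of the LUTs in small networks.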
