Efficient and Mathematically Robust Operations for Certified Neural Networks Inference
Fabien Geyer, Johannes Freitag, Tobias Schulz, Sascha Uhrig
TL;DR
The paper tackles certification-ready NN inference for aviation by comparing IEEE 754 floating-point and fixed-point representations on FPGA hardware, focusing on numerical robustness and predictable execution. It systematically evaluates rounding modes, summation and dot-product algorithms, and empirically tunes bit-widths using CNNs on MNIST and ResNet18, demonstrating that fixed-point with adequate fractional bits can achieve necessary accuracy with hardware efficiency. Key contributions include a hardware-aware assessment of summation/dot-product strategies, evidence that exact or compensated summation outperforms naive approaches for FP, and the finding that fixed-point often yields the best trade-offs for certified inference. The findings inform practical bit-width selection and hardware design strategies for certification-focused NN inference in safety-critical domains.
Abstract
In recent years, machine learning (ML) and neural networks (NNs) have gained widespread use and attention across various domains, particularly in transportation for achieving autonomy, including the emergence of flying taxis for urban air mobility (UAM). However, concerns about certification have arisen, compelling the development of standardized processes encompassing the entire ML and NN pipeline. This paper focuses on the inference stage and the requisite hardware, highlighting the challenges associated with IEEE 754 floating-point arithmetic and proposing alternative number representations. By evaluating diverse summation and dot-product algorithms, we aim to mitigate issues related to the non-associativity of floating-point addition. Additionally, our exploration of fixed-point arithmetic reveals its advantages over floating-point methods, demonstrating significant hardware efficiencies. Because bit-width optimization is inherently complex, we take an empirical approach to determine the optimal bit-width necessary to attain an acceptable level of accuracy.
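The non-associativity issue mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`naive_sum`, `neumaier_sum`, `to_fixed`, `from_fixed`, `fixed_sum`), the choice of Neumaier's variant as the compensated summation, and the test values are all assumptions made for illustration. The sketch compares naive left-to-right summation, a compensated summation, and Python's exactly rounded `math.fsum`, then shows that integer accumulation of fixed-point values does not depend on summation order.

```python
import math
from itertools import permutations


def naive_sum(values):
    """Left-to-right floating-point summation; the result depends on order."""
    total = 0.0
    for v in values:
        total += v
    return total


def neumaier_sum(values):
    """Compensated summation (Neumaier's variant, chosen here for illustration).

    A second accumulator collects the rounding error of each addition and is
    added back once at the end.
    """
    total = 0.0
    comp = 0.0
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            comp += (total - t) + v   # low-order bits of v were lost
        else:
            comp += (v - t) + total   # low-order bits of total were lost
        total = t
    return total + comp


def to_fixed(x, frac_bits):
    """Quantize x to a fixed-point integer with frac_bits fractional bits.

    Hypothetical helper: uses Python's round(), which rounds ties to even.
    """
    return round(x * (1 << frac_bits))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_sum(values, frac_bits):
    """Sum in integer arithmetic: exact and independent of order
    (overflow is ignored here because Python integers are unbounded)."""
    return sum(to_fixed(v, frac_bits) for v in values)


if __name__ == "__main__":
    data = [1e16, 1.0, -1e16]
    for order in permutations(data):
        print(order, naive_sum(order), neumaier_sum(order), math.fsum(order))
```

In this sketch the naive sum changes with the order of the operands, while the compensated and exact sums do not for this input. A hardware fixed-point accumulator would additionally need a width large enough to avoid overflow, which the unbounded Python integers hide.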
