Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
El-Mehdi El Arar, Silviu-Ioan Filip, Theo Mary, Elisa Riccietti
TL;DR
This work aims to reduce the cost of neural network inference. It introduces a componentwise forward error analysis that relates the error in each output component to the condition numbers of the layer and of the activation function. From this analysis it derives a practical mixed precision accumulation strategy: compute layer outputs in low precision, estimate per-component condition numbers, and recompute only the most sensitive components in higher precision, balancing accuracy against cost. The proposed Algorithm 1, guided by a tunable tolerance, shows favorable cost–accuracy tradeoffs on multilayer perceptrons with ReLU and tanh activations: it achieves significant gains over uniform low precision accumulation and is competitive with higher precision baselines. The study also identifies practical limitations, namely overflow, sensitivity to the parameter τ, and the overhead of dynamic recomputation, and outlines future work on static precision configurations and extensions to convolutional networks and transformers.
Abstract
This work proposes a mathematically grounded mixed precision accumulation strategy for the inference of neural networks. Our strategy is based on a new componentwise forward error analysis that explains how errors propagate in the forward pass of neural networks. Specifically, our analysis shows that the error in each component of the output of a linear layer is proportional to the condition number of the inner product between the weights and the input, multiplied by the condition number of the activation function. These condition numbers can vary widely from one component to another, which creates a significant opportunity to introduce mixed precision: each component should be accumulated in a precision inversely proportional to the product of these condition numbers. We propose a numerical algorithm that exploits this observation: it first computes all components in low precision, uses this output to estimate the condition numbers, and recomputes in higher precision only the components associated with large condition numbers. We test our algorithm on various networks and datasets and confirm experimentally that it can significantly improve the cost–accuracy tradeoff compared with uniform precision accumulation baselines.
