Never Reset Again: A Mathematical Framework for Continual Inference in Recurrent Neural Networks
Bojian Yin, Federico Corradi
TL;DR
The paper addresses state saturation in continual inference for recurrent networks and the drawbacks of resetting the hidden state. It introduces a reset-free training objective that combines a cross-entropy term $L_{CE}$ and a Kullback-Leibler divergence term $L_{KL}$ into a single loss $L_{total}$, using a binary mask $m_t$ to distinguish informative time steps from noisy ones. The approach preserves hidden-state continuity and gradient flow without explicit resets and is validated across vanilla RNNs, GRUs, SSMs, and SNNs on sequential tasks including Sequential FashionMNIST and Google Speech Commands. Results show that the reset-free loss achieves accuracy comparable or superior to reset-based methods and supports robust continual inference suitable for streaming and edge applications.
Abstract
Recurrent Neural Networks (RNNs) are widely used for sequential processing but face fundamental limitations in continual inference due to state saturation, which requires disruptive hidden-state resets. Reset-based methods, however, impose synchronization requirements with input boundaries and increase computational cost at inference. To address this, we propose an adaptive loss function that eliminates the need for resets during inference while preserving high accuracy over extended sequences. By combining cross-entropy and Kullback-Leibler divergence, the loss dynamically modulates the gradient based on input informativeness, allowing the network to differentiate meaningful data from noise and maintain stable representations over time. Experimental results demonstrate that our reset-free approach outperforms traditional reset-based methods when applied to a variety of RNNs, particularly in continual tasks, enhancing both the theoretical and practical capabilities of RNNs for streaming applications.
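The masked combination of cross-entropy and Kullback-Leibler divergence can be illustrated with a minimal sketch. The text above does not give the exact form of the loss, so the following are assumptions made only for illustration: the per-step loss is taken as $m_t L_{CE} + (1 - m_t) L_{KL}$, the KL term measures the divergence of the predicted distribution from a uniform distribution over classes, and the per-step losses are averaged over the sequence. The function names (`softmax`, `kl_to_uniform`, `reset_free_loss`) are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability of the true class."""
    return -math.log(probs[label])


def kl_to_uniform(probs):
    """KL(p || u) for uniform u over K classes: sum_i p_i * log(p_i * K).

    Assumption: the uniform target and the direction of the divergence
    are illustrative choices, not taken from the source.
    """
    k = len(probs)
    return sum(p * math.log(p * k) for p in probs if p > 0.0)


def reset_free_loss(logits_seq, labels, mask):
    """Average over time of m_t * L_CE + (1 - m_t) * L_KL.

    logits_seq: one list of class logits per time step
    labels:     class index per step (ignored, may be None, where m_t == 0)
    mask:       m_t == 1 for informative steps, m_t == 0 for noisy steps
    """
    total = 0.0
    for logits, label, m_t in zip(logits_seq, labels, mask):
        probs = softmax(logits)
        ce = cross_entropy(probs, label) if m_t == 1 else 0.0
        kl = kl_to_uniform(probs) if m_t == 0 else 0.0
        total += m_t * ce + (1 - m_t) * kl
    return total / len(mask)


# Two informative steps followed by two noisy steps, three classes.
logits_seq = [[2.0, 0.1, -1.0], [3.0, 0.0, 0.0], [0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
labels = [0, 0, None, None]
mask = [1, 1, 0, 0]
print(reset_free_loss(logits_seq, labels, mask))
```

Under these assumptions, a noisy step with a flat prediction contributes zero loss, while a confident prediction on a noisy step is penalized. In this sketch, that penalty is what would discourage the network from carrying confident outputs through uninformative input without a reset.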
