Surprisal-Driven Feedback in Recurrent Networks
Kamil M Rocki
TL;DR
The paper aims to improve temporal prediction with surprisal-driven feedback, in which the misprediction signal from the previous step informs the next prediction. It formalizes this feedback within recurrent architectures (including LSTM variants) by injecting a surprisal-derived input into the hidden-state updates, and it derives the corresponding forward and backward passes. Empirically, the approach achieves 1.37 bits per character (BPC) on enwik8, surpassing several stochastic and deterministic baselines. The result suggests that top-down, misprediction-based signals have practical value for improving generalization in sequence modeling.
Abstract
Recurrent neural nets are widely used for predicting temporal data. Their inherent deep feedforward structure allows them to learn complex sequential patterns. It is believed that top-down feedback might be an important missing ingredient, which in theory could help disambiguate similar patterns depending on broader context. In this paper we introduce surprisal-driven recurrent networks, which take past error information into account when making new predictions. This is achieved by continuously monitoring the discrepancy between the most recent predictions and the actual observations. Furthermore, we show that this approach outperforms other stochastic and fully deterministic approaches on the enwik8 character-level prediction task, achieving 1.37 BPC on the test portion of the text.
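
The feedback idea described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a plain tanh RNN rather than an LSTM, a scalar surprisal measured in bits, and random untrained weights, and every name in it (`run_feedback_rnn`, `w_s`, `surprisal_bits`, `bits_per_character`) is hypothetical. It shows only the data flow: the surprisal of the symbol just observed, under the prediction made one step earlier, is fed into the next hidden update, and the mean surprisal over the sequence is the bits-per-character figure.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def surprisal_bits(probs, observed):
    """Surprisal of the observed symbol under a predicted distribution, in bits."""
    return -math.log2(probs[observed])


def run_feedback_rnn(seq, vocab, hidden=8, seed=0):
    """Forward pass of a toy tanh RNN with a surprisal feedback input.

    Assumption: weights are random and untrained, so the returned
    per-step surprisals only demonstrate the mechanics, not performance.
    """
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    W_x = mat(hidden, vocab)    # input-to-hidden weights
    W_h = mat(hidden, hidden)   # hidden-to-hidden weights
    W_y = mat(vocab, hidden)    # hidden-to-output weights
    # Hypothetical feedback weights: surprisal-to-hidden.
    w_s = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]

    h = [0.0] * hidden
    s_prev = 0.0  # no error information exists before the first prediction
    surprisals = []
    for t in range(len(seq) - 1):
        x = seq[t]
        # Hidden update sees the current symbol (one-hot, so a column lookup),
        # the previous hidden state, and the surprisal of the last prediction.
        h = [
            math.tanh(
                W_x[i][x]
                + sum(W_h[i][j] * h[j] for j in range(hidden))
                + w_s[i] * s_prev
            )
            for i in range(hidden)
        ]
        logits = [sum(W_y[k][i] * h[i] for i in range(hidden)) for k in range(vocab)]
        probs = softmax(logits)
        # Once the next symbol is observed, measure how surprising it was.
        s_prev = surprisal_bits(probs, seq[t + 1])
        surprisals.append(s_prev)
    return surprisals


def bits_per_character(surprisals):
    """Average surprisal in bits, the BPC metric used for character-level tasks."""
    return sum(surprisals) / len(surprisals)


if __name__ == "__main__":
    toy_sequence = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
    steps = run_feedback_rnn(toy_sequence, vocab=4)
    print(f"toy BPC (untrained): {bits_per_character(steps):.3f}")
```

Because the weights are untrained and small, the toy model predicts a nearly uniform distribution, so its BPC stays close to the 2 bits expected for a four-symbol alphabet. Training would require the backward pass, which this sketch omits.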
