A Fast and Effective Solution to the Problem of Look-ahead Bias in LLMs
Humzah Merchant, Bradford Levy
TL;DR
This work tackles look-ahead bias in finance by proposing inference-time unlearning, a method that guides an LLM’s outputs without retraining by using two small specialized models: one for information to forget and one for information to retain. With Divergence Decoding, the base model’s logits are adjusted through either a linear or a rank-based modification, removing targeted knowledge while preserving general performance. The approach, grounded in Product of Experts and importance sampling, demonstrates strong unlearning performance on the MUSE benchmark and on finance tasks (M&A unlearning and debiasing future performance). It also yields substantial efficiency gains, and remains viable even with trigram models. The method enables reliable evaluation of chronologically sensitive predictions in finance, and potentially in other domains where training on future data is undesirable.
Abstract
Applying LLMs to predictive tasks in finance is challenging due to look-ahead bias resulting from their training on long time-series data. This bias precludes the backtests typically employed in finance, since retraining frontier models from scratch with a specific knowledge cutoff is prohibitively expensive. In this paper, we introduce a fast, effective, and low-cost alternative. Our method guides generation at inference time by adjusting the logits of a large base model using a pair of smaller, specialized models -- one fine-tuned on information to be forgotten and another on information to be retained. We demonstrate that our method effectively removes both verbatim and semantic knowledge, corrects biases, and outperforms prior methods.
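To make the logit adjustment concrete, here is a minimal sketch of one plausible linear variant. The exact formula is not given in this summary, so the form `base + alpha * (retain - forget)`, the function name `adjust_logits`, the strength parameter `alpha`, and all toy numbers are assumptions for illustration, not the authors' implementation. The form is chosen because it corresponds to a Product of Experts in probability space, where the base distribution is reweighted by the ratio of the retain model to the forget model.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_logits(base, forget, retain, alpha=1.0):
    """Assumed linear adjustment: base + alpha * (retain - forget).

    Tokens that the forget model favors more than the retain model
    are pushed down; alpha (hypothetical) controls the strength.
    """
    return [b + alpha * (r - f) for b, f, r in zip(base, forget, retain)]


# Toy 3-token vocabulary; every number below is made up for illustration.
base_logits = [2.0, 1.0, 0.5]    # large base model
forget_logits = [3.0, 0.0, 0.0]  # small model tuned on data to forget
retain_logits = [0.0, 0.0, 0.0]  # small model tuned on data to retain

adjusted = adjust_logits(base_logits, forget_logits, retain_logits, alpha=1.0)
probs_before = softmax(base_logits)
probs_after = softmax(adjusted)
```

In this toy case the forget model strongly prefers token 0, so the adjustment lowers that token's logit and its probability falls after renormalization. Setting `alpha=0` leaves the base model's logits unchanged.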
