GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?
Dayoon Ko, Jinyoung Kim, Hahyeon Choi, Gunhee Kim
TL;DR
GrowOVER introduces two dynamic benchmarks, GrowOVER-QA and GrowOVER-Dialogue, that track how knowledge evolves using annotated evidence text. It also presents RiLM, a training-free retrieval-interactive framework in which an LLM evaluates its own answers and guides re-retrieval through a certainty classifier and adaptive retrieval. In extensive experiments, RiLM performs comparably to, and in some cases surpasses, continuously trained LLMs, which suggests that a model's own certainty signals can help it cope with knowledge changes without further training. The work emphasizes the practical importance of dynamic benchmarks and retrieval-augmented strategies for maintaining accuracy in rapidly evolving domains.
Abstract
In the real world, knowledge is constantly evolving, which can render existing knowledge-based datasets outdated. This unreliability highlights the critical need for continuous updates to ensure both accuracy and relevance in knowledge-intensive tasks. To address this, we propose GrowOVER-QA and GrowOVER-Dialogue, dynamic open-domain QA and dialogue benchmarks that undergo a continuous cycle of updates, keeping pace with the rapid evolution of knowledge. Our research indicates that retrieval-augmented language models (RaLMs) struggle with knowledge that they have not been trained on or that has been recently updated. Consequently, we introduce a novel retrieval-interactive language model framework, where the language model evaluates and reflects on its answers for further re-retrieval. Our extensive experiments demonstrate that our training-free framework significantly improves upon existing methods, performing comparably to or even surpassing continuously trained language models.
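The evaluate-and-re-retrieve loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `retrieval_interactive_answer`, its parameters (`threshold`, `max_rounds`, `k`), the rule of widening retrieval by `k` passages per round, and the toy retriever, generator, and certainty scorer are all hypothetical stand-ins chosen only to show the control flow of a training-free loop.

```python
from typing import Callable, List, Tuple


def retrieval_interactive_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str, List[str]], str],
    certainty: Callable[[str, str, List[str]], float],
    threshold: float = 0.5,
    max_rounds: int = 3,
    k: int = 2,
) -> Tuple[str, int]:
    """Answer, score the answer, and re-retrieve until the score clears the threshold.

    Returns the best answer seen and the number of rounds used.
    All components are passed in, so no model is trained here.
    """
    best_answer, best_score = "", -1.0
    for round_idx in range(1, max_rounds + 1):
        # Assumption: each round simply widens the search by k more passages.
        passages = retrieve(question, k * round_idx)
        answer = generate(question, passages)
        score = certainty(question, answer, passages)
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= threshold:
            return best_answer, round_idx
    return best_answer, max_rounds


# Toy stand-ins (hypothetical) so the loop can be run end to end.
CORPUS = [
    "The festival was first held in 1998.",
    "Tickets are sold online.",
    "The most recent edition took place in 2024.",
    "The venue seats 500 people.",
]


def toy_retrieve(question: str, top_k: int) -> List[str]:
    # Pretend the corpus is already ranked; return the first top_k passages.
    return CORPUS[:top_k]


def toy_generate(question: str, passages: List[str]) -> str:
    # Stand-in for an LLM: read the year off the relevant passage if present.
    for passage in passages:
        if "most recent" in passage:
            return passage.rstrip(".").split()[-1]
    return "unknown"


def toy_certainty(question: str, answer: str, passages: List[str]) -> float:
    # Stand-in for a certainty classifier: high score if the answer is grounded.
    return 0.9 if any(answer in p for p in passages) else 0.1


result = retrieval_interactive_answer(
    "When was the most recent edition held?",
    toy_retrieve,
    toy_generate,
    toy_certainty,
)
print(result)
```

In this toy run the first round retrieves two passages that do not contain the answer, so the certainty score stays below the threshold and a second, wider retrieval is triggered. A real system would replace the three stand-ins with an actual retriever, an LLM, and a learned or prompted certainty estimator.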
