Do Large Language Models Know How Much They Know?
Gabriele Prato, Jerry Huang, Prasanna Parthasarathi, Shagun Sodhani, Sarath Chandar
TL;DR
The paper investigates whether large language models are aware of the extent of their own knowledge by introducing a synthetic diary-entry recall benchmark. It trains models on generated documents and evaluates exact recall of all entries for a given individual, examining how awareness of knowledge scope emerges across architectures (e.g., OPT, Pythia, Flan-T5) and as a function of model size and data. Key findings show that, with sufficient scale and proper training, models can accurately quantify what they know about specific topics, though emergence varies with architecture and training setup. This work advances understanding of LLM introspection and has implications for reliability, controllability, and trust in AI systems, while highlighting the need for further cross-model validation and ethical scrutiny. The approach pairs synthetic, controllable data with exact-match recall, offering a framework for probing internal knowledge structures in LLMs beyond simple question answering.
Abstract
Large Language Models (LLMs) have emerged as highly capable systems and are increasingly being integrated into a wide range of applications. However, the rapid pace of their deployment has outpaced a comprehensive understanding of their internal mechanisms and a delineation of their capabilities and limitations. A desirable attribute of an intelligent system is its ability to recognize the scope of its own knowledge. To investigate whether LLMs embody this characteristic, we develop a benchmark designed to challenge these models to enumerate all information they possess on specific topics. This benchmark evaluates whether the models recall excessive, insufficient, or the precise amount of information, thereby indicating their awareness of their own knowledge. Our findings reveal that all tested LLMs, given sufficient scale, demonstrate an understanding of how much they know about specific topics. While different architectures exhibit varying rates of this capability's emergence, the results suggest that awareness of knowledge may be a generalizable attribute of LLMs. Further research is needed to confirm this potential and fully elucidate the underlying mechanisms.
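The three outcomes named in the abstract (excessive, insufficient, or precise recall) can be illustrated with a minimal sketch. The function names `classify_recall` and `is_exact_match`, the representation of entries as strings, and the order-sensitive comparison are assumptions made for illustration only; the paper's actual scoring procedure may differ.

```python
from typing import Sequence


def classify_recall(expected: Sequence[str], recalled: Sequence[str]) -> str:
    """Label a model's output by how many entries it recalled.

    Hypothetical helper: compares only the number of entries, not their content.
    """
    if len(recalled) > len(expected):
        return "excessive"
    if len(recalled) < len(expected):
        return "insufficient"
    return "precise"


def is_exact_match(expected: Sequence[str], recalled: Sequence[str]) -> bool:
    """True only if every entry is reproduced verbatim.

    Assumption: entries must also appear in the same order as the reference.
    """
    return list(expected) == list(recalled)


# Made-up diary entries for one individual, purely illustrative.
reference = ["Went hiking.", "Baked bread.", "Read a novel."]
model_output = ["Went hiking.", "Baked bread."]

print(classify_recall(reference, model_output))  # insufficient
print(is_exact_match(reference, model_output))   # False
```

Separating the count-based label from the exact-match check keeps the two questions apart: whether the model recalled the right amount, and whether what it recalled is correct.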
