Mirage of Mastery: Memorization Tricks LLMs into Artificially Inflated Self-Knowledge

Sahil Kale

TL;DR

This work reveals that LLMs often confuse memorized solutions with genuine reasoning, inflating their perceived self-knowledge. It introduces a universal task-perturbation framework and two metrics, MIRAGE and SKEW, to quantify memorization-driven overconfidence and self-knowledge wavering across STEM domains. Experimental results show significant inconsistencies in feasibility judgments (over 45% in many cases) and pronounced effects in science and medicine, underscoring trust and safety concerns. The authors provide a public evaluation pipeline and advocate for safeguards to improve AI explainability and reliability in high-stakes domains.

Abstract

When artificial intelligence mistakes memorization for intelligence, it creates a dangerous mirage of reasoning. Existing studies treat memorization and self-knowledge deficits in LLMs as separate issues and overlook the link between them, which degrades the trustworthiness of LLM responses. In our study, we use a novel framework to determine whether LLMs genuinely learn reasoning patterns from training data or merely memorize them and then assume competence across problems of similar complexity, focusing on STEM domains. Our analysis reveals a notable generalization problem: LLMs draw confidence from memorized solutions to infer higher self-knowledge about their reasoning ability, which manifests as an inconsistency of over 45% in feasibility assessments when they face self-validated, logically coherent task perturbations. This effect is most pronounced in the science and medicine domains, which tend to have the most standardized jargon and problems, further supporting our approach. Significant wavering in the self-knowledge of LLMs also exposes flaws in current architectures and training patterns, highlighting the need for techniques that ensure a balanced, consistent stance in models' perceptions of their own knowledge, for maximum AI explainability and trustworthiness. Our code and results are publicly available at https://github.com/Sahil-R-Kale/mirage_of_mastery
