Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs

Siyu Lou, Yuntian Chen, Xiaodan Liang, Liang Lin, Quanshi Zhang

TL;DR

The paper tackles the challenge of disentangling memorization and in-context reasoning in large language models by introducing an axiomatic framework that decomposes the LLM confidence score $v(x_{n+1}|\mathbf{x})$ into AND/OR interactions. It formalizes context-agnostic memorization as foundational and chaotic components and in-context reasoning as enhanced, eliminated, or reversed components, all constrained by two axioms (condition dependence and variable independence) and supported by sparsity and universal-matching properties. The authors prove that the decomposed effects faithfully reconstruct the model output and provide a practical protocol to quantify these effects via logically equivalent prompts, enabling fine-grained analyses across multiple models (OPT-1.3B, LLaMA-7B, GPT-3.5-Turbo). Empirical results show substantial in-context reasoning effects with relatively weak chaotic memorization, and reveal that premises mainly suppress high-order memorization interactions while slightly modifying low-order ones, supporting semantic debugging and adversarial prompt design. Overall, the framework offers interpretable, faithful insights into LLM inference and practical avenues for debugging and robust prompt engineering.

Abstract

In this study, we propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM) for language generation. These effects are formulated as non-linear interactions between tokens/words encoded by the LLM. Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorization effects and chaotic memorization effects, and further classify in-context reasoning effects into enhanced inference patterns, eliminated inference patterns, and reversed inference patterns. Besides, the decomposed effects satisfy the sparsity property and the universal matching property, which mathematically guarantee that the LLM's confidence score can be faithfully decomposed into the memorization effects and in-context reasoning effects. Experiments show that the clear disentanglement of memorization effects and in-context reasoning effects enables a straightforward examination of detailed inference patterns encoded by LLMs.
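The universal matching property mentioned above can be illustrated on a toy example. The sketch below is not the paper's implementation: it covers only AND-type (Harsanyi) interactions, computed by the standard formula $I(S)=\sum_{T\subseteq S}(-1)^{|S|-|T|}\,v(\mathbf{x}_T)$, and it replaces the LLM confidence score with a hypothetical hand-written function `toy_score` over three input variables. The names `subsets`, `and_interactions`, and `reconstruct` are also illustrative assumptions. Under these assumptions, summing the interactions contained in any kept subset recovers the score on that masked input exactly, and only a few interactions are non-zero.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset, smallest first."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def and_interactions(v, n):
    """AND (Harsanyi) interaction: I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(range(n))
    }


def toy_score(kept):
    """Hypothetical stand-in for the confidence score on a masked input.

    `kept` is the set of unmasked variable indices. This is NOT an LLM;
    the score is hand-built from three known patterns plus a bias.
    """
    score = 0.5                      # bias when everything is masked
    if 0 in kept:
        score += 1.0                 # order-1 pattern on variable 0
    if 0 in kept and 2 in kept:
        score += 2.0                 # order-2 pattern on variables 0 and 2
    if {0, 1, 2} <= kept:
        score -= 0.75                # order-3 pattern on all variables
    return score


def reconstruct(interactions, kept):
    """Universal matching: add up the effects of all interactions inside `kept`."""
    return sum(effect for S, effect in interactions.items() if S <= kept)


interactions = and_interactions(toy_score, 3)

# Largest reconstruction error over all 2^3 masked inputs.
max_error = max(
    abs(reconstruct(interactions, T) - toy_score(T)) for T in subsets(range(3))
)

# Non-zero interactions only (sparsity of this toy score).
salient = {tuple(sorted(S)): e for S, e in interactions.items() if abs(e) > 1e-9}
print(max_error, salient)
```

In this toy setting, four of the eight possible interactions are non-zero, matching the patterns written into `toy_score`. For a real LLM, `toy_score` would be replaced by the confidence score evaluated on masked prompts, which requires $2^n$ model evaluations for $n$ input variables.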

Paper Structure (19 sections, 3 theorems, 19 equations, 11 figures, 3 tables)


Key Result

Theorem 1

Let us be given an input sample $\mathbf{x}=[x_1,\dots,x_n]$ with $n$ input variables, and a DNN with smooth [footnote: ren2024where have proposed three conditions to define a DNN with smooth output score on different masked samples. These conditions can be briefly summarized as follows. (1) the DNN does not en

Figures (11)

  • Figure 1: Illustration of interactions encoded by the LLM and the decomposition of their effects into memorization and in-context reasoning effects. The LLM's confidence score can be decomposed into a few interactions, such as $S_2 = \{\textit{Einstein's}, \textit{transmission}\}$. Each interaction has a numerical effect $I(S)$ on the LLM's confidence score. We can then decompose the interaction effect into the memorization effects (foundational memorization effect and chaotic memorization effect) and the in-context reasoning effect. The in-context reasoning effect can be further categorized as the enhanced inference pattern, the eliminated inference pattern, or the reversed inference pattern.
  • Figure 2: Illustration of the decomposition of an interaction into the foundational memorization interaction ($\mathcal{J}^{\text{f}}_{\text{and}}$, $\mathcal{J}^{\text{f}}_{\text{or}}$), the chaotic memorization interaction ($\mathcal{J}^{\text{c}}_{\text{and}}$, $\mathcal{J}^{\text{c}}_{\text{or}}$), and the in-context reasoning interaction ($\mathcal{K}_{\text{and}}$, $\mathcal{K}_{\text{or}}$).
  • Figure 3: Distribution of the strength of foundational memorization interactions ($\mathcal{J}^{\text{f}}_{\text{and}}(S|\mathbf{x}), \mathcal{J}^{\text{f}}_{\text{or}}(S|\mathbf{x})$), chaotic memorization interactions ($\mathcal{J}^{\text{c}}_{\text{and}}(S|\mathbf{x}), \mathcal{J}^{\text{c}}_{\text{or}}(S|\mathbf{x})$) and in-context reasoning interactions ($\mathcal{K}_{\text{and}}(S|\mathbf{x}), \mathcal{K}_{\text{or}}(S|\mathbf{x})$) by order $m$. The results are averaged over all samples. Please see experimental settings in Section \ref{sec:fine-grained} for details, and Appendix \ref{appendix:result} for results on the individual sample.
  • Figure 4: (a) Distribution of the strength of three types of in-context reasoning effects over different orders $m$. The results are averaged over all samples. (b) Visualization of some in-context reasoning effects and memorization effects on a single sample.
  • Figure 5: Interactions reveal problematic inference patterns used by the LLM. We then construct adversarial prompts based on these interactions.
  • ...and 6 more figures

Theorems & Definitions (6)

  • Theorem 1: Sparsity property, proved by ren2024where
  • Theorem 2: Universal matching property, proved by zhou2023explaining
  • proof
  • proof
  • Lemma 1
  • proof