Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles

Adi Gabay; Gabriel Stanovsky; Liat Peterfreund

Abstract

Epistemic reasoning requires agents to infer the state of the world from partial observations and information about other agents' knowledge. Prior work evaluating LLMs on canonical epistemic puzzles interpreted their behavior through a dichotomy between epistemic reasoning and brittle memorization. We argue that this framing is incomplete: in recent models, memorization is better understood as a special case of reduction, where a new instance is mapped onto a known problem. We therefore introduce a reduction ladder, a sequence of modifications that progressively move instances away from a canonical epistemic puzzle, making reduction increasingly difficult while preserving the underlying logic. We find that while some large models succeed via reduction, other models fail early, and all models struggle once epistemic reasoning is required.

Paper Structure (27 sections, 8 figures, 2 tables)

Introduction
Background: Dynamic Epistemic Logic and the Muddy Children Puzzle
Distinguishing Between Reductive and Epistemic Reasoning
Task Definition
Model Input.
Model Output.
The Reduction Ladder
Rung I: Seed problem.
Rung II: Different settings which conflict with world knowledge.
Rung III: Non-symmetric agents.
Experiments
Experimental setup
Models.
Results
Models seem to rely on reductive reasoning when solving epistemic problems.
...and 12 more sections

Figures (8)

Figure 1: The reduction ladder. A progression of puzzle modifications designed to separate epistemic reasoning from reductive reasoning, by gradually obscuring the structure of the classic Muddy Children problem and making reduction-based solutions harder to apply.
Figure 2: An example of the Muddy Children puzzle with two children, one of whom is muddy, where each child observes the other child but not their own forehead. Thought bubbles depict the possibilities the agent considers.
Figure 3: Reduced prompt example in the Muddy Children setup. Each prompt contains a description of the protocol, a short example illustrating the required response format, and a test instance specifying the public interaction history and the target agent's observations.
Figure 4: Accuracy (%) across the epistemic complexity rungs. The results illustrate an overall degradation in accuracy as the tasks move away from the Muddy Children problem and hence require harder reduction.
Figure 5: Accuracy (%) across the epistemic complexity rungs. The results illustrate an overall degradation in accuracy as the tasks move away from the Muddy Children problem and hence require harder reduction.
...and 3 more figures
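
The possibility-based reasoning pictured in Figure 2 can be illustrated with a small possible-worlds simulation. This is a minimal sketch of the textbook Muddy Children puzzle, not the paper's code or evaluation protocol. It assumes the usual setup: a public announcement that at least one child is muddy, followed by rounds in which all children simultaneously state whether they know their own state. The function name `muddy_children` and the boolean-tuple encoding are hypothetical choices made for illustration.

```python
from itertools import product


def muddy_children(actual):
    """Simulate rounds of public "do you know your state?" answers.

    actual: tuple of booleans, True meaning that child is muddy.
    Returns a list with one tuple per round; entry i says whether
    child i knows their own state in that round.
    Assumption: it is publicly announced that at least one child is muddy.
    """
    if not any(actual):
        raise ValueError("the announcement requires at least one muddy child")
    n = len(actual)
    # Worlds compatible with the public announcement.
    worlds = {w for w in product([False, True], repeat=n) if any(w)}
    history = []
    while True:
        def knows(i, w, worlds=worlds):
            # Child i sees everyone else, so the worlds they consider
            # possible agree with w on all positions except i.
            own = {v[i] for v in worlds
                   if all(v[j] == w[j] for j in range(n) if j != i)}
            return len(own) == 1

        answers = tuple(knows(i, actual) for i in range(n))
        history.append(answers)
        if all(answers):
            return history
        # Public update: discard worlds in which the children would
        # have answered differently from what was just observed.
        worlds = {w for w in worlds
                  if tuple(knows(i, w) for i in range(n)) == answers}
```

For the two-child case in Figure 2 with one muddy child, `muddy_children((True, False))` has the muddy child answering "yes" in the first round and the clean child only in the second. With k muddy children, the muddy ones first answer "yes" in round k, which is the standard result for this puzzle.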
