MAPLE: Multi-Agent Adaptive Planning with Long-Term Memory for Table Reasoning

Ye Bai, Minghan Wang, Thuy-Trang Vu

TL;DR

MAPLE addresses table-based QA with a multi-agent framework that combines adaptive planning and long-term memory. It separates reasoning (Solver), verification (Checker), diagnosis (Reflector), and memory management (Archiver), creating a feedback-driven cycle that iteratively refines solutions within a task and evolves stored experiences across tasks. Empirical results on WikiTQ and TabFact show state-of-the-art performance, supported by ablation studies, while memory analyses reveal core error categories and principled thresholds for memory retrieval and evolution. The approach highlights the practical impact of coupling verification and memory evolution with adaptive planning for robust, knowledge-intensive reasoning tasks.

Abstract

Table-based question answering requires complex reasoning capabilities that current LLMs struggle to achieve with single-pass inference. Existing approaches, such as Chain-of-Thought reasoning and question decomposition, lack error detection mechanisms and discard problem-solving experiences, in sharp contrast to how humans tackle such problems. In this paper, we propose MAPLE (Multi-agent Adaptive Planning with Long-term mEmory), a novel framework that mimics human problem-solving through specialized cognitive agents working in a feedback-driven loop. MAPLE integrates four key components: (1) a Solver using the ReAct paradigm for reasoning, (2) a Checker for answer verification, (3) a Reflector for error diagnosis and strategy correction, and (4) an Archiver managing long-term memory for experience reuse and evolution. Experiments on WikiTQ and TabFact demonstrate significant improvements over existing methods, achieving state-of-the-art performance across multiple LLM backbones.
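
The feedback loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `run_feedback_loop`, the `Attempt` record, the agent signatures, the word-overlap retrieval, and the `max_rounds` cap are all hypothetical choices made for illustration, and the toy agents stand in for what would be LLM calls. The sketch only shows the control flow: retrieve past lessons, solve, check, reflect on failure, retry, then archive what was learned.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    answer: str
    feedback: str = ""  # Checker's comment on the answer
    advice: str = ""    # Reflector's suggested correction (empty if none)


@dataclass
class Archiver:
    """Toy long-term memory: (question, lesson) pairs. Assumed design."""
    notes: List[Tuple[str, str]] = field(default_factory=list)

    def retrieve(self, question: str, k: int = 2) -> List[str]:
        # Hypothetical similarity: count of shared lowercase words.
        q = set(question.lower().split())
        scored = [(len(q & set(past.lower().split())), lesson)
                  for past, lesson in self.notes]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [lesson for _, lesson in scored[:k]]

    def store(self, question: str, lesson: str) -> None:
        self.notes.append((question, lesson))


def run_feedback_loop(
    table: str,
    question: str,
    solver: Callable[[str, str, List[str]], str],
    checker: Callable[[str, str, str], Tuple[bool, str]],
    reflector: Callable[[str, str, str], str],
    archiver: Archiver,
    max_rounds: int = 3,  # assumed cap, not taken from the paper
) -> Tuple[str, List[Attempt]]:
    hints = archiver.retrieve(question)        # reuse past experience
    attempts: List[Attempt] = []
    for _ in range(max_rounds):
        answer = solver(table, question, hints)
        ok, feedback = checker(table, question, answer)
        attempt = Attempt(answer, feedback)
        attempts.append(attempt)
        if ok:
            break
        # Diagnose the failure and feed the advice into the next round.
        attempt.advice = reflector(question, answer, feedback)
        hints = hints + [attempt.advice]
    for a in attempts:                         # archive lessons learned
        if a.advice:
            archiver.store(question, a.advice)
    return attempts[-1].answer, attempts


# --- Toy agents (stand-ins for LLM calls; purely illustrative) ---
TABLE = "city,pop\nA,5\nB,9"


def _rows(table: str) -> List[List[str]]:
    return [line.split(",") for line in table.splitlines()[1:]]


def toy_solver(table: str, question: str, hints: List[str]) -> str:
    rows = _rows(table)
    if any("max" in h for h in hints):
        return max(rows, key=lambda r: int(r[1]))[0]
    return rows[0][0]  # naive first guess: the first row


def toy_checker(table: str, question: str, answer: str) -> Tuple[bool, str]:
    best = max(_rows(table), key=lambda r: int(r[1]))[0]
    ok = answer == best
    return ok, "" if ok else "answer is not the row with the largest pop"


def toy_reflector(question: str, answer: str, feedback: str) -> str:
    return "take the max over the pop column instead of the first row"


memory = Archiver()
QUESTION = "which city has the largest pop"
first = run_feedback_loop(TABLE, QUESTION, toy_solver, toy_checker,
                          toy_reflector, memory)
second = run_feedback_loop(TABLE, QUESTION, toy_solver, toy_checker,
                           toy_reflector, memory)
```

In this toy setup the first call fails once, is corrected by the reflector's advice, and archives that advice; the second call retrieves the archived lesson and succeeds on its first round. That is the within-task and across-task adaptation the abstract describes, reduced to its simplest form.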

Paper Structure

This paper contains 80 sections, 5 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: The MAPLE framework pipeline. Four agents work collaboratively in a feedback loop: the Solver conducts iterative reasoning using ReAct, the Checker evaluates answer quality, the Reflector diagnoses errors and suggests improvements, and the Archiver manages an evolving long-term memory. This architecture enables dynamic adaptation both within tasks and across similar problems, mirroring human cognitive problem-solving processes.
  • Figure 2: Distribution of error types identified through MAPLE's memory system on WikiTQ.
  • Figure 3: Overview of the memory structures and information flows in MAPLE. The green arrows ($\rightarrow$) represent reasoning processes, where agents read and update working memory during multi-step problem solving. The orange arrows ($\leftarrow$) represent retrieval operations from long-term memory to support current reasoning. The red arrows ($\rightarrow$) denote learning operations, where new knowledge is written back into the long-term memory.
  • Figure 4: Illustrative case study of MAPLE's multi-agent reasoning workflow.
  • Figure 5: Accuracy comparison across table size categories on WikiTQ. Performance is shown for MAPLE (blue), Chain-of-Table (orange), and Chain-of-Thought baseline (green), with both total attempt counts (darker shade) and correct answers (lighter stripe pattern) displayed for each method.
  • ...and 6 more figures