PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning

Hieu Tran; Zonghai Yao; Nguyen Luong Tran; Zhichao Yang; Feiyun Ouyang; Shuo Han; Razieh Rahimi; Hong Yu

TL;DR

PRIME tackles efficient, knowledge-intensive reasoning in large language models by mirroring human dual-process cognition. It combines a fast System 1 Quick Thinking Agent with selective System 2 deliberation triggered by a Reflection Agent, which coordinates Planning, Retrieval, and Hypothesis Testing to ground answers in external evidence. Across medical and multi-hop benchmarks, PRIME improves accuracy and reduces hallucinations, with open-source LLaMA models matching or surpassing GPT-4o on several tasks. Ablation and difficulty-aware analyses confirm the value of selective System 2 engagement and modular agent collaboration, though gating reliability and latency remain important considerations for deployment.

Abstract

Inspired by the dual-process theory of human cognition from Thinking, Fast and Slow, we introduce PRIME (Planning and Retrieval-Integrated Memory for Enhanced Reasoning), a multi-agent reasoning framework that dynamically integrates System 1 (fast, intuitive thinking) and System 2 (slow, deliberate thinking). PRIME first employs a Quick Thinking Agent (System 1) to generate a rapid answer; if uncertainty is detected, it then triggers a structured System 2 reasoning pipeline composed of specialized agents for planning, hypothesis generation, retrieval, information integration, and decision-making. This multi-agent design faithfully mimics human cognitive processes and enhances both efficiency and accuracy. Experimental results with LLaMA 3 models demonstrate that PRIME enables open-source LLMs to perform competitively with state-of-the-art closed-source models like GPT-4 and GPT-4o on benchmarks requiring multi-hop and knowledge-grounded reasoning. This research establishes PRIME as a scalable solution for improving LLMs in domains requiring complex, knowledge-intensive reasoning.
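
The control flow described in the abstract (a fast first answer, an uncertainty check, and a deliberate pipeline that runs only when needed) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `prime_answer`, the `Trace` helper, every parameter name, and the toy stand-in agents in `demo` are hypothetical, and real agents would be LLM and retriever calls rather than lambdas. The sketch only shows how selective System 2 engagement might be wired, assuming each agent can be treated as a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trace:
    """Records which stages ran, so the control flow is easy to inspect."""
    stages: List[str] = field(default_factory=list)


def prime_answer(
    question: str,
    quick_think: Callable[[str], str],            # System 1 agent (hypothetical signature)
    is_uncertain: Callable[[str, str], bool],     # uncertainty gate (hypothetical signature)
    plan: Callable[[str], List[str]],             # planning: question -> sub-questions
    hypothesize: Callable[[str], str],            # hypothesis generation per sub-question
    retrieve: Callable[[str], List[str]],         # retrieval: hypothesis -> evidence passages
    integrate: Callable[[str, List[str]], str],   # information integration
    decide: Callable[[str, List[str]], str],      # decision-making over all findings
    trace: Trace,
) -> str:
    # System 1: produce a rapid draft answer.
    draft = quick_think(question)
    trace.stages.append("quick_think")

    # Gate: keep the draft unless uncertainty is detected.
    trace.stages.append("reflect")
    if not is_uncertain(question, draft):
        return draft

    # System 2: plan, then hypothesize, retrieve, and integrate per sub-question.
    findings: List[str] = []
    for sub_question in plan(question):
        hypothesis = hypothesize(sub_question)
        evidence = retrieve(hypothesis)
        findings.append(integrate(hypothesis, evidence))
    trace.stages.append("system2")
    return decide(question, findings)


def demo(question: str, uncertain: bool):
    """Runs the pipeline with toy stand-in agents (all illustrative)."""
    trace = Trace()
    answer = prime_answer(
        question,
        quick_think=lambda q: "draft answer",
        is_uncertain=lambda q, a: uncertain,
        plan=lambda q: [q + " (step 1)", q + " (step 2)"],
        hypothesize=lambda sub: "hypothesis for " + sub,
        retrieve=lambda hyp: ["evidence for " + hyp],
        integrate=lambda hyp, ev: hyp + " supported by " + str(len(ev)) + " passage(s)",
        decide=lambda q, findings: "grounded answer from " + str(len(findings)) + " findings",
        trace=trace,
    )
    return answer, trace.stages
```

Calling `demo(question, uncertain=False)` returns the System 1 draft without invoking any System 2 agent, while `uncertain=True` routes the question through the full pipeline. Passing the agents as callables keeps the gate and the pipeline independent, which loosely mirrors the modular agent collaboration the abstract describes.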
