From Reasoning to Learning: A Survey on Hypothesis Discovery and Rule Learning with Large Language Models
Kaiyu He, Zhiyu Chen
TL;DR
This survey examines how Large Language Models (LLMs) can autonomously discover and refine knowledge in the form of hypotheses and rules, guided by Peirce's framework of abduction, deduction, and induction. It proposes a structured taxonomy (generation, application, validation, and integrated discovery) and reviews methods across natural-language and formal-language representations, highlighting prompting, retrieval-augmented generation, and human-in-the-loop approaches. It critically assesses evaluation strategies and benchmarks and identifies gaps, emphasizing the mismatch between open-ended natural-language reasoning and rigorous formal-language evaluation, and it advocates integrated, dynamic environments that approximate real-world scientific discovery. The findings underscore the potential of LLMs as engines of genuine innovation and outline concrete directions for benchmarks, evaluation, and end-to-end hypothesis-discovery systems with proactive evidence gathering and robust validation. The overall aim is to move LLM-driven scientific inquiry beyond instruction-following toward autonomous hypothesis generation, testing, and refinement with real-world impact.
Abstract
Since the advent of Large Language Models (LLMs), efforts have largely focused on improving their instruction-following and deductive reasoning abilities, leaving open the question of whether these models can truly discover new knowledge. In pursuit of artificial general intelligence (AGI), there is a growing need for models that not only execute commands or retrieve information but also learn, reason, and generate new knowledge by formulating novel hypotheses and theories that deepen our understanding of the world. Guided by Peirce's framework of abduction, deduction, and induction, this survey offers a structured lens to examine LLM-based hypothesis discovery. We synthesize existing work in hypothesis generation, application, and validation, identifying both key achievements and critical gaps. By unifying these threads, we illuminate how LLMs might evolve from mere "information executors" into engines of genuine innovation, potentially transforming research, science, and real-world problem solving.
