Hypothesis Generation via LLM-Automated Language Bias for ILP

Yang Yang, Jiemin Wu, Yutao Yue

TL;DR

This work tackles the dependence of inductive logic programming (ILP) on expert-crafted language bias by introducing a multi-agent LLM framework that automatically constructs a predicate system from raw text. The predicate system then guides symbolic knowledge encoding into Prolog facts, after which an ILP solver using the Minimum Description Length (MDL) objective via MAXSYNTH induces globally coherent Horn-rule sets; the MDL cost is given by $\text{cost} = \text{program size} + \#\text{FP} + \#\text{FN}$. Evaluations on SHOES and ZENDO across multiple LLM backends demonstrate superior accuracy and robustness relative to HypoGeniC and Iterative Hypothesis Refinement, particularly in relationally complex tasks, while maintaining model-agnostic stability. The results support the approach as a practical, explainable path to open-domain hypothesis generation that reduces manual bias engineering, with limitations noted for extension to richer real-world text and semantics.
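The MDL cost above can be made concrete with a small sketch. It is illustrative only and is not the MAXSYNTH implementation: the function name `mdl_cost`, the toy labels, and the choice to pass program size in as a plain integer (for example, a literal count) are all assumptions.

```python
def mdl_cost(program_size, predictions, labels):
    """Return program size + #FP + #FN for boolean predictions vs. labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    # False positive: predicted True on a negative example.
    false_pos = sum(1 for p, y in zip(predictions, labels) if p and not y)
    # False negative: predicted False on a positive example.
    false_neg = sum(1 for p, y in zip(predictions, labels) if not p and y)
    return program_size + false_pos + false_neg


# Hypothetical data: a short rule with two errors vs. a longer rule with none.
labels = [True, True, False, False, True]
short_rule_cost = mdl_cost(3, [True, True, True, False, False], labels)
long_rule_cost = mdl_cost(7, [True, True, False, False, True], labels)
print(short_rule_cost, long_rule_cost)
```

In this toy case the shorter rule has the lower cost despite its two errors, which illustrates how such an objective can tolerate a few noisy facts rather than growing the program to fit every example.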

Abstract

Inductive Logic Programming (ILP) is a principled approach for generalizing regularities from data and constructing hypotheses as interpretable logic programs. However, a key limitation is its reliance on expert-crafted language bias: the predicate inventory, types, and mode declarations that delimit the search space. We propose hypothesis generation via LLM-automated language bias, in which multi-agent LLMs design the bias from raw text and translate descriptions into typed facts, and a robust ILP solver induces rules under a global consistency objective. This approach reduces both traditional ILP's reliance on predefined symbolic structures and the noise sensitivity of LLM-only pipelines that directly generate hypotheses as text or code. Extensive experiments in diverse, challenging scenarios validate superior performance, providing a practical, explainable, and verifiable route to hypothesis generation.
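To make "predicate inventory, types, and mode declarations" concrete, here is a minimal sketch of a language bias as data. The `Predicate` class, the `render_bias` helper, and the argument directions are hypothetical. The output format is loosely modelled on the declaration style of ILP systems such as Popper, and the exact format the authors use is not given here. The predicate names are borrowed from the example rule in the Figure 1 caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    name: str
    arg_types: tuple      # one type name per argument
    directions: tuple     # "in" or "out" per argument (assumed modes)
    is_head: bool = False


def _tuple(items):
    # Unary tuples get a trailing comma, e.g. (scene,)
    return "(" + ",".join(items) + ("," if len(items) == 1 else "") + ")"


def render_bias(predicates):
    """Render predicate, type, and direction declarations as text."""
    lines = []
    for p in predicates:
        kind = "head_pred" if p.is_head else "body_pred"
        lines.append(f"{kind}({p.name},{len(p.arg_types)}).")
        lines.append(f"type({p.name},{_tuple(p.arg_types)}).")
        lines.append(f"direction({p.name},{_tuple(p.directions)}).")
    return "\n".join(lines)


# Hypothetical bias for a ZENDO-like task.
bias = [
    Predicate("zendo", ("scene",), ("in",), is_head=True),
    Predicate("piece", ("scene", "piece"), ("in", "out")),
    Predicate("size", ("piece", "size"), ("in", "out")),
    Predicate("blue", ("piece",), ("in",)),
    Predicate("contact", ("piece", "piece"), ("in", "out")),
]
bias_text = render_bias(bias)
print(bias_text)
```

The types restrict which variables may be shared between literals, and the directions restrict how arguments may be bound. Together they delimit the search space that an expert would otherwise have to specify by hand.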

Paper Structure

This paper contains 36 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Illustration of our pipeline and example rule. The LLMs produce an ILP language bias and typed facts, and a verifiable ILP solver induces rules under a global-consistency objective. Example hypothesis returned by the solver: zendo(A) :- piece(A,C), size(C,B), blue(C), small(B), contact(C,D), red(D). Meaning (variable types): $A$: scene; $C,D$: pieces; $B$: size attribute. The rule states that $zendo(A)$ holds iff there exists a blue piece $C$ in scene $A$ that is small (via $size(C,B)\,\wedge\,small(B)$) and is in contact with a red piece $D$.
  • Figure 2: The five subplots present the F1 scores of different methods on the BUSINESS SHOES and ZENDO datasets, each examining one key experimental variable: rule number, template number, sample size, positive ratio, and noise ratio.
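The example rule in the Figure 1 caption can be checked by hand against a small fact base. The sketch below hard-codes that single rule as a Python function over sets of facts. The scene, piece, and size identifiers are invented for illustration, and a real pipeline would pass Prolog facts to the ILP solver rather than evaluate rules this way.

```python
def zendo(scene, facts):
    """zendo(A) :- piece(A,C), size(C,B), blue(C), small(B), contact(C,D), red(D)."""
    for a, c in facts["piece"]:
        # piece(A,C) with blue(C)
        if a != scene or c not in facts["blue"]:
            continue
        # size(C,B) with small(B)
        has_small_size = any(
            c2 == c and b in facts["small"] for c2, b in facts["size"]
        )
        # contact(C,D) with red(D)
        touches_red = any(
            c2 == c and d in facts["red"] for c2, d in facts["contact"]
        )
        if has_small_size and touches_red:
            return True
    return False


# Hypothetical typed facts for two scenes.
facts = {
    "piece": {("s1", "p1"), ("s1", "p2"), ("s2", "p3"), ("s2", "p4")},
    "size": {("p1", "sz1"), ("p2", "sz2"), ("p3", "sz3"), ("p4", "sz4")},
    "small": {"sz1", "sz4"},
    "blue": {"p1", "p3"},
    "red": {"p2", "p4"},
    "contact": {("p1", "p2"), ("p3", "p4")},
}
print(zendo("s1", facts), zendo("s2", facts))
```

Scene `s1` satisfies the rule because its blue piece `p1` is small and touches the red piece `p2`. Scene `s2` does not, because its only blue piece `p3` has a size that is not small.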