Honey, I shrunk the hypothesis space (through logical preprocessing)
Andrew Cropper, Filipe Gouveia, David M. Cerna
TL;DR
The paper tackles the prohibitive size of hypothesis spaces in inductive logic programming (ILP) by introducing shrinker, a preprocessing tool that uses background knowledge (BK) to automatically identify and remove four kinds of pointless rules: unsatisfiable, implication reducible, recall reducible, and singleton reducible. Implemented in answer set programming and integrated with the constraint-based ILP system Popper, shrinker preserves optimal hypotheses while substantially reducing learning times across diverse domains, often with only seconds of preprocessing. The approach is task-agnostic and system-agnostic: it offers a principled, formal way to constrain hypothesis generation before learning begins. The paper reports both empirical gains and clear limitations (finite BK, the closed-world assumption, monotonic ILP). Because the removed rules are provably pointless, shrinker gives faster ILP without sacrificing soundness, and its preprocessing can be reused across tasks and domains.
Abstract
Inductive logic programming (ILP) is a form of logical machine learning. The goal is to search a hypothesis space for a hypothesis that generalises training examples and background knowledge. We introduce an approach that 'shrinks' the hypothesis space before an ILP system searches it. Our approach uses background knowledge to find rules that cannot be in an optimal hypothesis regardless of the training examples. For instance, our approach discovers relationships such as "even numbers cannot be odd" and "prime numbers greater than 2 are odd". It then removes violating rules from the hypothesis space. We implement our approach using answer set programming and use it to shrink the hypothesis space of a constraint-based ILP system. Our experiments on multiple domains, including visual reasoning and game playing, show that our approach can substantially reduce learning times whilst maintaining predictive accuracies. For instance, given just 10 seconds of preprocessing time, our approach can reduce learning times from over 10 hours to only 2 seconds.
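To make the idea concrete, here is a minimal Python sketch of how relationships such as "even numbers cannot be odd" and "prime numbers greater than 2 are odd" could be read off finite background knowledge under a closed-world assumption. It is an illustration only, not the authors' answer set programming encoding: the toy facts, the predicate name `prime_gt2`, and the functions `unsatisfiable_pairs`, `implications`, and `is_pointless` are all hypothetical, and the sketch covers only unary predicates applied to a single variable.

```python
from itertools import permutations

# Hypothetical toy background knowledge (not from the paper):
# each unary predicate maps to the finite set of constants it holds for.
BK = {
    "even": {0, 2, 4, 6, 8, 10},
    "odd": {1, 3, 5, 7, 9},
    "prime_gt2": {3, 5, 7},
}


def unsatisfiable_pairs(bk):
    """Pairs (p, q) that no constant satisfies together,
    e.g. 'even numbers cannot be odd'."""
    names = sorted(bk)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if not (bk[p] & bk[q])
    ]


def implications(bk):
    """Pairs (p, q) where every constant satisfying p also satisfies q,
    e.g. 'prime numbers greater than 2 are odd'."""
    return [
        (p, q)
        for p, q in permutations(sorted(bk), 2)
        if bk[p] and bk[p] <= bk[q]
    ]


def is_pointless(body, bk):
    """Check a rule body, given as a set of unary predicates on one variable.

    The body is flagged if it contains a pair that can never hold together
    (unsatisfiable) or a literal already implied by another literal
    (redundant, in the spirit of 'implication reducible').
    """
    unsat = any(p in body and q in body for p, q in unsatisfiable_pairs(bk))
    redundant = any(p in body and q in body for p, q in implications(bk))
    return unsat or redundant


if __name__ == "__main__":
    print(unsatisfiable_pairs(BK))
    print(implications(BK))
    print(is_pointless({"even", "odd"}, BK))
```

Note that both checks consult only the background knowledge, never the training examples, which is why rules flagged this way can be removed before learning starts. A real system must also handle predicates of higher arity and shared variables, which this sketch ignores.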
