Efficiently Learning Probabilistic Logical Models by Cheaply Ranking Mined Rules
Jonathan Feldstein, Dominic Phillips, Efthymia Tsamoura
TL;DR
This work tackles the critical bottleneck of scalable structure learning for probabilistic logical models by introducing SPECTRUM, a framework that combines linear-time pattern mining and rule evaluation with a novel utility measure that balances precision, recall, symmetry, priors, and complexity. The approach yields a quadratic-time optimisation over a carefully restricted rule space, enabling CPU-scale learning on datasets with millions of facts while achieving accuracy competitive with or superior to that of neural baselines. Theoretical guarantees accompany the practical pipeline, including ε-uncertainty bounds for pattern-based utility estimates and completeness for patterns within the N-close neighbourhood. Empirically, SPECTRUM scales to large benchmarks and retrieves hand-engineered rules on CAD and Yelp, with CPU runtimes orders of magnitude faster than state-of-the-art neural structure learners, highlighting its potential to broaden adoption of interpretable neurosymbolic reasoning in real-world domains.
Abstract
Probabilistic logical models are a core component of neurosymbolic AI and are important in their own right for tasks that require high explainability. Unlike neural networks, the logical theories that underlie these models are often handcrafted using domain expertise, making their development costly and prone to errors. While there are algorithms that learn logical theories from data, they are generally prohibitively expensive, limiting their applicability in real-world settings. Here, we introduce precision and recall for logical rules and define their composition as rule utility, a cost-effective measure of the predictive power of logical theories. We also introduce SPECTRUM, a scalable framework for learning logical theories from relational data. Its scalability derives from a linear-time algorithm for mining recurrent subgraphs in the data graph, along with a second algorithm that efficiently ranks the rules derived from these subgraphs using a utility measure that can be computed in linear time. Finally, we prove theoretical guarantees on the utility of the learnt logical theory. As a result, we demonstrate across various tasks that SPECTRUM scales to larger datasets, often learning more accurate logical theories on CPUs in less than 1% of the runtime of SOTA neural network approaches on GPUs.
