Deciphering Raw Data in Neuro-Symbolic Learning with Provable Guarantees
Lue Tao, Yu-Xuan Huang, Wang-Zhou Dai, Yuan Jiang
TL;DR
This paper addresses the unclear theoretical foundations of learnability in neuro-symbolic (NeSy) learning by introducing a novel characterisation of the supervision signals provided by a knowledge base (KB), together with a rank-based criterion for predicting learning efficacy. It formalises two risks, a location-based risk (L-Risk) and a target-location risk (TL-Risk), and proves upper bounds linking them to the standard NeSy objective. A rank condition on a probability matrix $\tilde{Q}$ guarantees recovery of the ground-truth labels when $\tilde{Q}$ has full row rank. The authors provide concrete procedures for diagnosing the usefulness of a KB before training and show, through experiments on ConjEq, HED, Conjunction, Addition, and random KBs, that the rank criterion reliably predicts learning success and failure across different abduction strategies. The work offers a practical theoretical framework to guide hybrid learning systems, potentially improving the reliability and deployment of neuro-symbolic methods in real-world tasks.
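As a rough illustration of what a full-row-rank check looks like in practice, the sketch below tests whether a matrix has rank equal to its number of rows. It is only a generic linear-algebra check: this summary does not specify how $\tilde{Q}$ is constructed from a knowledge base, so the helper name `has_full_row_rank` and both example matrices are hypothetical and are not taken from the paper.

```python
import numpy as np


def has_full_row_rank(q, tol=None):
    """Return True if the matrix q has rank equal to its number of rows.

    Hypothetical helper: it performs a generic rank test and does not
    build Q-tilde from a knowledge base.
    """
    q = np.asarray(q, dtype=float)
    return bool(np.linalg.matrix_rank(q, tol=tol) == q.shape[0])


# Made-up example matrices (assumptions for illustration only).
# Rows are linearly independent, so the matrix has full row rank.
q_informative = np.array([[0.5, 0.5, 0.0],
                          [0.0, 0.5, 0.5]])

# Both rows are identical, so the rank drops to 1 and the check fails.
q_degenerate = np.array([[0.5, 0.5, 0.0],
                         [0.5, 0.5, 0.0]])

print(has_full_row_rank(q_informative))
print(has_full_row_rank(q_degenerate))
```

The optional `tol` argument is passed through to `numpy.linalg.matrix_rank`, which matters when the entries are estimated numerically and a near-singular matrix should be treated as rank-deficient.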
Abstract
Neuro-symbolic hybrid systems are promising for integrating machine learning and symbolic reasoning, where perception models are facilitated with information inferred from a symbolic knowledge base through logical reasoning. Despite empirical evidence showing the ability of hybrid systems to learn accurate perception models, the theoretical understanding of learnability is still lacking. Hence, it remains unclear why a hybrid system succeeds for a specific task and when it may fail given a different knowledge base. In this paper, we introduce a novel way of characterising supervision signals from a knowledge base, and establish a criterion for determining the knowledge's efficacy in facilitating successful learning. This, for the first time, allows us to address the two questions above by inspecting the knowledge base under investigation. Our analysis suggests that many knowledge bases satisfy the criterion, thus enabling effective learning, while some fail to satisfy it, indicating potential failures. Comprehensive experiments confirm the utility of our criterion on benchmark tasks.
