Convex and Bilevel Optimization for Neuro-Symbolic Inference and Learning
Charles Dickens, Changyu Gao, Connor Pryor, Stephen Wright, Lise Getoor
TL;DR
This work develops a principled gradient-based learning framework for neural-symbolic (NeSy) systems by casting NeSy learning as a bilevel optimization problem and smoothing the lower-level energy with the Moreau envelope. It introduces a smooth linearly constrained quadratic program (LCQP) formulation of NeuPSL inference and a dual block coordinate descent (BCD) method that exploits warm starts and parallelization, yielding learning runtime speedups of over $100\times$ while making gradients with respect to both neural and symbolic weights explicitly computable. Empirical results across eight datasets show improvements in both learning efficiency and predictive performance, including gains of up to $16$ percentage points on MNIST-Add and competitive gains on standard hinge-loss Markov random field (HL-MRF) benchmarks. Overall, the framework provides a scalable, end-to-end trainable way to integrate neural perception with symbolic reasoning, demonstrated on NeuPSL and applicable to related NeSy models.
Abstract
We leverage convex and bilevel optimization techniques to develop a general gradient-based parameter learning framework for neural-symbolic (NeSy) systems. We demonstrate our framework with NeuPSL, a state-of-the-art NeSy architecture. To achieve this, we propose a smooth primal and dual formulation of NeuPSL inference and show that the learning gradients are functions of the optimal dual variables. Additionally, we develop a dual block coordinate descent algorithm for the new formulation that naturally exploits warm starts. This leads to over $100\times$ learning runtime improvements over the current best NeuPSL inference method. Finally, we provide extensive empirical evaluations across 8 datasets covering a range of tasks and demonstrate that our learning framework achieves up to a 16 percentage point improvement in prediction performance over alternative learning methods.
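The TL;DR mentions smoothing the lower-level energy with the Moreau envelope. As a rough illustration of what that smoothing does, the sketch below applies it to a single one-dimensional hinge term, $f(y) = \max(0, y)$, rather than to the NeuPSL energy itself. The function names, the scalar setting, and the convention $M_\rho f(x) = \min_y f(y) + \frac{1}{2\rho}(x - y)^2$ are assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: Moreau envelope of a scalar hinge f(y) = max(0, y).
# Assumed convention: M_rho f(x) = min_y f(y) + (x - y)^2 / (2 * rho), rho > 0.
# All names here are hypothetical and do not come from NeuPSL.

def hinge(y: float) -> float:
    """Nonsmooth hinge term; it has a kink at y = 0."""
    return max(0.0, y)

def prox_hinge(x: float, rho: float) -> float:
    """Minimizer y of hinge(y) + (x - y)^2 / (2 * rho), in closed form."""
    if x > rho:
        return x - rho   # active branch: shift toward zero by rho
    if x < 0.0:
        return x         # inactive branch: hinge is zero, so y = x
    return 0.0           # 0 <= x <= rho: the minimizer sits at the kink

def moreau_envelope_hinge(x: float, rho: float) -> float:
    """Value of the envelope, a smooth lower bound on the hinge."""
    y = prox_hinge(x, rho)
    return hinge(y) + (x - y) ** 2 / (2.0 * rho)

def moreau_gradient_hinge(x: float, rho: float) -> float:
    """Gradient of the envelope: (x - prox(x)) / rho, defined for every x."""
    return (x - prox_hinge(x, rho)) / rho

if __name__ == "__main__":
    for x in (-1.0, 0.5, 2.0):
        print(x, moreau_envelope_hinge(x, 1.0), moreau_gradient_hinge(x, 1.0))
```

The hinge has no derivative at zero, whereas the envelope is differentiable everywhere and its gradient is obtained from the minimizer of the inner problem. In this toy case the minimizer is available in closed form; in a setting like the one the abstract describes, it would have to come from solving an inference problem.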
