Semantic Loss Functions for Neuro-Symbolic Structured Prediction
Kareem Ahmed, Stefano Teso, Paolo Morettin, Luca Di Liello, Pierfrancesco Ardino, Jacopo Gobbi, Yitao Liang, Eric Wang, Kai-Wei Chang, Andrea Passerini, Guy Van den Broeck
TL;DR
This paper presents the semantic loss as a principled, differentiable way to enforce symbolic structure in neural networks for structured output prediction. The method compiles Boolean constraints into tractable logical circuits and computes the loss by probability-weighted model counting over the assignments that satisfy the constraints; a companion neuro-symbolic entropy term further biases the network toward valid and confident predictions. The approach is extended to generative modeling through Constrained Adversarial Networks (CANs), enabling generation of valid, structured objects such as game levels and molecules. Empirical results across semi-supervised and fully supervised tasks show improved coherence and constraint satisfaction, while CANs demonstrate efficient generation of structurally valid objects with dynamic constraint switching. Overall, the framework offers a modular, scalable path to integrating rich symbolic knowledge into both discriminative and generative deep models.
Abstract
Structured output prediction problems are ubiquitous in machine learning. The prominent approach leverages neural networks as powerful feature extractors, otherwise assuming the independence of the outputs. These outputs, however, jointly encode an object, e.g., a path in a graph, and are therefore related through the structure underlying the output space. We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training by minimizing the network's violation of such dependencies, steering the network towards predicting distributions satisfying the underlying structure. At the same time, it is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby, while also enabling efficient end-to-end training and inference. We also discuss key improvements and applications of the semantic loss. One limitation of the semantic loss is that it does not exploit the association of every data point with certain features certifying its membership in a target class. We should therefore prefer minimum-entropy distributions over valid structures, which we obtain by additionally minimizing the neuro-symbolic entropy. We empirically demonstrate the benefits of this more refined formulation. Moreover, the semantic loss is designed to be modular and can be combined with both discriminative and generative neural models. This is illustrated by integrating it into generative adversarial networks, yielding constrained adversarial networks, a novel class of deep generative models able to efficiently synthesize complex objects obeying the structure of the underlying domain.
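To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the network outputs independent Bernoulli probabilities, takes the semantic loss to be the negative log of the probability mass placed on assignments that satisfy the constraint, and takes the neuro-symbolic entropy to be the entropy of the output distribution conditioned on the constraint. It enumerates all assignments by brute force, which is only feasible for a handful of variables; the paper relies on compiled logical circuits instead. The function names and the `exactly_one` constraint are hypothetical, chosen for illustration.

```python
import itertools
import math


def assignment_prob(x, probs):
    """Probability of a 0/1 assignment under independent Bernoulli outputs."""
    p = 1.0
    for xi, pi in zip(x, probs):
        p *= pi if xi else (1.0 - pi)
    return p


def satisfying_probs(constraint, probs):
    """Probabilities of all assignments that satisfy the constraint.

    Brute-force enumeration (exponential in len(probs)); an assumption made
    for illustration in place of circuit compilation.
    """
    return [
        assignment_prob(x, probs)
        for x in itertools.product([0, 1], repeat=len(probs))
        if constraint(x)
    ]


def semantic_loss(constraint, probs):
    """Negative log of the probability mass on satisfying assignments.

    Assumes that mass is strictly positive.
    """
    return -math.log(sum(satisfying_probs(constraint, probs)))


def neuro_symbolic_entropy(constraint, probs):
    """Entropy of the output distribution restricted to valid assignments."""
    weights = satisfying_probs(constraint, probs)
    mass = sum(weights)
    return -sum((w / mass) * math.log(w / mass) for w in weights if w > 0)


def exactly_one(x):
    """Hypothetical example constraint: exactly one output is on."""
    return sum(x) == 1


if __name__ == "__main__":
    uncertain = [0.5, 0.5]
    confident = [0.9, 0.1]
    for probs in (uncertain, confident):
        print(probs,
              semantic_loss(exactly_one, probs),
              neuro_symbolic_entropy(exactly_one, probs))
```

Under these assumptions, moving the outputs from `[0.5, 0.5]` to `[0.9, 0.1]` lowers both quantities: more mass falls on valid assignments, and that mass concentrates on a single valid structure, which is the behavior the entropy term is meant to encourage.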
