Interpretable Neural-Symbolic Concept Reasoning
Pietro Barbiero, Gabriele Ciravegna, Francesco Giannini, Mateo Espinosa Zarlenga, Lucie Charlotte Magister, Alberto Tonda, Pietro Lio', Frederic Precioso, Mateja Jamnik, Giuseppe Marra
TL;DR
The paper addresses the interpretability gap in deep learning by introducing the Deep Concept Reasoner (DCR), an interpretable concept-based model that uses concept embeddings to generate differentiable fuzzy rules and then executes those rules on semantically meaningful concept truth degrees. Because each prediction is the output of an explicit rule, DCR supports global explanations and counterfactuals while remaining competitive in accuracy on a diverse set of tasks. The key contributions are a formal rule syntax; differentiable rule generation and execution, with neural modules that determine each concept's role and relevance; a parsimony mechanism with Gödel-based fuzzy semantics; and experiments reporting improved generalization over interpretable concept-based models, discovered rules that align with known ground truths, and counterfactual explanations guided by the learnt rules. The approach bridges the concept-based and neural-symbolic paradigms, combining accuracy, interpretability, and actionable explanations across tabular, image, and graph domains in support of trustworthy AI.
Abstract
Deep learning methods are highly accurate, yet their opaque decision process prevents them from earning full human trust. Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts. However, state-of-the-art concept-based models rely on high-dimensional concept embedding representations that lack a clear semantic meaning, which calls into question the interpretability of their decision process. To overcome this limitation, we propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings. In DCR, neural networks do not make task predictions directly; instead, they build syntactic rule structures using concept embeddings. DCR then executes these rules on meaningful concept truth degrees to provide a final interpretable and semantically consistent prediction in a differentiable manner. Our experiments show that DCR: (i) improves by up to 25% over state-of-the-art interpretable concept-based models on challenging benchmarks; (ii) discovers meaningful logic rules matching known ground truths even in the absence of concept supervision during training; and (iii) facilitates the generation of counterfactual examples by providing the learnt rules as guidance.
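The rule-execution step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gödel fuzzy semantics (AND as min, OR as max, NOT as 1 - x) and assumes that a rule is a conjunction in which each concept contributes a literal whose sign is set by a role score and whose inclusion is set by a relevance score. In DCR these scores would be produced by neural modules from concept embeddings; here they are hand-picked constants, and all function and concept names are hypothetical.

```python
# Illustrative sketch only: names and the exact rule form are assumptions,
# and role/relevance are fixed numbers rather than neural-network outputs.

def fuzzy_not(a):
    return 1.0 - a

def fuzzy_and(*xs):
    # Goedel t-norm: conjunction is the minimum.
    return min(xs)

def fuzzy_or(*xs):
    # Goedel t-conorm: disjunction is the maximum.
    return max(xs)

def fuzzy_iff(a, b):
    # (a -> b) AND (b -> a), with a -> b written as (NOT a) OR b.
    return fuzzy_and(fuzzy_or(fuzzy_not(a), b), fuzzy_or(fuzzy_not(b), a))

def execute_rule(truth, role, relevance):
    """Evaluate AND_i ( NOT relevance_i OR (role_i <-> truth_i) ).

    truth:     concept truth degrees in [0, 1]
    role:      1.0 means the concept appears positively, 0.0 negated
    relevance: 1.0 means the concept is part of the rule, 0.0 ignored
    """
    terms = [
        fuzzy_or(fuzzy_not(rel), fuzzy_iff(pol, c))
        for c, pol, rel in zip(truth, role, relevance)
    ]
    return fuzzy_and(*terms)

def rule_to_string(names, role, relevance, threshold=0.5):
    # Render the thresholded rule in a human-readable form.
    literals = [
        name if pol > threshold else "~" + name
        for name, pol, rel in zip(names, role, relevance)
        if rel > threshold
    ]
    return " & ".join(literals)

# Hypothetical concepts for a single sample.
names = ["red", "round", "has_stem"]
truth = [0.9, 0.8, 0.1]       # concept truth degrees
role = [1.0, 1.0, 0.0]        # "has_stem" appears negated
relevance = [1.0, 0.0, 1.0]   # "round" is irrelevant for this sample

rule_text = rule_to_string(names, role, relevance)  # "red & ~has_stem"
prediction = execute_rule(truth, role, relevance)   # truth degree of the rule
```

In this sketch the prediction depends on the inputs only through the concept truth degrees and the rule structure, which is what makes the decision readable as a logic formula; an irrelevant concept contributes a term equal to 1 and so cannot lower the conjunction.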
