Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification
Zhaorui Tan, Xi Yang, Qiufeng Wang, Anh Nguyen, Kaizhu Huang
TL;DR
The paper introduces L-Reg, a logic-informed, sample-based regularization term for visual classification that enforces minimal, disentangled semantic supports to improve generalization. L-Reg is grounded in a formal logical framework, from which the authors derive an entropy-based objective. It reduces classifier complexity and yields interpretable, class-specific minimal features, improving performance in multi-domain generalization and generalized category discovery settings. Theoretical analysis and extensive experiments show consistent gains across diverse datasets and even on non-vision tasks such as circuit congestion prediction. The paper also outlines limitations and avenues for improving feature independence and layer selection. Overall, L-Reg offers a practical, plug-and-play regularization that improves robustness to domain shifts and unknown categories while keeping the learned semantics interpretable.
Abstract
Vision models excel in image classification but struggle to generalize to unseen data, such as classifying images from unseen domains or discovering novel categories. In this paper, we explore the relationship between logical reasoning and deep learning generalization in visual classification. We derive a logical regularization term, L-Reg, that bridges a logical analysis framework and image classification. Our work reveals that L-Reg reduces the complexity of the model in terms of the feature distribution and classifier weights. We also show the interpretability that L-Reg brings: it enables the model to extract the salient features for each class, such as faces for persons. Theoretical analysis and experiments demonstrate that L-Reg enhances generalization across various scenarios, including multi-domain generalization and generalized category discovery. In complex real-world scenarios where images span unknown classes and unseen domains, L-Reg consistently improves generalization, highlighting its practical efficacy.
