Table of Contents
Fetching ...

A Neurosymbolic Framework for Bias Correction in Convolutional Neural Networks

Parth Padalkar, Natalia Ślusarz, Ekaterina Komendantskaya, Gopal Gupta

TL;DR

The paper tackles biases in CNN-based image classification and proposes a neurosymbolic solution, NeSyBiCor, that uses human-understandable semantic constraints to steer CNN filters via a semantic similarity loss. The approach maps undesired/desirable concepts to vector representations, retrains the network, and re-derives ASP rule-sets with NeSyFOLD, achieving bias correction while preserving accuracy and enhancing interpretability. Key contributions include the semantic loss $ abla_{SS}$, iterative concept-vector recalibration, and demonstrations on Places dataset subsets showing large reductions in undesired predicates and rule-set size. This work advances robust, interpretable CNNs by integrating symbolic constraints with neural learning, and suggests future integration with vision foundation models for automatic semantic segmentation.

Abstract

Recent efforts in interpreting Convolutional Neural Networks (CNNs) focus on translating the activation of CNN filters into a stratified Answer Set Program (ASP) rule-sets. The CNN filters are known to capture high-level image concepts, thus the predicates in the rule-set are mapped to the concept that their corresponding filter represents. Hence, the rule-set exemplifies the decision-making process of the CNN w.r.t the concepts that it learns for any image classification task. These rule-sets help understand the biases in CNNs, although correcting the biases remains a challenge. We introduce a neurosymbolic framework called NeSyBiCor for bias correction in a trained CNN. Given symbolic concepts, as ASP constraints, that the CNN is biased towards, we convert the concepts to their corresponding vector representations. Then, the CNN is retrained using our novel semantic similarity loss that pushes the filters away from (or towards) learning the desired/undesired concepts. The final ASP rule-set obtained after retraining, satisfies the constraints to a high degree, thus showing the revision in the knowledge of the CNN. We demonstrate that our NeSyBiCor framework successfully corrects the biases of CNNs trained with subsets of classes from the "Places" dataset while sacrificing minimal accuracy and improving interpretability.

A Neurosymbolic Framework for Bias Correction in Convolutional Neural Networks

TL;DR

The paper tackles biases in CNN-based image classification and proposes a neurosymbolic solution, NeSyBiCor, that uses human-understandable semantic constraints to steer CNN filters via a semantic similarity loss. The approach maps undesired/desirable concepts to vector representations, retrains the network, and re-derives ASP rule-sets with NeSyFOLD, achieving bias correction while preserving accuracy and enhancing interpretability. Key contributions include the semantic loss , iterative concept-vector recalibration, and demonstrations on Places dataset subsets showing large reductions in undesired predicates and rule-set size. This work advances robust, interpretable CNNs by integrating symbolic constraints with neural learning, and suggests future integration with vision foundation models for automatic semantic segmentation.

Abstract

Recent efforts in interpreting Convolutional Neural Networks (CNNs) focus on translating the activation of CNN filters into a stratified Answer Set Program (ASP) rule-sets. The CNN filters are known to capture high-level image concepts, thus the predicates in the rule-set are mapped to the concept that their corresponding filter represents. Hence, the rule-set exemplifies the decision-making process of the CNN w.r.t the concepts that it learns for any image classification task. These rule-sets help understand the biases in CNNs, although correcting the biases remains a challenge. We introduce a neurosymbolic framework called NeSyBiCor for bias correction in a trained CNN. Given symbolic concepts, as ASP constraints, that the CNN is biased towards, we convert the concepts to their corresponding vector representations. Then, the CNN is retrained using our novel semantic similarity loss that pushes the filters away from (or towards) learning the desired/undesired concepts. The final ASP rule-set obtained after retraining, satisfies the constraints to a high degree, thus showing the revision in the knowledge of the CNN. We demonstrate that our NeSyBiCor framework successfully corrects the biases of CNNs trained with subsets of classes from the "Places" dataset while sacrificing minimal accuracy and improving interpretability.
Paper Structure (12 sections, 6 equations, 4 figures, 2 tables)

This paper contains 12 sections, 6 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The NeSyFOLD Framework
  • Figure 2: Semantic labelling of a predicate
  • Figure 3: The NeSyBiCor Framework. Note that the crossentropy loss is calculated after the fully connected layer while the semantic similarity loss is calculated by using the filter output feature maps of the last convolution layer
  • Figure 4: The initial and final rule-sets after applying the NeSyBiCor framework on the CNNs trained on des (RULE-SET 1) and defs (RULE-SET 2)