Studying How to Efficiently and Effectively Guide Models with Explanations

Sukrut Rao, Moritz Böhle, Amin Parchami-Araghi, Bernt Schiele

TL;DR

The paper addresses the problem of neural networks relying on spurious cues and studies explicit guidance through explanations: models are trained by jointly optimizing classification and the localization of attributions. It introduces a differentiable Energy loss based on the Energy-based Pointing Game (EPG) and evaluates multiple attribution methods, architectures, and guidance depths on the real-world datasets PASCAL VOC 2007 and MS COCO 2014, using bounding-box supervision to keep annotation costs low. Key findings: the Energy loss yields the best on-object localization as measured by EPG, while the $L_1$ loss achieves the best IoU; final-layer guidance is widely effective; and input-layer B-cos explanations provide the most detailed object-focused maps. The results show robustness to limited or overly coarse annotations and improved generalization under distribution shifts (e.g., Waterbirds). The work also contributes a Pareto-based evaluation across diverse configurations and releases code for reproducibility.

Abstract

Despite being highly performant, deep neural networks might base their decisions on features that spuriously correlate with the provided labels, thus hurting generalization. To mitigate this, 'model guidance' has recently gained popularity, i.e. the idea of regularizing the models' explanations to ensure that they are "right for the right reasons". While various techniques to achieve such model guidance have been proposed, experimental validation of these approaches has thus far been limited to relatively simple and / or synthetic datasets. To better understand the effectiveness of the various design choices that have been explored in the context of model guidance, in this work we conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets. As annotation costs for model guidance can limit its applicability, we also place a particular focus on efficiency. Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks, and evaluate the robustness of model guidance under limited (e.g. with only 1% of annotated images) or overly coarse annotations. Further, we propose using the EPG score as an additional evaluation metric and loss function ('Energy loss'). We show that optimizing for the Energy loss leads to models that exhibit a distinct focus on object-specific features, despite only using bounding box annotations that also include background regions. Lastly, we show that such model guidance can improve generalization under distribution shifts. Code available at: https://github.com/sukrutrao/Model-Guidance.
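To make the EPG score and the Energy loss concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that EPG is the fraction of positive attribution falling inside the bounding box, that the Energy loss is the negative EPG score, and that the total objective is the classification loss plus a weighted localization loss. The function names, the box convention, and the toy values are hypothetical; a real training setup would compute this in an autodiff framework so that the loss stays differentiable.

```python
def epg_score(attribution, box):
    """Fraction of positive attribution that falls inside the bounding box.

    attribution: 2D list of floats (H x W attribution map).
    box: (top, left, bottom, right) with exclusive bottom/right (assumed convention).
    """
    top, left, bottom, right = box
    inside = 0.0
    total = 0.0
    for r, row in enumerate(attribution):
        for c, value in enumerate(row):
            positive = max(value, 0.0)  # only positive attributions are counted
            total += positive
            if top <= r < bottom and left <= c < right:
                inside += positive
    # Assumption: define EPG as 0 when the map has no positive attribution.
    return inside / total if total > 0 else 0.0


def energy_loss(attribution, box):
    # Assumption: the loss is the negative EPG, so minimizing it maximizes EPG.
    return -epg_score(attribution, box)


def guided_loss(class_loss, loc_loss, lambda_loc):
    # Assumed joint objective: classification loss plus weighted localization loss.
    return class_loss + lambda_loc * loc_loss


# Toy 3x3 attribution map (hypothetical values).
attr = [
    [0.0, 1.0, 0.0],
    [-2.0, 3.0, 0.0],
    [0.0, 0.0, 1.0],
]
box = (0, 1, 2, 2)  # covers rows 0-1 of column 1

epg = epg_score(attr, box)  # 4 of 5 units of positive attribution lie in the box
total = guided_loss(class_loss=0.5, loc_loss=energy_loss(attr, box), lambda_loc=0.5)
print(epg, total)
```

Because the score only asks how much positive attribution lies inside the box, and not that the whole box be covered, a model can score well by concentrating on the object itself. This is consistent with the abstract's observation that Energy-guided models focus on object-specific features even though the boxes include background.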

Paper Structure (34 sections, 6 equations, 38 figures, 4 tables)

Figures (38)

  • Figure 1: (a) Model guidance increases object focus. Models may rely on irrelevant background features or spurious correlations (e.g. the presence of a person provides positive evidence for bicycle, center row, col. 1). Guiding the model via bounding box annotations can mitigate this and consistently increases the focus on object features (bottom row). (b) Model guidance can improve accuracy. In the presence of spurious correlations in the training data, non-guided models might focus on the wrong features. In the example image in (b), the waterbird is incorrectly classified as a landbird due to the background (col. 3). With bounding box annotations (as shown in col. 2), the model can be guided to focus on the bird features for classification (col. 4).
  • Figure 2: Qualitative results of model guidance. We show model-inherent B-cos explanations (input layer) of a B-cos ResNet-50 and GradCAM explanations (final layer) of a conventional ResNet-50 before ('Standard') and after optimization ('Guided') for images from the VOC test set, using our proposed Energy loss (\ref{eq:energyloss}). Guiding the model via bounding box annotations consistently increases the focus on object features for both methods. Specifically, we find that background attributions are consistently suppressed in both cases.
  • Figure 3: Model guidance overview. We jointly optimize for classification ($\mathcal{L}_\text{class}$) and localization of attributions to human-annotated bounding boxes ($\mathcal{L}_\text{loc}$), to guide the model to focus on object features. Various localization loss functions can be used, see \ref{sec:method:losses}.
  • Figure 4: Selecting models for evaluation. For each configuration, we evaluate every model at every checkpoint and measure its performance across various metrics (F1, EPG, IoU) on the validation set; i.e. every point in the left graph corresponds to one model (for B-cos models optimized via the Energy loss at the input layer). Instead of evaluating a single model on the test set, we evaluate all Pareto-dominant models, as indicated in the center and right plots.
  • Figure 5: EPG vs. F1 for different datasets ((a): VOC; (b): COCO), losses (markers), and models (columns), optimized at different layers (rows). Additionally, we show the performance of the baseline model before fine-tuning and mark the regions that strictly dominate the baseline performance in green and those strictly dominated by it in grey. For each configuration, we show the Pareto fronts (cf. \ref{fig:pareto_example}) across regularization strengths $\lambda_\text{loc}$ and epochs (cf. \ref{sec:results}). We find the Energy loss to give the best trade-off between EPG and F1.
  • ...and 33 more figures
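
The model selection described in the Figure 4 caption, where all Pareto-dominant checkpoints are kept instead of a single model, can be sketched as follows. This is an illustrative sketch in plain Python and not the authors' code; the function names and the (F1, EPG) validation scores are hypothetical, and higher is assumed to be better for every metric.

```python
def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (higher is better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (F1, EPG) validation scores, one tuple per checkpoint.
checkpoints = [
    (0.80, 0.50),
    (0.78, 0.65),
    (0.75, 0.60),  # dominated by (0.78, 0.65)
    (0.70, 0.72),
    (0.79, 0.50),  # dominated by (0.80, 0.50)
]

front = pareto_front(checkpoints)
print(front)
```

The same functions work for tuples with more than two metrics, e.g. (F1, EPG, IoU). Reporting the whole front rather than one checkpoint exposes the trade-off between classification and localization quality across regularization strengths and epochs.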