Faster ISNet for Background Bias Mitigation on Deep Neural Networks

Pedro R. A. S. Bassi, Sergio Decherchi, Andrea Cavalli

TL;DR

This work proposes reformulated architectures, dubbed Faster ISNets, whose training time is independent of the number of classes in an application, and introduces LRP optimization into a gamut of applications that the original ISNet model cannot feasibly handle.

Abstract

Bias or spurious correlations in image backgrounds can impact neural networks, causing shortcut learning (Clever Hans Effect) and hampering generalization to real-world data. ISNet, a recently introduced architecture, proposed the optimization of Layer-Wise Relevance Propagation (LRP, an explanation technique) heatmaps, to mitigate the influence of backgrounds on deep classifiers. However, ISNet's training time scales linearly with the number of classes in an application. Here, we propose reformulated architectures whose training time becomes independent from this number. Additionally, we introduce a concise and model-agnostic LRP implementation. We challenge the proposed architectures using synthetic background bias, and COVID-19 detection in chest X-rays, an application that commonly presents background bias. The networks hindered background attention and shortcut learning, surpassing multiple state-of-the-art models on out-of-distribution test datasets. Representing a potentially massive training speed improvement over ISNet, the proposed architectures introduce LRP optimization into a gamut of applications that the original model cannot feasibly handle.

Paper Structure (35 sections, 22 equations, 8 figures, 2 tables)


Figures (8)

  • Figure 1: LRP heatmaps for X-rays. The pneumonia X-ray has a clear background bias, a mark over the right shoulder. It did not influence the Faster or Original ISNets.
  • Figure 2: Illustration of a residual block. Neural network layers are shown in green; black text indicates the signal, and red text shows the corresponding LRP relevance.
  • Figure 3: Plot of the function $f(s)$ for $C_{1}=1$ and $C_{2}=3$.
  • Figure 4: Example of training sample from biased MNIST. The figure represents the digit 1. Accordingly, the second pixel (from left to right) in the image's top row was set to 1, representing background bias.
  • Figure 5: Example of training sample from biased Stanford Dogs. The figure represents the Pug breed, identified by the number 102 in the training dataset (synthetic bias).
  • ...and 3 more figures
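
The biasing rule described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `add_synthetic_bias` is hypothetical, and the mapping from class to pixel position (top-row column index equal to the digit) is an assumption extrapolated from the single digit-1 example in the caption.

```python
def add_synthetic_bias(image, label):
    """Return a copy of `image` with one top-row pixel set to 1.0.

    Assumption (not confirmed by the source): the biased pixel's column
    index equals the class label, counting from 0, so digit 1 marks the
    second pixel from the left, as in the Figure 4 caption.
    """
    if not 0 <= label < len(image[0]):
        raise ValueError("label must index a column in the top row")
    biased = [row[:] for row in image]  # copy rows so the input is untouched
    biased[0][label] = 1.0              # the background-bias pixel
    return biased


# A blank 28x28 image stands in for an MNIST digit (pixel values in [0, 1]).
blank = [[0.0] * 28 for _ in range(28)]
sample = add_synthetic_bias(blank, label=1)
```

Because the marked pixel alone reveals the class, a classifier trained on such samples can reach high training accuracy without looking at the digit, which is the shortcut the abstract describes.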