On Background Bias of Post-Hoc Concept Embeddings in Computer Vision DNNs
Gesina Schwalbe, Georgii Mikriukov, Edgar Heinert, Stavros Gerolymatos, Mert Keser, Alois Knoll, Matthias Rottmann, Annika Mütze
TL;DR
This work investigates whether post-hoc concept embeddings (CEs) used in concept-based XAI (C-XAI) for computer vision are biased by image backgrounds. It introduces three low-cost background-randomization techniques (image pasting with Places205 backgrounds, Voronoi-patched backgrounds, and diffusion-generated backgrounds) to test robustness and to train CE representations. Across more than 50 concepts, two datasets, and seven architectures, the study finds notable background biases (e.g., road scenes degrade animal concept segmentation) and shows that background-randomized training improves background robustness, with LoCE and its globalized variant (GloCE) often outperforming the Net2Vec baseline. The results demonstrate a practical workflow for bias discovery and mitigation in post-hoc C-XAI, suggesting that even cheap interventions can noticeably improve the reliability of explanations in safety-critical CV tasks.
Abstract
The thriving research field of concept-based explainable artificial intelligence (C-XAI) investigates how human-interpretable semantic concepts are embedded in the latent spaces of deep neural networks (DNNs). Post-hoc approaches therein use a set of examples to specify a concept and determine its embeddings in DNN latent space using data-driven techniques. This has proved useful for uncovering biases between different target (foreground or concept) classes. However, given that the background is mostly uncontrolled during training, an important question has so far been left unaddressed: To what extent are state-of-the-art, data-driven post-hoc C-XAI approaches themselves prone to biases with respect to their backgrounds? For example, wild animals mostly occur against vegetation backgrounds and seldom appear on roads. Even simple and robust C-XAI methods might exploit this shortcut for enhanced performance. A dangerous performance degradation on concept corner cases, such as animals on the road, could thus remain undiscovered. This work thoroughly confirms that established Net2Vec-based concept segmentation techniques frequently capture background biases, including alarming ones such as underperformance on road scenes. For the analysis, we compare three established techniques from the domain of background randomization on more than 50 concepts from two datasets and seven diverse DNN architectures. Our results indicate that even low-cost setups can provide both valuable insight and improved background robustness.
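To make the idea of background randomization concrete, the following is a minimal NumPy sketch of one possible setup: a concept's foreground pixels are kept via a binary mask, and everything else is replaced by a synthetic Voronoi-patched background. This is an illustration under stated assumptions, not the authors' implementation. The function names `voronoi_background` and `paste_foreground` are hypothetical, and filling each Voronoi cell with a flat random colour is an assumption of this sketch; the paper's actual cell contents and pasting procedure may differ.

```python
import numpy as np


def voronoi_background(height, width, n_cells=8, seed=0):
    """Return an (H, W, 3) uint8 image partitioned into Voronoi cells.

    Assumption of this sketch: each cell gets one flat random RGB colour.
    """
    rng = np.random.default_rng(seed)
    # Random cell centres as (row, col) and one random colour per cell.
    points = rng.uniform([0, 0], [height, width], size=(n_cells, 2))
    colors = rng.integers(0, 256, size=(n_cells, 3), dtype=np.uint8)

    # Pixel coordinate grid of shape (H, W, 2).
    rows, cols = np.mgrid[0:height, 0:width]
    pixels = np.stack([rows, cols], axis=-1).astype(np.float64)

    # Squared distance of every pixel to every centre: (H, W, n_cells).
    d2 = ((pixels[:, :, None, :] - points[None, None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=-1)  # index of the closest centre per pixel
    return colors[nearest]


def paste_foreground(image, mask, background):
    """Keep `image` where `mask` is True and use `background` elsewhere."""
    mask3 = mask.astype(bool)[..., None]  # broadcast the mask over channels
    return np.where(mask3, image, background)


if __name__ == "__main__":
    # Toy data: a uniform grey "image" with a rectangular concept mask.
    image = np.full((64, 96, 3), 128, dtype=np.uint8)
    mask = np.zeros((64, 96), dtype=bool)
    mask[20:40, 30:60] = True

    background = voronoi_background(64, 96, n_cells=10, seed=42)
    randomized = paste_foreground(image, mask, background)
    print(randomized.shape, randomized.dtype)
```

Comparing a concept segmenter's output on the original image and on such randomized variants gives a cheap indication of how much the concept embedding relies on background cues; swapping the synthetic background for a scene photograph or a generated image follows the same masking pattern.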
