Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort

Jeeyung Kim; Ze Wang; Qiang Qiu

TL;DR

The paper tackles spurious correlations in vision by strengthening interpretability through Concept Bottleneck Models (CBMs). It introduces a three-stage framework that uses multimodal foundation models (MLLMs and LLMs) to automatically discover, annotate, and optionally refine visual concepts, enabling near-zero human labeling effort. By collecting concepts unaffected by spurious cues, annotating with LLaVA, and refining via a chain of vision models, the approach yields CBMs that reduce reliance on spurious correlations while preserving interpretability. Empirical results across ImageNet-Opener, Metashifts, and Waterbirds show competitive or superior worst-group robustness compared to baselines, with notable gains when annotation refinement is employed. The work demonstrates a practical pathway to robust, interpretable models with minimal human annotation cost, broadening the applicability of CBMs in real-world datasets.

Abstract

Enhancing model interpretability can address spurious correlations by revealing how models arrive at their predictions. Concept Bottleneck Models (CBMs) can provide a principled way of disclosing and guiding model behaviors through human-understandable concepts, albeit at a high cost in human annotation effort. In this paper, we leverage a synergy of multiple foundation models to construct CBMs with nearly no human effort. We discover undesirable biases in CBMs built on pre-trained models and propose a novel framework designed to exploit pre-trained models while being immune to these biases, thereby reducing vulnerability to spurious correlations. Specifically, our method offers a seamless pipeline that adopts foundation models for assessing potential spurious correlations in datasets, annotating concepts for images, and refining the annotations for improved robustness. We evaluate the proposed method on multiple datasets, and the results demonstrate its effectiveness in reducing model reliance on spurious correlations while preserving interpretability.

Paper Structure (24 sections, 5 figures, 14 tables)

Introduction
Related Works
Preliminaries
Limitation of using CLIP-based CBMs to Alleviate Spurious Correlation
Our Framework
Collecting Visual Attributes Unaffected by Spurious Correlation
Annotating the Concepts with LLaVA
Automatic Annotation Refinement
Experiment
Experimental Setups
Results
Conclusion
Prompts and Visual Concepts
Prompts
Collected Visual Attributes in \ref{sec:collect_va}
...and 9 more sections

Figures (5)

Figure 1: The framework comprises three stages, in which we leverage foundation models to minimize human effort in building CBMs. First, we collect a concept pool of visual attributes necessary for classification, with automatic concept filtering using an MLLM. Second, we annotate the concepts by querying an MLLM. Third, we optionally refine the annotations using vision foundation models to improve the accuracy of the MLLM's concept annotations. Finally, based on the obtained concept annotations, we construct a LLaVA-based CBM.
Figure 2: Potential spurious correlations detected on Waterbirds in Step 2. We list strongly and weakly correlated words with their coefficients, highlighting the strongly correlated words in red. The known spurious concepts are detected by our method. We set the coefficient threshold to 0.3 to select highly correlated attributes.
Figure 3: An illustration of the annotation refinement process on ImageNet-Opener. The example shows how the correction alters the response. The segmented image generated by a chain of VFMs is used as input to LLaVA. The internal process is not visible to humans and is executed automatically by GPT-3.
Figure 4: Potential spurious correlations detected on Metashifts. All known spurious concepts (bed, sofa, bench, and bike, colored red) are detected by our method. We set the correlation coefficient threshold to 0.2 to select correlated words. See App. \ref{all:threshold} for details on varying thresholds.
Figure A: Label-free CBM's predictions rely on a can.
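
The thresholding described in the captions for Figures 2 and 4 can be illustrated with a minimal sketch. It rests on assumptions not confirmed by this page: that each image has a short text description, that the coefficient is a Pearson correlation between a word's presence and a binary class label, and that the threshold is applied to the absolute value. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant, since the
    coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def flag_spurious_words(descriptions, labels, threshold=0.3):
    """Map each word to its coefficient if |coefficient| >= threshold.

    Assumption: word presence in a description is correlated with a
    binary class label; the paper's exact statistic may differ.
    """
    tokenized = [d.lower().split() for d in descriptions]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    flagged = {}
    for word in vocab:
        present = [1 if word in tokens else 0 for tokens in tokenized]
        r = pearson(present, labels)
        if abs(r) >= threshold:
            flagged[word] = r
    return flagged


# Toy, made-up descriptions and labels (1 = waterbird, 0 = landbird).
descriptions = [
    "bird water", "bird water", "bird water", "bird forest",
    "bird forest", "bird forest", "bird forest", "bird water",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

flagged = flag_spurious_words(descriptions, labels, threshold=0.3)
for word, coeff in flagged.items():
    print(f"{word}: {coeff:+.2f}")
```

In this toy data, "water" and "forest" each co-occur with one class in three of four images and are flagged, while "bird" appears in every description, has zero variance, and is never flagged. Words flagged this way would be candidates for exclusion from the concept pool.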
