SegDebias: Test-Time Bias Mitigation for ViT-Based CLIP via Segmentation
Fangyu Wu, Yujun Cai
TL;DR
SegDebias tackles spurious correlations in ViT-based CLIP by introducing a test-time, segmentation-guided debiasing pipeline that requires no bias annotations or retraining. By selecting a target attribute, obtaining a segmentation mask, neutralizing non-target regions through a constrained perturbation, and reconstructing the image for zero-shot inference, it reduces background-driven bias while preserving the target signal. Empirical results on Waterbirds and CelebA show improved worst-group accuracy and smaller performance gaps, complemented by higher Attention-IoU indicating better semantic alignment. The approach is model- and data-agnostic, scalable, and opens avenues for annotation-free bias mitigation in vision-language systems.
Abstract
Vision-language models such as CLIP have shown remarkable performance in zero-shot classification, but they remain susceptible to spurious correlations, where irrelevant visual features influence predictions. Existing debiasing methods often require access to training data and explicit group labels to perform fine-tuning or adjust embeddings, which limits their practicality in real-world settings. Test-time methods attempt to avoid this constraint, but many still depend on prior knowledge of dataset-specific biases, which limits their generalizability in open-set settings. In this work, we propose a test-time debiasing method for ViT-based CLIP models that requires no additional training and assumes no bias annotations. Our approach uses a pretrained segmentation model to isolate the target visual attribute, then adjusts the non-target regions so that their embeddings are uniformly similar to all class-specific text prompts. This procedure removes unintended bias signals from confounding visual regions while preserving the target attribute. Experiments on Waterbirds and CelebA show that our method outperforms existing test-time debiasing approaches in both group robustness metrics and Attention-IoU. These results demonstrate the effectiveness of segmentation-guided interventions for scalable and annotation-free bias mitigation in vision-language models.
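
The abstract does not give the exact objective used to make non-target regions "uniformly similar" to all class prompts. As a minimal sketch of one plausible reading, the snippet below scores a region embedding by the KL divergence between its softmax-over-prompts distribution and the uniform distribution, so that a score near zero means the region favors no class. The function names (`class_probs`, `uniformity_loss`), the temperature value, and the toy embeddings are all illustrative assumptions, not the authors' implementation; in the actual method the quantity being adjusted would be the masked image region, not the embedding directly.

```python
import numpy as np

def class_probs(region_emb, text_embs, temperature=0.01):
    """Softmax over cosine similarities between one region embedding
    and each class prompt embedding (temperature is an assumed value)."""
    region = region_emb / np.linalg.norm(region_emb)
    texts = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = texts @ region / temperature
    logits = logits - logits.max()  # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def uniformity_loss(region_emb, text_embs, temperature=0.01):
    """KL(p || uniform): about 0 when the region is equally similar to
    every class prompt, up to log(K) when it commits to one class."""
    p = class_probs(region_emb, text_embs, temperature)
    k = p.shape[0]
    return float(np.sum(p * np.log(p * k + 1e-12)))

# Toy stand-ins for two class prompt embeddings (hypothetical values).
text_embs = np.array([[1.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0]])

# A background embedding aligned with class 0 carries a bias signal;
# one equidistant from both prompts does not.
biased = uniformity_loss(np.array([1.0, 0.0, 0.0, 0.0]), text_embs)
neutral = uniformity_loss(np.array([1.0, 1.0, 0.0, 0.0]), text_embs)
print(biased, neutral)
```

Under this reading, a test-time procedure would perturb the non-target pixels to reduce such a score while leaving the segmented target region untouched.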
