SafeFix: Targeted Model Repair via Controlled Image Generation

Ouyang Xu; Baoming Zhang; Ruiyu Mao; Yunhui Guo

TL;DR

SafeFix addresses model failures arising from underrepresented semantic subpopulations by coupling failure-attribute diagnosis with targeted, attribute-preserving synthetic generation. It uses a structure-guided diffusion model (ControlNet with Stable Diffusion) to create edited variants of real training instances, and employs large vision–language models (LVLMs) to verify semantic fidelity and label consistency before the model is retrained on the augmented data. Empirical results on CelebA and ImageNet10 across ResNet, ViT, and CLIP backbones show consistent improvements on rare-case slices without degrading overall accuracy, outperforming prior generative-augmentation baselines. The approach demonstrates that closed-loop, diagnosis-driven data synthesis, filtered by LVLMs and anchored in the original distribution, can robustly repair targeted failures, with practical implications for fairness and reliability in vision systems.
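The loop described above (edit real instances toward the failure attribute, keep only candidates that pass verification, then retrain on the union) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `Sample`, `build_repair_set`, `toy_edit`, and `toy_verify` are hypothetical names, the attribute and label strings are made up for the example, and the two toy functions merely stand in for the ControlNet-based edit and the LVLM check.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Sample:
    image_id: str            # placeholder for the actual image
    label: str               # task label, which the edit must preserve
    attribute: str           # semantic attribute of interest
    synthetic: bool = False  # True for generated variants


def build_repair_set(
    train: Sequence[Sample],
    target_attribute: str,
    edit: Callable[[Sample, str, int], Sample],
    verify: Callable[[Sample, str, str], bool],
    variants_per_image: int = 2,
) -> List[Sample]:
    """Return the original data plus verified edited variants."""
    accepted: List[Sample] = []
    for src in train:
        for seed in range(variants_per_image):
            candidate = edit(src, target_attribute, seed)
            # Keep a candidate only if the checker confirms both the
            # source label and the requested attribute.
            if verify(candidate, src.label, target_attribute):
                accepted.append(candidate)
    return list(train) + accepted


def toy_edit(src: Sample, attribute: str, seed: int) -> Sample:
    # Stand-in for a structure-guided diffusion edit (assumption).
    return Sample(f"{src.image_id}-edit{seed}", src.label, attribute, True)


def toy_verify(candidate: Sample, label: str, attribute: str) -> bool:
    # Stand-in for an LVLM check (assumption); rejects seed 1 to mimic filtering.
    consistent = candidate.label == label and candidate.attribute == attribute
    return consistent and not candidate.image_id.endswith("edit1")


train = [
    Sample("img0", "smiling", "no_glasses"),
    Sample("img1", "not_smiling", "no_glasses"),
]
augmented = build_repair_set(train, "glasses", toy_edit, toy_verify)
print(len(augmented), [s.image_id for s in augmented if s.synthetic])
```

Passing the edit and verification steps in as callables keeps the sketch independent of any particular generator or LVLM; in a real pipeline the returned list would feed the retraining step.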

Abstract

Deep learning models for visual recognition often exhibit systematic errors due to underrepresented semantic subpopulations. Although existing debugging frameworks can pinpoint these failures by identifying key failure attributes, repairing the model effectively remains difficult. Current solutions often rely on manually designed prompts to generate synthetic training images, an approach prone to distribution shift and semantic errors. To overcome these challenges, we introduce a model repair module that builds on an interpretable failure attribution pipeline. Our approach uses a conditional text-to-image model to generate semantically faithful and targeted images for failure cases. To preserve the quality and relevance of the generated samples, we further employ a large vision-language model (LVLM) to filter the outputs, enforcing alignment with the original data distribution and maintaining semantic consistency. By retraining vision models with this rare-case-augmented synthetic dataset, we significantly reduce errors associated with rare cases. Our experiments demonstrate that this targeted repair strategy improves model robustness without introducing new bugs. Code is available at https://github.com/oxu2/SafeFix
