SafeFix: Targeted Model Repair via Controlled Image Generation

Ouyang Xu, Baoming Zhang, Ruiyu Mao, Yunhui Guo

TL;DR

SafeFix addresses model failures that arise from underrepresented semantic subpopulations by coupling failure-attribute diagnosis with targeted, attribute-preserving synthetic generation. It uses a structure-guided diffusion model (ControlNet with Stable Diffusion) to create edited variants of real training instances, and employs large vision–language models to verify semantic fidelity and label consistency before retraining the model on the augmented data. Empirical results on CelebA and ImageNet10 across ResNet, ViT, and CLIP backbones show consistent improvements on rare-case slices without degrading overall accuracy, outperforming prior generative-augmentation baselines. The approach demonstrates that closed-loop, diagnosis-driven data synthesis, filtered by LVLMs and anchored in the original distribution, can robustly repair targeted failures, with practical implications for fairness and reliability in vision systems.

Abstract

Deep learning models for visual recognition often exhibit systematic errors due to underrepresented semantic subpopulations. Although existing debugging frameworks can pinpoint these failures by identifying key failure attributes, repairing the model effectively remains difficult. Current solutions often rely on manually designed prompts to generate synthetic training images, an approach prone to distribution shift and semantic errors. To overcome these challenges, we introduce a model repair module that builds on an interpretable failure attribution pipeline. Our approach uses a conditional text-to-image model to generate semantically faithful and targeted images for failure cases. To preserve the quality and relevance of the generated samples, we further employ a large vision-language model (LVLM) to filter the outputs, enforcing alignment with the original data distribution and maintaining semantic consistency. By retraining vision models with this rare-case-augmented synthetic dataset, we significantly reduce errors associated with rare cases. Our experiments demonstrate that this targeted repair strategy improves model robustness without introducing new bugs. Code is available at https://github.com/oxu2/SafeFix
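
The closed loop described in the abstract (diagnose failure attributes, generate edited variants of real images, filter them, augment the training set) can be illustrated with a minimal sketch. This is not the authors' implementation: `Sample`, `find_failure_attributes`, `repair_dataset`, the `toy_*` functions, and the 0.8 accuracy threshold are all hypothetical, and the conditional diffusion editor and the LVLM filter are replaced by trivial stand-ins so the control flow can be run without any model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Sample:
    image: str      # placeholder for image data (here just a file name)
    label: int      # class label, which an edit must preserve
    attribute: str  # semantic attribute value, e.g. "red hair"


def find_failure_attributes(samples: List[Sample],
                            predict: Callable[[Sample], int],
                            max_accuracy: float = 0.8) -> List[str]:
    """Return attributes whose slice accuracy is below max_accuracy (assumed threshold)."""
    correct, total = defaultdict(int), defaultdict(int)
    for s in samples:
        total[s.attribute] += 1
        correct[s.attribute] += int(predict(s) == s.label)
    return sorted(a for a in total if correct[a] / total[a] < max_accuracy)


def repair_dataset(train: List[Sample],
                   failure_attributes: List[str],
                   edit: Callable[[Sample, str], Sample],
                   verify: Callable[[Sample, str, int], bool],
                   per_attribute: int = 2) -> List[Sample]:
    """Add verified, attribute-edited variants of real samples to the training set."""
    augmented = list(train)
    for attr in failure_attributes:
        accepted = 0
        for src in train:
            if accepted >= per_attribute:
                break
            if src.attribute == attr:
                continue  # only edit samples that lack the target attribute
            candidate = edit(src, attr)            # stand-in for conditional generation
            if verify(candidate, attr, src.label):  # stand-in for the LVLM filter
                augmented.append(candidate)
                accepted += 1
    return augmented


# Toy stand-ins (hypothetical): a classifier that always fails on "red hair",
# an "editor" that only relabels the attribute, and a filter that rejects one image.
def toy_predict(s: Sample) -> int:
    return 1 - s.label if s.attribute == "red hair" else s.label


def toy_edit(src: Sample, attr: str) -> Sample:
    return Sample(f"{src.image}+{attr}", src.label, attr)


def toy_verify(candidate: Sample, attr: str, label: int) -> bool:
    ok = candidate.attribute == attr and candidate.label == label
    return ok and not candidate.image.startswith("b.png")


train = [
    Sample("a.png", 1, "brown hair"),
    Sample("b.png", 0, "brown hair"),
    Sample("c.png", 1, "brown hair"),
    Sample("d.png", 1, "red hair"),
]
failures = find_failure_attributes(train, toy_predict)
augmented = repair_dataset(train, failures, toy_edit, toy_verify)
print(failures, [s.image for s in augmented[len(train):]])
```

In a real setting the augmented list would be used to retrain the classifier, and the slice accuracies would be recomputed to check that the targeted failures shrink without new ones appearing.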

Paper Structure

This paper contains 35 sections, 6 equations, 6 figures, 13 tables.

Figures (6)

  • Figure 1: Overview of SafeFix. We propose a targeted model repair pipeline that identifies rare-case failures, generates attribute-specific synthetic images using a conditional diffusion model, filters them via a large vision-language model, and retrains the model to improve accuracy and fix rare-case bugs.
  • Figure 2: Comparison of generated images from different methods with edited attributes red hair, brown skin, and sad emotion. HiBug often produces invalid or imprecise samples due to the lack of conditional generation and semantic filtering. In contrast, SafeFix generates attribute-faithful images that specifically target rare-case bugs.
  • Figure 3: Accuracy comparison of ResNet-18 trained using different augmentation methods. The dashed line represents the average overall accuracy without additional synthetic training data.
  • Figure 4: Test accuracy (%) improvements for red-hair vs. yellow-hair augmentation on CLIP.
  • Figure 5: Comparison between original CelebA images (left) and attribute-edited outputs (right) generated by SafeFix.
  • ...and 1 more figure