Towards Continual Expansion of Data Coverage: Automatic Text-guided Edge-case Synthesis
Kyeongryeol Go
TL;DR
This work tackles dataset bias by automating edge-case data synthesis through caption-based prompting. It introduces a pipeline in which an LLM, preference-tuned with Direct Preference Optimization, rephrases image captions into edge-focused prompts for a text-to-image model, with edge-ness measured via a pre-trained detector and a pseudo-labeler. The approach iteratively augments the training data to expand coverage of challenging scenarios. On FishEye8K, it improves robustness beyond naive augmentation and manually engineered prompts, and the gains transfer across model scales. The results suggest a scalable, data-centric path toward more reliable vision systems.
Abstract
The performance of deep neural networks is strongly influenced by the quality of their training data. However, mitigating dataset bias by manually curating challenging edge cases remains a major bottleneck. To address this, we propose an automated pipeline for text-guided edge-case synthesis. Our approach employs a Large Language Model (LLM), fine-tuned via preference learning, to rephrase image captions into diverse textual prompts that steer a Text-to-Image model toward generating difficult visual scenarios. Evaluated on the FishEye8K object detection benchmark, our method improves robustness over both naive augmentation and manually engineered prompts. This work establishes a scalable framework that shifts data curation from manual effort to automated, targeted synthesis, offering a promising direction for developing more reliable and continuously improving AI systems. Code is available at https://github.com/gokyeongryeol/ATES.
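As a rough illustration of the preference-learning step, the sketch below shows one way rephrased prompts could be turned into (chosen, rejected) pairs for preference tuning. It is not taken from the released code: the function name `build_preference_pairs`, the `edge_score` callable, the `min_gap` threshold, and the rule of pairing the highest-scoring prompt against the lowest-scoring one are all assumptions made for illustration. In the actual pipeline the score would come from generating images with the Text-to-Image model and measuring edge-ness with the pre-trained detector and pseudo-labeler; here it is a placeholder.

```python
from typing import Callable, Dict, List, Tuple


def build_preference_pairs(
    candidates: Dict[str, List[str]],
    edge_score: Callable[[str], float],
    min_gap: float = 0.0,
) -> List[Tuple[str, str, str]]:
    """Return (caption, chosen_prompt, rejected_prompt) triples.

    `candidates` maps each source caption to its rephrased prompts.
    `edge_score` is a hypothetical stand-in for the edge-ness measurement;
    a higher value is assumed to mean a harder generated scene.
    """
    pairs = []
    for caption, prompts in candidates.items():
        if len(prompts) < 2:
            continue  # a preference pair needs at least two candidates
        # Score each prompt once, since real scoring would be expensive.
        scored = [(edge_score(p), p) for p in prompts]
        low = min(scored, key=lambda sp: sp[0])
        high = max(scored, key=lambda sp: sp[0])
        # Skip captions whose candidates are not clearly separated.
        if high[0] - low[0] > min_gap:
            pairs.append((caption, high[1], low[1]))
    return pairs


# Toy usage with hand-written scores in place of a real detector.
toy_candidates = {
    "a car on a road": [
        "a car on a road at noon",
        "a car half hidden by glare at night",
    ],
}
toy_scores = {
    "a car on a road at noon": 0.1,
    "a car half hidden by glare at night": 0.8,
}
toy_pairs = build_preference_pairs(toy_candidates, toy_scores.__getitem__)
```

Triples of this shape are the usual input to a Direct Preference Optimization trainer, with the caption serving as the prompt and the two rephrasings as the preferred and dispreferred responses.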
