Generative Active Learning for Long-tailed Instance Segmentation
Muzhi Zhu, Chengxiang Fan, Hao Chen, Yang Liu, Weian Mao, Xiaogang Xu, Chunhua Shen
TL;DR
This work addresses how to use unlimited but noisy generated data to improve long-tailed instance segmentation. It introduces BSGAL, a batched streaming generative active learning method that estimates each batch's contribution online using a gradient-based signal and a momentum gradient cache, so that generated data can be filtered and used effectively. Empirical results on CIFAR-10 (offline) and LVIS (online) show that selectively using generated data yields gains over unfiltered and CLIP-filtered baselines, with the most pronounced improvements on rare categories. The approach provides a scalable framework that preserves data diversity and connects generated data with complex perception tasks.
Abstract
Recently, large-scale language-image generative models have gained widespread attention, and many works have used data generated by these models to further enhance the performance of perception tasks. However, not all generated data has a positive impact on downstream models, and these methods do not thoroughly explore how to better select and utilize generated data. Moreover, there is still a lack of research on active learning for generated data. In this paper, we explore how to perform active learning specifically for generated data in the long-tailed instance segmentation task. We propose BSGAL, a new algorithm that estimates the contribution of generated data online based on a gradient cache. BSGAL can handle unlimited generated data and complex downstream segmentation tasks effectively. Experiments show that BSGAL outperforms the baseline approach and effectively improves the performance of long-tailed segmentation. Our code can be found at https://github.com/aim-uofa/DiverGen.
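
To make the idea of gradient-based contribution estimation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a first-order approximation: a generated batch is taken to help if its gradient has a positive inner product with a momentum-averaged gradient of real data. The names `MomentumGradientCache`, `estimate_contribution`, and `filter_stream`, the flat gradient vectors, and the accept-if-positive rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


class MomentumGradientCache:
    """Exponential moving average of real-data gradients (assumed design)."""

    def __init__(self, dim: int, momentum: float = 0.9) -> None:
        self.momentum = momentum
        self.value: Vector = [0.0] * dim

    def update(self, grad: Sequence[float]) -> None:
        # Blend the newest real-data gradient into the running average.
        m = self.momentum
        self.value = [m * c + (1.0 - m) * g for c, g in zip(self.value, grad)]


def estimate_contribution(
    gen_grad: Sequence[float], cache: MomentumGradientCache, lr: float
) -> float:
    """First-order estimate of the real-data loss decrease after one SGD
    step on the generated batch: lr * <cached real grad, generated grad>."""
    return lr * dot(cache.value, gen_grad)


def filter_stream(
    real_grads: Sequence[Sequence[float]],
    gen_grads: Sequence[Sequence[float]],
    lr: float = 0.1,
    momentum: float = 0.9,
) -> List[bool]:
    """Process paired batches in a stream and decide, batch by batch,
    whether the generated batch is accepted (estimated contribution > 0)."""
    cache = MomentumGradientCache(dim=len(real_grads[0]), momentum=momentum)
    decisions: List[bool] = []
    for real_grad, gen_grad in zip(real_grads, gen_grads):
        cache.update(real_grad)
        decisions.append(estimate_contribution(gen_grad, cache, lr) > 0.0)
    return decisions


if __name__ == "__main__":
    # Toy gradients: the first generated batch aligns with the real-data
    # direction, the second points against it.
    real = [[1.0, 0.0], [1.0, 0.0]]
    generated = [[2.0, 1.0], [-1.0, 0.5]]
    print(filter_stream(real, generated))
```

In this toy setting the decision is made per batch as data arrives, so no generated sample has to be stored or scored in advance. In a real segmentation model the gradient vectors would come from backpropagation, and the acceptance rule may differ from the simple sign test assumed here.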
