BatStyler: Advancing Multi-category Style Generation for Source-free Domain Generalization
Xiusheng Xu, Lei Qi, Jingyang Zhou, Xin Geng
TL;DR
Source-Free Domain Generalization (SFDG) requires generalization to unseen domains without source images. BatStyler introduces two modules, Coarse Semantic Generation (CSG) and Uniform Style Generation (USG), within a CLIP-based framework to enlarge style diversity in multi-category tasks while respecting semantic structure. CSG reduces the effective semantic constraint by extracting $C$ coarse-grained semantics per cluster, and USG provides $K$ uniformly distributed style templates initialized via neural collapse, enabling parallel training with a fixed classifier. Experiments show BatStyler matches or exceeds state-of-the-art on multi-category benchmarks and remains competitive on less-category datasets, with improved efficiency and more diverse synthesized data. The approach hinges on the CLIP joint space, suggesting future work on robustness to vision-language misalignment.
Abstract
Source-Free Domain Generalization (SFDG) aims to develop a model that performs well on unseen domains without relying on any source domains. However, its implementation remains constrained by the unavailability of training data. Research on SFDG focuses on knowledge transfer from multi-modal models and on style synthesis in the joint space of multiple modalities, thus eliminating the dependency on source domain images. However, existing methods mainly target multi-domain, less-category configurations, and their performance on multi-domain, multi-category configurations is relatively poor. In addition, the efficiency of style synthesis deteriorates in multi-category scenarios. How to efficiently synthesize sufficiently diverse data and apply it to multi-category configurations is a direction of greater practical value. In this paper, we propose BatStyler, a method that improves style synthesis in multi-category scenarios. BatStyler consists of two modules: Coarse Semantic Generation and Uniform Style Generation. The Coarse Semantic Generation module extracts coarse-grained semantics to prevent the space available for learning style diversity from being compressed in multi-category configurations, while the Uniform Style Generation module provides a template of styles that are uniformly distributed in space and enables parallel training. Extensive experiments demonstrate that our method achieves comparable performance on less-category datasets while surpassing state-of-the-art methods on multi-category datasets.
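To make "uniformly distributed in space" concrete, here is a minimal sketch of one standard construction from the neural collapse literature, the simplex equiangular tight frame (ETF). It is an assumption that the style template takes exactly this form; the summary above does not spell out the construction. The function name `simplex_etf` is hypothetical, and for simplicity the vectors live in $\mathbb{R}^K$ rather than in the CLIP embedding space.

```python
import math


def simplex_etf(k):
    """Return k unit vectors in R^k with pairwise cosine similarity -1/(k-1).

    Hypothetical helper: an illustrative simplex ETF, not the authors' code.
    """
    if k < 2:
        raise ValueError("need at least two vectors")
    scale = math.sqrt(k / (k - 1))
    # Row i is the i-th standard basis vector with the mean (1/k) subtracted
    # from every coordinate, then rescaled so that the row has unit norm.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


K = 4  # number of style templates (illustrative value)
templates = simplex_etf(K)

# Every template has unit length, and every pair is separated by the same angle.
norms = [math.sqrt(dot(t, t)) for t in templates]
cosines = [
    dot(templates[i], templates[j])
    for i in range(K)
    for j in range(i + 1, K)
]
```

Because the vectors are fixed in advance and mutually equiangular, no template has to be optimized against the others, which is consistent with the parallel training described above. Placing such a frame in a higher-dimensional embedding space would additionally require multiplying by a matrix with orthonormal columns, which this sketch omits.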
