SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning
Seokju Yun, Seunghye Chae, Dongheon Lee, Youngmin Ro
TL;DR
Domain generalization (DG) with vision foundation models (VFMs) remains challenging under domain shift. SoMA introduces a spectral, parameter-efficient fine-tuning approach that uses singular value decomposition (SVD) to identify and preserve generalizable components, tuning only the minor singular components through a low-rank adapter. It pairs this with freezing of early blocks and an annealing weight decay schedule. The method yields state-of-the-art results on DG for semantic segmentation (DGSS) and object detection (DGOD), with no inference overhead and strong data/model scalability, and extends to subject personalization in generative settings. Collectively, SoMA demonstrates how the spectral structure and block-level dynamics of VFMs can be exploited to achieve robust, efficient domain-generalizable representation learning across diverse tasks.
Abstract
Domain generalization (DG) aims to adapt a model using one or multiple source domains to ensure robust performance in unseen target domains. Recently, Parameter-Efficient Fine-Tuning (PEFT) of foundation models has shown promising results in the context of the DG problem. Nevertheless, existing PEFT methods still struggle to strike a balance between preserving generalizable components of the pre-trained model and learning task-specific features. To gain insights into the distribution of generalizable components, we begin by analyzing the pre-trained weights through the lens of singular value decomposition. Building on these insights, we introduce Singular Value Decomposed Minor Components Adaptation (SoMA), an approach that selectively tunes minor singular components while keeping the residual parts frozen. SoMA effectively retains the generalization ability of the pre-trained model while efficiently acquiring task-specific skills. Moreover, we freeze domain-generalizable blocks and employ an annealing weight decay strategy, thereby achieving an optimal balance in the delicate trade-off between generalizability and discriminability. SoMA attains state-of-the-art results on multiple benchmarks that span both domain generalized semantic segmentation and domain generalized object detection. In addition, our methods introduce no additional inference overhead or regularization loss, maintain compatibility with any backbone or head, and are designed to be versatile, allowing easy integration into a wide range of tasks.
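To make the idea of tuning only the minor singular components concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the adapter is initialized from the `rank` smallest singular components of a pre-trained weight matrix, with the singular values split evenly between the two factors. The abstract does not specify this initialization, and the names `soma_style_init` and `merge` are hypothetical.

```python
import numpy as np


def soma_style_init(weight, rank):
    """Split `weight` into a frozen residual and a low-rank adapter (b, a).

    Assumption: the adapter starts from the `rank` smallest singular
    components, so the residual keeps the principal components untouched.
    """
    # np.linalg.svd returns singular values in descending order,
    # so the last `rank` entries are the minor components.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(s[-rank:])
    b = u[:, -rank:] * sqrt_s            # shape (out_dim, rank), trainable
    a = sqrt_s[:, None] * vt[-rank:, :]  # shape (rank, in_dim), trainable
    w_res = weight - b @ a               # frozen residual
    return w_res, b, a


def merge(w_res, b, a):
    """Fold the adapter back into a single dense weight for inference."""
    return w_res + b @ a


rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 6))  # stand-in for a pre-trained weight
w_res, b, a = soma_style_init(weight, rank=2)

# At initialization the merged weight reproduces the pre-trained weight.
print(np.allclose(merge(w_res, b, a), weight))
```

Because the adapter can be folded back into one matrix with `merge`, a layer tuned this way has the same shape and cost at inference as the original, which is consistent with the claim of no additional inference overhead.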
