Learning Conditional Invariances through Non-Commutativity
Abhra Chaudhuri, Serban Georgescu, Anjan Dutta
TL;DR
This work tackles domain adaptation under asymmetry, where the target domain holds semantically relevant information absent from the source. It proposes Non-Commutative Invariance (NCI), a conditional invariance framework that directs the invariance operator toward the target domain, preserving target-relevant features while using source-domain samples as augmentations. The authors prove that NCI yields a stricter bound on the target risk by driving the $\mathcal{H}_\eta$-divergence between domains to zero, and use Haussler-type bounds to show that source samples can help meet the sample complexity of learning the target-optimal encoder $\Phi^*_{\tau}$. Empirically, NCI achieves state-of-the-art results on PACS, Office-Home, and DomainNet and approaches oracle performance in multimodal segmentation, with larger gains when the source domains provide complementary semantic information. Overall, NCI exploits domain asymmetry and cross-domain semantics to offer a principled, sample-efficient path to target-domain optimality. $I(\cdot;\cdot)$ and the $\mathcal{H}_\eta$-divergence are the central tools in formalizing the bounds and the learning dynamics, and $\varphi^*_{\tau}$ and $\Phi^*_{\tau}$ denote target-optimal encoders under different training regimes.
Abstract
Invariance learning algorithms that conditionally filter out domain-specific random variables as distractors do so based only on the data semantics, and not on the target domain under evaluation. We show that a provably optimal and sample-efficient way of learning conditional invariances is to relax the invariance criterion so that it is non-commutatively directed towards the target domain. Under domain asymmetry, i.e., when the target domain contains semantically relevant information absent in the source, the risk of the encoder $\varphi^*$ that is optimal on average across domains is strictly lower-bounded by the risk of the target-specific optimal encoder $\Phi^*_\tau$. We prove that non-commutativity steers the optimization towards $\Phi^*_\tau$ instead of $\varphi^*$, which brings the $\mathcal{H}$-divergence between domains down to zero and leads to a stricter bound on the target risk. Both our theory and experiments demonstrate that non-commutative invariance (NCI) can leverage source domain samples to meet the sample complexity needs of learning $\Phi^*_\tau$, surpassing SOTA invariance learning algorithms for domain adaptation, at times by over $2\%$, and approaching the performance of an oracle. Implementation is available at https://github.com/abhrac/nci.
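
The asymmetry claim and the role of the divergence can be written compactly. In the sketch below, $R_\tau(\cdot)$ and $R_\sigma(\cdot)$ are assumed notation for the risk on the target and source domains (the abstract does not fix these symbols), $\mathcal{D}_\sigma$ and $\mathcal{D}_\tau$ are assumed names for the source and target distributions, and $\lambda$ is the risk of the best joint hypothesis. The second inequality is only the generic shape of a classical divergence-based domain adaptation bound, not necessarily the exact statement proved in the paper.

$$
R_\tau(\varphi^*) \;>\; R_\tau(\Phi^*_\tau),
\qquad
R_\tau(h) \;\le\; R_\sigma(h) + d_{\mathcal{H}}(\mathcal{D}_\sigma, \mathcal{D}_\tau) + \lambda .
$$

Under this reading, driving $d_{\mathcal{H}}(\mathcal{D}_\sigma, \mathcal{D}_\tau)$ to zero removes the divergence term from the right-hand side, which is one way to understand why the resulting bound on the target risk is stricter.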
