Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation

Jing Wang, Wonho Bae, Jiahong Chen, Wenxu Wang, Junhyug Noh

TL;DR

This work tackles privacy-preserving domain adaptation by reusing latent diffusion models to transfer discriminative knowledge without exposing source data. It proposes Discriminative Vicinity Diffusion (DVD), which encodes source labels into latent vicinities around source embeddings and trains a drift function to move noisy latent samples toward label-consistent source manifolds. During target adaptation, a Gaussian prior over target latent vicinities guides a frozen diffusion module to generate source-like cues, and the target encoder is aligned to these cues with a contrastive objective; additional SiLGA blending stabilizes the alignment. DVD delivers state-of-the-art results on SFDA benchmarks, improves in-domain source accuracy through latent augmentation, and enhances domain generalization, all without access to raw source data. The approach reframes latent diffusion as a practical mechanism for explicit cross-domain knowledge transfer, offering a scalable, efficient, and interpretable privacy-preserving alternative to traditional data sharing.

Abstract

Recent work on latent diffusion models (LDMs) has focused almost exclusively on generative tasks, leaving their potential for discriminative transfer largely unexplored. We introduce Discriminative Vicinity Diffusion (DVD), a novel LDM-based framework for a more practical variant of source-free domain adaptation (SFDA): the source provider may share not only a pre-trained classifier but also an auxiliary latent diffusion module, trained once on the source data and never exposing raw source samples. DVD encodes each source feature's label information into its latent vicinity by fitting a Gaussian prior over its k-nearest neighbors and training the diffusion network to drift noisy samples back to label-consistent representations. During adaptation, we sample from each target feature's latent vicinity, apply the frozen diffusion module to generate source-like cues, and use a simple InfoNCE loss to align the target encoder to these cues, explicitly transferring decision boundaries without source access. Across standard SFDA benchmarks, DVD outperforms state-of-the-art methods. We further show that the same latent diffusion module enhances the source classifier's accuracy on in-domain data and boosts performance in supervised classification and domain generalization experiments. DVD thus reinterprets LDMs as practical, privacy-preserving bridges for explicit knowledge transfer, addressing a core challenge in source-free domain adaptation that prior methods have yet to solve.
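The adaptation step described above (sample from each target feature's latent vicinity, generate cues, align with InfoNCE) can be illustrated with a rough NumPy sketch. This is not the authors' implementation. The function names, the diagonal covariance, the exclusion of the anchor point from its own neighborhood, the temperature value, and the identity stand-in for the frozen diffusion module are all assumptions made for illustration.

```python
import numpy as np


def knn_vicinity_prior(features, k=3):
    """Fit a diagonal Gaussian to the k nearest neighbors of each feature.

    Assumptions of this sketch (not taken from the paper): Euclidean
    distance, a diagonal covariance, and excluding the anchor itself.
    Returns (means, stds), each of shape (n, d).
    """
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]     # (n, k) neighbor indices
    neighbors = features[idx]               # (n, k, d)
    return neighbors.mean(axis=1), neighbors.std(axis=1) + 1e-6


def sample_vicinity(means, stds, rng):
    """Draw one latent sample per feature from its vicinity Gaussian."""
    return means + stds * rng.standard_normal(means.shape)


def info_nce(queries, keys, temperature=0.1):
    """InfoNCE loss where keys[i] is the positive for queries[i].

    All other rows of `keys` act as negatives for that query.
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
target_feats = rng.standard_normal((8, 4))   # toy stand-in for encoder outputs
means, stds = knn_vicinity_prior(target_feats, k=3)
noisy = sample_vicinity(means, stds, rng)
cues = noisy   # hypothetical placeholder: a real pipeline would apply the frozen diffusion module here
loss = info_nce(target_feats, cues)
```

In an actual training loop the loss would be backpropagated through the target encoder only, with the diffusion module and classifier kept frozen, as the abstract describes; an autodiff framework would replace NumPy for that purpose.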

Paper Structure

This paper contains 67 sections, 15 equations, 6 figures, 22 tables, 2 algorithms.

Figures (6)

  • Figure 1: Framework overview. In source pre-training, our DVD aligns source features for consistent predictions within their vicinities. In target adaptation, the target latent vicinity guides DVD to generate features from nearby source latent vicinities based on their similarities. The parameterization of diffusion priors enables this guidance across domains. After these training phases, inference requires only the classifier $F \circ G_t$, without invoking $D$.
  • Figure 2: t-SNE of target features on VisDA-C 2017.
  • Figure 3: (Best viewed in color.) Hyperparameter sensitivity analysis. The results demonstrate the robustness of our DVD across a range of different hyperparameter settings.
  • Figure 4: Graphical Motivation. A binary classification illustration of DVD-based knowledge transfer without source data (blue: class 0, green: class 1). (1) DVD Training: A prior density is defined using the $k$-NN latent vicinity of each source sample, and a pre-trained source classifier "diffuses" ground-truth labels within that neighborhood. (2) DVD Sampling for Target Adaptation: With DVD and the classifier frozen, only the target encoder is updated via contrastive learning. We apply Source-Informed Latent Geometry Aggregation (SiLGA) to blend DVD-generated features with local target neighbors, thereby aligning target samples to source decision boundaries.
  • Figure 5: The directed graphical model illustrates how DVD enables explicit knowledge transfer by leveraging latent vicinity similarities between the two domains. Solid lines denote the direct causal relations between variables, which include the encoder, the latent diffusion, and the classifier. The dashed lines denote the stochastic sampling from a prior parameterized by the latent vicinity.
  • ...and 1 more figure