
DP-DCAN: Differentially Private Deep Contrastive Autoencoder Network for Single-cell Clustering

Huifa Li, Jie Fu, Zhili Chen, Xiaomin Yang, Haitao Liu, Xinpeng Ling

TL;DR

DP-DCAN addresses privacy-preserving single-cell clustering by applying differential privacy only to the encoder gradients (partial network perturbation) and by using a two-stage contrastive learning scheme to improve latent representations. The model uses a ZINB-based reconstruction loss and joint embedding-clustering objectives, and achieves strong clustering performance under DP on six real scRNA-seq datasets. A Rényi differential privacy analysis and empirical evaluations show that DP-DCAN can outperform full-network perturbation baselines while requiring smaller privacy budgets, which makes it practical for privacy-sensitive genomics analyses. The approach offers a concrete path for deploying private deep learning models in single-cell genomics while limiting the loss of utility in clustering and downstream interpretation.

Abstract

Single-cell RNA sequencing (scRNA-seq) is important for the transcriptomic analysis of gene expression. Recently, deep learning has facilitated the analysis of high-dimensional single-cell data. Unfortunately, deep learning models may leak sensitive information about users. As a result, Differential Privacy (DP) is increasingly used to protect privacy. However, existing DP methods usually perturb the whole neural network to achieve differential privacy, which results in large performance overheads. To address this challenge, we exploit a distinctive property of the autoencoder, namely that it outputs only the dimension-reduced vector in the middle of the network, and design a Differentially Private Deep Contrastive Autoencoder Network (DP-DCAN) that uses partial network perturbation for single-cell clustering. Because noise is added to only part of the network, the performance improvement is twofold: one part of the network is trained with less noise thanks to a larger privacy budget, and the other part is trained without any noise. Experimental results on six datasets verify that DP-DCAN is superior to the traditional DP scheme with whole-network perturbation. Moreover, DP-DCAN demonstrates strong robustness to adversarial attacks.
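The partial network perturbation idea can be illustrated with a minimal sketch. It assumes, following the TL;DR, that the encoder is the perturbed part and the decoder is trained without noise. The clip-then-add-Gaussian-noise step is the generic DP-SGD recipe, and every function and parameter name here (`partial_perturbation_step`, `clip_norm`, `noise_multiplier`) is hypothetical rather than taken from the paper.

```python
import math
import random


def clip_by_l2_norm(grad, clip_norm):
    """Scale a gradient vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Generic DP-SGD aggregate: clip per example, sum, add Gaussian noise, average."""
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip_by_l2_norm(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    std = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    return [(s + rng.gauss(0.0, std)) / batch_size for s in summed]


def plain_mean_gradient(per_example_grads):
    """Ordinary mini-batch average, with no clipping and no noise."""
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[i] for g in per_example_grads) / batch_size for i in range(dim)]


def partial_perturbation_step(enc_params, dec_params, enc_grads, dec_grads,
                              lr, clip_norm, noise_multiplier, rng):
    """One SGD step: noisy update for the encoder, noise-free update for the decoder.

    Assumption: the encoder is the perturbed half of the autoencoder.
    """
    enc_update = private_mean_gradient(enc_grads, clip_norm, noise_multiplier, rng)
    dec_update = plain_mean_gradient(dec_grads)
    new_enc = [p - lr * g for p, g in zip(enc_params, enc_update)]
    new_dec = [p - lr * g for p, g in zip(dec_params, dec_update)]
    return new_enc, new_dec


if __name__ == "__main__":
    rng = random.Random(0)
    enc, dec = partial_perturbation_step(
        enc_params=[0.0, 0.0], dec_params=[1.0],
        enc_grads=[[3.0, 4.0], [0.0, 0.0]], dec_grads=[[2.0], [4.0]],
        lr=0.5, clip_norm=1.0, noise_multiplier=2.0, rng=rng,
    )
    print(enc, dec)
```

Only the encoder update consumes privacy budget in this sketch, which is why the privacy accounting would cover the encoder alone.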

Paper Structure (25 sections, 2 theorems, 14 equations, 5 figures, 5 tables, 2 algorithms)


Key Result

Theorem

(Privacy loss of DP-DCAN). The privacy loss of DP-DCAN satisfies: where $0<\delta<1$, and $R_1$ and $R_2$ are the RDP of algorithm DPAN, computed by the theorem on the RDP of DP-SGD.
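For orientation, the sketch below shows how RDP values are commonly turned into an $(\epsilon, \delta)$ guarantee. It is not the bound stated in the theorem. It uses the classical conversion $\epsilon = R + \ln(1/\delta)/(\alpha-1)$ together with additive composition of RDP at a fixed order $\alpha$. Treating $R_1$ and $R_2$ as the RDP of two training phases evaluated at the same order is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def compose_rdp(*rdp_values):
    """RDP values at the same order alpha add up under sequential composition."""
    return sum(rdp_values)


def rdp_to_epsilon(rdp, alpha, delta):
    """Classical RDP-to-DP conversion: (alpha, rdp)-RDP implies (epsilon, delta)-DP."""
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return rdp + math.log(1.0 / delta) / (alpha - 1.0)


if __name__ == "__main__":
    # Hypothetical RDP values for two training phases at order alpha = 11.
    r1, r2 = 0.5, 0.5
    print(rdp_to_epsilon(compose_rdp(r1, r2), alpha=11.0, delta=1e-5))
```

In practice the conversion is evaluated over a range of orders $\alpha$ and the smallest resulting $\epsilon$ is reported.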

Figures (5)

  • Figure 1: Privacy threat model of scRNA-seq data deep learning.
  • Figure 2: The overview of DP-DCAN.
  • Figure 3: Comparison of clustering results with 2D visualization by UMAP on the Muraro dataset.
  • Figure 4: Clustering performance of DP-DCAN and DPE-DCAN when $\sigma=2.00$.
  • Figure 5: Clustering performance of DP-DCAN and DPE-DCAN when $\epsilon=8.0$.

Theorems & Definitions (7)

  • Definition
  • Theorem
  • Definition
  • Definition
  • Theorem
  • Definition
  • Definition