
DEDUCE: Multi-head attention decoupled contrastive learning to discover cancer subtypes based on multi-omics data

Liangrui Pan, Xiang Wang, Qingchun Liang, Jiandong Shang, Wenjuan Liu, Liwen Xu, Shaoliang Peng

TL;DR

The proposed DEDUCE model learns features from multi-omics data through SMAE, and the subtype decoupled contrastive learning consistently optimizes the model for clustering and identifying cancer subtypes.

Abstract

Background and Objective: Given the high heterogeneity and clinical diversity of cancer, substantial variations exist in multi-omics data and clinical features across different cancer subtypes. Methods: We propose a model, named DEDUCE, based on symmetric multi-head attention encoders (SMAE), for unsupervised contrastive learning on multi-omics cancer data, with the aim of identifying and characterizing cancer subtypes. The model adopts an unsupervised SMAE that extracts contextual features and long-range dependencies from multi-omics data, thereby mitigating the impact of noise. Importantly, DEDUCE introduces a subtype decoupled contrastive learning method based on a multi-head attention mechanism to simultaneously learn features from multi-omics data and perform clustering for identifying cancer subtypes. Subtypes are clustered by calculating the similarity between samples in both the feature space and the sample space of multi-omics data. The fundamental concept involves decoupling various attributes of multi-omics data features and learning them as contrasting terms. A contrastive loss function is constructed to quantify the disparity between positive and negative examples, and the model minimizes this loss, thereby promoting the acquisition of enhanced feature representations. Results: In extensive experiments on simulated multi-omics datasets, single-cell multi-omics datasets, and cancer multi-omics datasets, the DEDUCE model outperforms 10 deep learning models and state-of-the-art methods, and ablation experiments demonstrate the effectiveness of each module in the DEDUCE model. Finally, we applied the DEDUCE model to identify six cancer subtypes of AML.
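
The abstract describes a contrastive loss over positive and negative examples but does not give its form. As a rough illustration only, the sketch below shows one common way a decoupled contrastive loss is written, in which the positive pair is left out of the denominator. The function names, the use of cosine similarity, the temperature value, and the choice of negatives are all assumptions made for this example and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def decoupled_contrastive_loss(view_a, view_b, temperature=0.5):
    """Illustrative decoupled contrastive loss (hypothetical helper).

    view_a[i] and view_b[i] are two embeddings of the same sample and form
    the positive pair. Every other sample j != i, in either view, is a
    negative. The positive term is NOT included in the log-sum, which is
    the "decoupling" assumed here. Requires at least two samples.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        positive = cosine(view_a[i], view_b[i]) / temperature
        negatives = []
        for j in range(n):
            if j == i:
                continue
            negatives.append(math.exp(cosine(view_a[i], view_a[j]) / temperature))
            negatives.append(math.exp(cosine(view_a[i], view_b[j]) / temperature))
        # Pull the positive pair together, push the negatives apart.
        total += -positive + math.log(sum(negatives))
    return total / n

# Two samples whose views agree, versus the same samples with views swapped.
aligned = decoupled_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = decoupled_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Under these assumptions the loss is lower when the two views of each sample agree, which is the behavior the abstract attributes to minimizing the contrastive loss.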

Paper Structure

This paper contains 19 sections, 11 equations, 9 figures, 1 table, 1 algorithm.

Figures (9)

  • Figure 1: Flowchart of DEDUCE model for unsupervised subtype decoupled contrastive learning based on SMAE.
  • Figure 2: C-index, silhouette score, and Davies-Bouldin score of eleven unsupervised methods on simulated datasets. SS and RS represent two conditions: all clusters have the same size (SS), and clusters have variable random sizes (RS).
  • Figure 3: C-index, silhouette score, and Davies-Bouldin score of eleven unsupervised methods on single-cell multi-omics datasets. Clustering analysis was performed on the single-cell dataset, and three internal indicators (C-index, silhouette score, and Davies-Bouldin score) were calculated. The number of clusters was set to 3, and the k-means clustering algorithm was run over 1000 times.
  • Figure 4: C-index (a), silhouette score (b), and Davies-Bouldin score (c) of thirteen unsupervised methods on the cancer benchmark datasets used in the clustering task. Red represents the DEDUCE model.
  • Figure 5: Radar plot of C-index (a), silhouette score (b), and Davies-Bouldin score (c) obtained from ablation experiments in the cancer benchmark dataset. Red represents the SMAE.
  • ...and 4 more figures
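
The figure captions report two internal clustering indicators, the silhouette score and the Davies-Bouldin score. For readers unfamiliar with them, the sketch below computes both from their standard definitions using Euclidean distance. It is a minimal pure-Python illustration, not the evaluation code used in the paper; the helper names and the toy data are assumptions made for this example.

```python
import math

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def group_by_label(labels):
    """Map each cluster label to the list of point indices it contains."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    return clusters

def silhouette_score(points, labels):
    """Mean silhouette over all points (higher is better, range -1 to 1)."""
    clusters = group_by_label(labels)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def davies_bouldin_score(points, labels):
    """Davies-Bouldin index (lower is better); assumes distinct centroids."""
    clusters = group_by_label(labels)
    dims = len(points[0])
    centroids = {
        lab: [sum(points[j][d] for j in members) / len(members) for d in range(dims)]
        for lab, members in clusters.items()
    }
    # Scatter: mean distance from cluster members to their centroid
    scatter = {
        lab: sum(dist(points[j], centroids[lab]) for j in members) / len(members)
        for lab, members in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[lab] + scatter[other]) / dist(centroids[lab], centroids[other])
            for other in clusters if other != lab
        )
        for lab in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)

# Toy data: two well-separated pairs of points.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good_labels = [0, 0, 1, 1]   # matches the visible structure
bad_labels = [0, 1, 0, 1]    # mixes the two groups
print(silhouette_score(toy_points, good_labels) > silhouette_score(toy_points, bad_labels))
print(davies_bouldin_score(toy_points, good_labels) < davies_bouldin_score(toy_points, bad_labels))
```

A good clustering gives a higher silhouette score and a lower Davies-Bouldin score, so the two indicators move in opposite directions when comparing methods.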