scFusionTTT: Single-cell transcriptomics and proteomics fusion with Test-Time Training layers

Dian Meng, Bohao Xing, Xinlei Huang, Yanran Liu, Yijun Zhou, Yongjun Xiao, Zitong Yu, Xubin Zheng

TL;DR

This paper proposes scFusionTTT, a novel method for Single-Cell multimodal omics Fusion with a TTT-based masked autoencoder. It combines the order information of genes and proteins in the human genome with the TTT layer, fuses multimodal omics, and enhances unimodal omics analysis.

Abstract

Single-cell multi-omics (scMulti-omics) refers to paired multimodal data, such as Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), in which the regulation of each cell is measured across different modalities, i.e., genes and proteins. scMulti-omics can reveal heterogeneity within tumors and help characterize the distinct genetic properties of diverse cell types, which is crucial for targeted therapy. Currently, attention-based deep learning methods in bioinformatics face two challenges. The first is the vast number of genes in a single cell: traditional attention-based modules struggle to leverage all gene information effectively because of their limited capacity for long-context learning and their high computational complexity. The second is that genes in the human genome are ordered and influence each other's expression, yet most existing methods ignore this sequential information. The recently introduced Test-Time Training (TTT) layer is a novel sequence modeling approach that is particularly suitable for long contexts such as genomics data, because the TTT layer has linear complexity in sequence length and is better suited to data with sequential relationships. In this paper, we propose scFusionTTT, a novel method for Single-Cell multimodal omics Fusion with a TTT-based masked autoencoder. Of note, we combine the order information of genes and proteins in the human genome with the TTT layer, fuse multimodal omics, and enhance unimodal omics analysis. Finally, the model employs a three-stage training strategy, which yields the best performance on most metrics across four multimodal omics datasets and four unimodal omics datasets. The dataset and code will be available at https://github.com/DM0815/scFusionTTT.
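
To make the linear-complexity claim concrete, the following is a minimal sketch of a TTT-style layer with a linear inner model, written with NumPy. It is not the scFusionTTT implementation: the function name `ttt_linear_scan`, the projection matrices `theta_k`, `theta_v`, `theta_q`, and the learning rate `lr` are illustrative assumptions, and mini-batching, normalization, and learned initial states are omitted. The hidden state is a weight matrix `W` that takes one gradient step per token on a self-supervised reconstruction loss, so the cost grows linearly with the number of tokens (for example, genes ordered by genomic position).

```python
import numpy as np


def ttt_linear_scan(tokens, theta_k, theta_v, theta_q, lr=0.1):
    """Simplified TTT-style scan (assumed form, not the authors' code).

    tokens:  array of shape (n, d_in), one row per token in sequence order.
    theta_*: projection matrices of shape (d, d_in) for the training view,
             the reconstruction target, and the output view.
    Returns the outputs (n, d) and the per-token losses measured before
    each update.
    """
    d = theta_k.shape[0]
    W = np.zeros((d, d))  # hidden state: weights of the inner linear model
    outputs, losses = [], []
    for x in tokens:  # single pass: O(n) in sequence length
        k, v, q = theta_k @ x, theta_v @ x, theta_q @ x
        err = W @ k - v  # residual of the inner model on this token
        losses.append(float(err @ err))
        # Gradient of ||W k - v||^2 with respect to W is 2 * err k^T.
        W = W - lr * 2.0 * np.outer(err, k)
        outputs.append(W @ q)  # output uses the freshly updated state
    return np.stack(outputs), np.array(losses)


# Toy usage: random tokens and random projections (hypothetical values).
rng = np.random.default_rng(0)
demo_tokens = rng.normal(size=(8, 4))
proj = [rng.normal(size=(4, 4)) * 0.3 for _ in range(3)]
demo_out, demo_losses = ttt_linear_scan(demo_tokens, *proj, lr=0.05)
print(demo_out.shape, demo_losses.round(3))
```

The state `W` has a fixed size regardless of sequence length, which is what distinguishes this kind of layer from self-attention, whose cost grows quadratically with the number of tokens.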

Paper Structure

This paper contains 22 sections, 19 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: (A) Adding more modalities can aid cellular analysis by providing separate information from various omics. (B) Comparison of complexity and computation between attention and the TTT layer when conducting single-cell multi-omics fusion.
  • Figure 2: Overview of scFusionTTT. (A) The overall model consists of three stages to learn the latent representations of cells of multimodal omics and unimodal omics. (B) The input of scFusionTTT consists of expression embedding and symbol embedding.
  • Figure 3: (A) TTT Block. (B) TTT Layer with CITE-seq. (C) Pipeline of hidden state update.
  • Figure 4: (A) Comparison of UMAP visualization on PBMC10K CITE-seq dataset across different methods. (B) Comparison of UMAP visualization on CBMC transcriptomics dataset across different methods.
  • Figure 5: Performance of different fusion modules, and of the model without a fusion mechanism, evaluated by ARI, NMI, and FMI. ARI: adjusted Rand index, NMI: normalized mutual information, FMI: Fowlkes-Mallows index.
  • ...and 1 more figure
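
The three clustering metrics named in the Figure 5 caption can be computed from a contingency table of true and predicted labels. The sketch below is a plain-Python illustration using the standard definitions, with NMI normalized by the arithmetic mean of the two entropies (an assumption; the page does not say which normalization the paper uses). The function name `clustering_scores` and the toy labels are hypothetical and are not taken from the paper's code or data.

```python
from collections import Counter
from math import log, sqrt


def _pairs(n):
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def clustering_scores(labels_true, labels_pred):
    """Return (ARI, NMI, FMI) for two flat labelings of the same cells."""
    n = len(labels_true)
    if n != len(labels_pred) or n < 2:
        raise ValueError("labelings must have equal length of at least 2")
    joint = Counter(zip(labels_true, labels_pred))  # contingency table
    rows, cols = Counter(labels_true), Counter(labels_pred)

    # Pair counts: co-clustered in both labelings, in truth, in prediction.
    same_both = sum(_pairs(c) for c in joint.values())
    same_true = sum(_pairs(c) for c in rows.values())
    same_pred = sum(_pairs(c) for c in cols.values())

    # Adjusted Rand index: pair agreement corrected for chance.
    expected = same_true * same_pred / _pairs(n)
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        ari = 1.0  # degenerate case, e.g. a single cluster in both labelings
    else:
        ari = (same_both - expected) / (max_index - expected)

    # Fowlkes-Mallows index: geometric mean of pairwise precision and recall.
    fmi = same_both / sqrt(same_true * same_pred) if same_true and same_pred else 0.0

    # Normalized mutual information (arithmetic-mean normalization).
    mi = sum(c / n * log(n * c / (rows[a] * cols[b])) for (a, b), c in joint.items())
    h_true = -sum(c / n * log(c / n) for c in rows.values())
    h_pred = -sum(c / n * log(c / n) for c in cols.values())
    mean_h = (h_true + h_pred) / 2
    nmi = mi / mean_h if mean_h > 0 else 1.0
    return ari, nmi, fmi


# Toy usage with made-up labels: one true cluster is split in the prediction.
ari, nmi, fmi = clustering_scores([0, 0, 1, 1], [0, 0, 1, 2])
print(f"ARI={ari:.3f} NMI={nmi:.3f} FMI={fmi:.3f}")
```

All three scores are invariant to how cluster labels are named, so a prediction that matches the true grouping under a different numbering still scores 1.0.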