Dual-Path Knowledge-Augmented Contrastive Alignment Network for Spatially Resolved Transcriptomics
Wei Zhang, Jiajun Chu, Xinci Liu, Chen Tong, Xinyue Li
TL;DR
This work tackles spatial gene expression prediction from histology by introducing DKAN, a dual-path knowledge-augmented contrastive alignment network. It integrates high-level gene semantics (sourced from external databases and encoded via LLMs and BioBERT) with multi-scale image embeddings through a one-stage contrastive-plus-supervised objective, eliminating reliance on exemplar retrieval. Results on three public ST datasets show DKAN achieving state-of-the-art accuracy and robust generalization, supported by ablations and visualizations of cancer biomarkers. By grounding multimodal fusion in biological semantics, the approach aims to improve interpretability and to support biomarker discovery and tissue microenvironment analysis.
Abstract
Spatial Transcriptomics (ST) is a technology that measures gene expression profiles within tissue sections while retaining spatial context. It reveals localized gene expression patterns and tissue heterogeneity, both of which are essential for understanding disease etiology. However, its high cost has driven efforts to predict spatial gene expression from whole slide images. Despite recent advances, current methods still face significant limitations, such as under-exploitation of high-level biological context, over-reliance on exemplar retrieval, and inadequate alignment of heterogeneous modalities. To address these challenges, we propose DKAN, a novel Dual-path Knowledge-Augmented contrastive alignment Network that predicts spatially resolved gene expression by integrating histopathological images and gene expression profiles through a biologically informed approach. Specifically, we introduce a gene semantic representation module that leverages an external gene database to provide additional biological insights, thereby enhancing gene expression prediction. Further, we adopt a unified, one-stage contrastive learning paradigm that combines contrastive learning and supervised learning to eliminate reliance on exemplars, complemented by an adaptive weighting mechanism. Additionally, we propose a dual-path contrastive alignment module that employs gene semantic features as dynamic cross-modal coordinators to enable effective heterogeneous feature integration. In extensive experiments across three public ST datasets, DKAN outperforms state-of-the-art models, establishing a new benchmark for spatial gene expression prediction and offering a powerful tool for advancing biological and clinical research.
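To make the one-stage objective described above more concrete, the following is a minimal sketch of how a contrastive term and a supervised regression term could be optimized together in a single loss. It is not the authors' implementation. Everything here is an assumption for illustration: the function names (`info_nce`, `joint_loss`), the temperature value, the use of mean squared error as the supervised term, and in particular the weighting rule, which borrows a generic learned-uncertainty form (`exp(-s) * L + s`) because the abstract does not specify how the adaptive weighting mechanism works. The dual-path alignment and the gene semantic module are not modeled.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce(img_emb, gene_emb, temperature=0.1):
    """Symmetric InfoNCE loss: row i of img_emb should match row i of gene_emb.

    The temperature value is an illustrative assumption.
    """
    logits = l2_normalize(img_emb) @ l2_normalize(gene_emb).T / temperature

    def cross_entropy_on_diagonal(z):
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the image-to-gene and gene-to-image directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

def joint_loss(img_emb, gene_emb, pred_expr, true_expr,
               log_var_con=0.0, log_var_sup=0.0):
    """One-stage objective: contrastive alignment plus supervised regression.

    log_var_con and log_var_sup stand in for learnable scalars; this
    uncertainty-style weighting is a hypothetical substitute for the
    paper's adaptive weighting mechanism.
    """
    l_con = info_nce(img_emb, gene_emb)
    l_sup = np.mean((pred_expr - true_expr) ** 2)
    total = (np.exp(-log_var_con) * l_con + log_var_con
             + np.exp(-log_var_sup) * l_sup + log_var_sup)
    return total, l_con, l_sup
```

Because both terms are computed from the same batch and summed into one scalar, a single backward pass would update the encoders for alignment and prediction together, which is the sense in which such an objective is one-stage and needs no separate exemplar retrieval step at inference.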
