Cell-Type Prototype-Informed Neural Network for Gene Expression Estimation from Pathology Images

Kazuya Nishimura, Ryoma Bise, Shinnosuke Matsuo, Haruka Hirose, Yasuhiro Kojima

Abstract

Estimating slide- and patch-level gene expression profiles from pathology images enables rapid and low-cost molecular analysis with broad clinical impact. Despite strong results, existing approaches treat gene expression as a mere slide- or spot-level signal and do not incorporate the fact that the measured expression arises from the aggregation of underlying cell-level expression. To explicitly introduce this missing cell-resolved guidance, we propose a Cell-type Prototype-informed Neural Network (CPNN) that leverages publicly available single-cell RNA-sequencing datasets. Since single-cell measurements are noisy and not paired with histology images, we first estimate cell-type prototypes, that is, mean expression profiles that reflect stable gene-gene co-variation patterns. CPNN then learns cell-type compositional weights directly from images and models the relationship between prototypes and observed bulk or spatial expression, providing a biologically grounded and structurally regularized prediction framework. We evaluate CPNN on three slide-level datasets and three patch-level spatial transcriptomics datasets. Across all settings, CPNN achieves the highest performance in terms of Spearman correlation. Moreover, by visualizing the inferred compositional weights, our framework provides interpretable insights into which cell types drive the predicted expression. Code is publicly available at https://github.com/naivete5656/CPNN.
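
The core idea in the abstract, expression predicted as a weighted combination of cell-type prototypes, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the linear-plus-softmax weight head, the random toy data, and all array sizes are assumptions made purely for illustration, and the abstract does not specify how CPNN actually parameterizes the weights or relates prototypes to observed expression.

```python
import numpy as np


def cell_type_prototypes(expr, labels, n_types):
    """Mean expression profile per cell type.

    expr:   (n_cells, n_genes) single-cell expression matrix
    labels: (n_cells,) integer cell-type label per cell
    Returns an (n_types, n_genes) prototype matrix.
    """
    return np.stack([expr[labels == k].mean(axis=0) for k in range(n_types)])


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predict_expression(features, head, prototypes):
    """Hypothetical prediction step (an assumption, not the paper's model).

    features: (n_patches, d) image features from any encoder
    head:     (d, n_types) linear map to cell-type logits
    Returns predicted expression (n_patches, n_genes) and the
    compositional weights (n_patches, n_types).
    """
    weights = softmax(features @ head)  # non-negative, each row sums to 1
    return weights @ prototypes, weights


# Toy data standing in for scRNA-seq counts and image features.
rng = np.random.default_rng(0)
n_cells, n_genes, n_types, n_patches, d = 300, 5, 3, 4, 8
labels = np.arange(n_cells) % n_types  # every cell type is represented
expr = rng.poisson(lam=2.0, size=(n_cells, n_genes)).astype(float)

prototypes = cell_type_prototypes(expr, labels, n_types)
features = rng.normal(size=(n_patches, d))
head = rng.normal(size=(d, n_types))
pred, weights = predict_expression(features, head, prototypes)
```

Under these assumptions each predicted profile is a convex combination of the prototypes, so the weights can be read directly as an estimated cell-type composition, which is the kind of interpretability the abstract describes.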

Paper Structure (20 sections, 11 equations, 6 figures, 7 tables)

Figures (6)

  • Figure 1: (a) Gene expression is observed together with histology image as the sum of unobserved cell-level expressions. Expression of individual cells is not directly measurable. (b) Cell-level gene expression profiles are obtained from a public single-cell RNA sequencing (scRNA-seq) database, which contains noisy measurements without corresponding image information.
  • Figure 2: Illustration of the concept of the Cell-type Prototype-informed Neural Network (CPNN). We estimate gene expression based on cell-type prototypes by modeling the relationship between observed gene expression and cell-level gene expression.
  • Figure 3: Overview of our method. Gene expression is estimated based on cell-type prototype $\bar{T}$ and weight $w(\mathbf{x}_i^{(n)})$. The prototype supports the integration of covariance of gene expression among cells.
  • Figure 4: Average weight for each subtype of BRCA slides. We visualized the estimated weights for the cancer-related prototypes.
  • Figure 5: Example of estimated proportion.
  • ...and 1 more figures