ProteinPNet: Prototypical Part Networks for Concept Learning in Spatial Proteomics

Louis McConnell, Jieran Sun, Theo Maffei, Raphael Gottardo, Marianna Rapsomaniki

TL;DR

ProteinPNet introduces a prototypical part network tailored for spatial proteomics to discover interpretable tumor microenvironment (TME) motifs. It learns spatial prototypes directly from data, aligning them with tumor subtypes and enabling mechanistic interpretation via activation maps and downstream analyses. The method is validated on synthetic data and a non-small cell lung cancer (NSCLC) imaging mass cytometry dataset, showing robust classification and biologically meaningful motifs that reflect immune infiltration and tissue modularity. The work highlights prototype-based learning as a promising avenue for interpretable spatial biomarker discovery in spatial omics, while noting current limitations and future directions toward richer encoders and higher-channel data.

Abstract

Understanding the spatial architecture of the tumor microenvironment (TME) is critical to advancing precision oncology. We present ProteinPNet, a novel framework based on prototypical part networks that discovers TME motifs from spatial proteomics data. Unlike traditional post-hoc explainability models, ProteinPNet directly learns discriminative, interpretable, faithful spatial prototypes through supervised training. We validate our approach on synthetic datasets with ground truth motifs, and further test it on a real-world lung cancer spatial proteomics dataset. ProteinPNet consistently identifies biologically meaningful prototypes aligned with different tumor subtypes. Through graphical and morphological analyses, we show that these prototypes capture interpretable features pointing to differences in immune infiltration and tissue modularity. Our results highlight the potential of prototype-based learning to reveal interpretable spatial biomarkers within the TME, with implications for mechanistic discovery in spatial omics.

Paper Structure

This paper contains 10 sections, 16 figures, 1 table.

Figures (16)

  • Figure 1: The ProteinPNet workflow. During prototype discovery, the prototype vectors are randomly initialized and projected onto the closest patch representation. Each prototype is then convolved over the representation of the spatial proteomics image with a cosine similarity kernel to produce an activation heatmap. The heatmaps yield a set of prototype scores, which are linearly combined to make the final prediction. During prototype interpretation, the prototypes that yielded the highest accuracy are analyzed in terms of their morphological and compositional characteristics.
  • Figure 2: (a) Characteristic examples of LUSC and LUAD prototypes, collected across many runs with only one prototype per class, together with the source image and activation maps. (b) Performance of three graph explainers on the same samples.
  • Figure S1: The two classes present in the synthetic dataset, with the red circle outlining the class-defining prototypes. Class-independent prototypes can be seen in both samples.
  • Figure S2: Synthetic data activations.
  • Figure S3: Examples of prototypes extracted from the synthetic dataset described above. These are class-specific prototypes: the top two belong to the first class and the bottom two belong to the second class. In the second prototype, a neutral prototype occludes the classifying prototype. To signal the absence of the classifying prototype in the other class, the model focuses on the white space present in the image.
  • ...and 11 more figures
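
The prototype discovery steps described in the Figure 1 caption (projection onto the closest patch, cosine similarity activation map, linear combination of prototype scores) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the array shapes, and the use of a global maximum to turn each heatmap into a single score are all hypothetical choices made here for clarity, and the encoder that produces the patch representations is left out entirely.

```python
import numpy as np


def cosine_activation_map(features, prototype, eps=1e-8):
    """Cosine similarity between one prototype and every patch embedding.

    features:  (H, W, D) array of patch representations (assumed layout).
    prototype: (D,) prototype vector.
    Returns an (H, W) activation heatmap with values in [-1, 1].
    """
    f_unit = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    p_unit = prototype / (np.linalg.norm(prototype) + eps)
    return f_unit @ p_unit


def project_prototype(features, prototype):
    """Replace a prototype with the patch embedding it is most similar to."""
    act = cosine_activation_map(features, prototype)
    i, j = np.unravel_index(np.argmax(act), act.shape)
    return features[i, j].copy()


def predict(features, prototypes, weights, bias):
    """Heatmaps -> one score per prototype -> linear layer.

    prototypes: (P, D); weights: (C, P); bias: (C,).
    Assumption: each heatmap is reduced to a score by taking its maximum.
    """
    maps = np.stack([cosine_activation_map(features, p) for p in prototypes])
    scores = maps.reshape(len(prototypes), -1).max(axis=1)  # (P,)
    logits = weights @ scores + bias                        # (C,)
    return maps, scores, logits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(16, 16, 32))         # stand-in for encoder output
    protos = rng.normal(size=(4, 32))             # randomly initialized prototypes
    protos = np.stack([project_prototype(feats, p) for p in protos])
    w, b = rng.normal(size=(2, 4)), np.zeros(2)   # two classes, e.g. LUAD vs LUSC
    maps, scores, logits = predict(feats, protos, w, b)
    print(maps.shape, scores.shape, logits)
```

Because every projected prototype is an actual patch embedding, each heatmap peaks at the patch the prototype was taken from, which is what makes the prototype traceable back to a concrete image region.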