Injecting Hierarchical Biological Priors into Graph Neural Networks for Flow Cytometry Prediction

Fatemeh Nassajian Mojarrad, Lorenzo Bini, Thomas Matthes, Stéphane Marchand-Maillet

TL;DR

This work tackles leaf-cell classification in flow cytometry by integrating the biological hierarchy as an inductive bias within graph neural networks. The proposed FCHC-GNN couples an Early Module with a Max Constraint Module (MCM) and a hierarchical max-constrained loss (MCLoss) to enforce ancestor–descendant consistency, so that predictions respect the hierarchy across cell types. Empirical results on 19 patients show substantial gains over flat GNNs and DNNs, with larger improvements as hierarchy depth increases, and the approach offers interpretability via feature importance and visualization. The method demonstrates the value of domain-aligned inductive biases for complex biological prediction tasks and can be extended to hierarchical classification problems beyond flow cytometry.
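To make the ancestor–descendant consistency idea concrete, the following is a minimal sketch of a max-constraint step. It assumes the common formulation in which each class score is replaced by the maximum early-module score over that class and its descendants; the summary above only names the MCM, so this formula, the function name `max_constraint`, the matrix `R`, and the toy three-class hierarchy are all illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def max_constraint(h, R):
    """Hypothetical max-constraint step (an assumption, not the paper's code).

    h: (n_cells, n_classes) early-module scores, each in [0, 1].
    R: (n_classes, n_classes) binary matrix with R[a, b] = 1 when class b
       is class a itself or one of its descendants, else 0.
    Returns scores where out[:, a] = max over b with R[a, b] = 1 of h[:, b].
    """
    # Broadcast (n, 1, C) * (1, C, C) -> (n, C, C), then take the max over b.
    # Masking by multiplication is safe because scores are non-negative
    # and the diagonal of R is 1.
    return (h[:, None, :] * R[None, :, :]).max(axis=2)

# Toy hierarchy (hypothetical): class 0 is the parent of classes 1 and 2.
R = np.array([[1, 1, 1],
              [0, 1, 0],
              [0, 0, 1]])

# One cell whose early-module scores violate the hierarchy:
# the child (class 1) scores higher than its parent (class 0).
h = np.array([[0.2, 0.9, 0.1]])

out = max_constraint(h, R)
print(out)  # parent score is lifted to at least its highest-scoring child
```

With this construction the parent's score can never fall below that of any of its descendants, which is one way to guarantee hierarchy-consistent predictions after thresholding.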

Abstract

Cell-level prediction on flow cytometry (FC) data from hematologic samples such as peripheral blood or bone marrow presents profound challenges. This work explores injecting hierarchical prior knowledge into graph neural networks (GNNs) for single-cell multi-class classification of tabular cellular data. By representing the data as graphs and encoding the hierarchical relationships between classes, we propose FCHC-GNN, a hierarchical plug-in method that can be applied to several GNN models and is designed to capture the neighborhood information crucial in the single-cell FC domain. Extensive experiments on our cohort of 19 distinct patients demonstrate that incorporating hierarchical biological constraints significantly boosts performance across multiple metrics compared to baseline GNNs without such priors. The proposed approach highlights the importance of structured inductive biases for improving generalization in complex biological prediction tasks.

Paper Structure

This paper contains 20 sections, 2 theorems, 7 equations, 5 figures, and 9 tables.

Key Result

Theorem 4.1

Let $\mathbf{x}\in \mathbb{R}^m$ be a data point. Let $\mathcal{C}=\{A_1,\cdots, A_C\}$ be a set of hierarchically structured classes and let $\mathcal{H}$ be an early module with outputs $\mathcal{H}_{A_1},\cdots, \mathcal{H}_{A_C}$ ($\mathcal{H}_{A_c}\in[0,1]~\forall c$) given the input $\mathbf{x}$ ...

Figures (5)

  • Figure 1: Depiction of our HC setup.
  • Figure 2: Feature importance for FCHC-GAT attributed by $\mathcal{H}$. Feature labels correspond to those of Table \ref{tab:1}.
  • Figure 3: (a) Computing the normalized attention coefficients $\gamma_{ij}$. (b) Multi-head attention of node 1 on its neighborhood. Arrows show concatenation or averaging of attention (adapted from Velickovic2017).
  • Figure 4: Depiction of the shallow FC setup.
  • Figure 5: t-SNE projection of the embeddings for patient 7, from the FCHC-GAT module.

Theorems & Definitions (2)

  • Theorem 4.1
  • Proposition 4.2