The Importance of Model Inspection for Better Understanding Performance Characteristics of Graph Neural Networks
Nairouz Shehata, Carolina Piçarra, Anees Kazi, Ben Glocker
TL;DR
The paper addresses the risk that final test accuracy alone hides biases in feature learning when graph neural networks are applied to brain-shape data. It proposes and applies a model-inspection framework that extracts embeddings from the GCN submodel and classifier layers across two architecture variants (shared vs. structure-specific submodels), with and without mesh registration. The authors show that while ROC-AUC differences are modest, the learned feature spaces reveal data-source encoding and task-relevant separability that depend on architectural choices and preprocessing steps, underscoring the need for inspection beyond accuracy. This approach improves understanding of what drives predictions, informs model selection, and has practical implications for transfer learning and domain adaptation in biomedical imaging.
Abstract
This study highlights the importance of conducting comprehensive model inspection as part of comparative performance analyses. Here, we investigate the effect of modelling choices on the feature learning characteristics of graph neural networks applied to a brain shape classification task. Specifically, we analyse the effect of using parameter-efficient, shared graph convolutional submodels compared to structure-specific, non-shared submodels. Further, we assess the effect of mesh registration as part of the data harmonisation pipeline. We find substantial differences in the feature embeddings at different layers of the models. Our results highlight that test accuracy alone is insufficient to identify important model characteristics such as encoded biases related to data source or potentially non-discriminative features learned in submodels. Our model inspection framework offers a valuable tool for practitioners to better understand performance characteristics of deep learning models in medical imaging.
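To make the idea of inspecting feature embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes embeddings have already been extracted from some layer as a NumPy array, projects them to two dimensions with PCA, and compares how well a nearest-centroid rule separates them by data source versus by task label. The helper names `pca_project` and `centroid_separability`, the synthetic data, and the choice of a nearest-centroid score are all illustrative assumptions.

```python
import numpy as np


def pca_project(embeddings, n_components=2):
    """Project embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def centroid_separability(points, labels):
    """Fraction of points whose nearest class centroid matches their label."""
    classes = np.unique(labels)
    centroids = np.stack([points[labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    predicted = classes[dists.argmin(axis=1)]
    return float((predicted == labels).mean())


# Synthetic stand-in for layer embeddings (hypothetical data, not from the paper):
# the data-source signal is deliberately strong, the task signal weak.
rng = np.random.default_rng(0)
n = 200
source = rng.integers(0, 2, size=n)  # e.g. acquisition site
task = rng.integers(0, 2, size=n)    # e.g. classification target
embeddings = rng.normal(size=(n, 16))
embeddings[:, 0] += 6.0 * source
embeddings[:, 1] += 0.5 * task

projected = pca_project(embeddings)
source_score = centroid_separability(projected, source)
task_score = centroid_separability(projected, task)
print(f"separability by data source: {source_score:.2f}")
print(f"separability by task label:  {task_score:.2f}")
```

In this toy setting the projected embeddings separate far more cleanly by data source than by task label, which is the kind of pattern that a test-accuracy comparison alone would not surface. In practice the same check could be repeated per layer and per model variant.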
