Hierarchical Generalized Category Discovery for Brain Tumor Classification in Digital Pathology
Matthias Perkonigg, Patrick Rockenschaub, Georg Göbel, Adelheid Wöhrer
TL;DR
This work addresses brain tumor classification when not all subtypes are labeled, using Generalized Category Discovery (GCD) in a way that respects the hierarchical tumor taxonomy. HGCD-BT combines a contrastive learning branch with a semi-supervised hierarchical clustering loss implemented via a soft binary decision tree, so that known and novel categories are learned simultaneously. Patch-level evaluation on stimulated Raman histology images (OpenSRH) shows a +28% accuracy improvement over state-of-the-art GCD methods, particularly for previously unseen tumor categories, and slide-level evaluation on H&E-stained whole-slide images from the Digital Brain Tumor Atlas (DBTA) indicates that the method generalizes across imaging modalities. The approach may support faster, more interpretable intraoperative guidance and offers a pathway toward continual, hierarchy-aware discovery in biomedical imaging.
Abstract
Accurate brain tumor classification is critical for intraoperative decision-making in neuro-oncological surgery. However, existing approaches are restricted to a fixed set of predefined classes and are therefore unable to capture patterns of tumor types not available during training. Unsupervised learning can extract general-purpose features, but it lacks the ability to incorporate prior knowledge from labeled data, and semi-supervised methods often assume that all potential classes are represented in the labeled data. Generalized Category Discovery (GCD) aims to bridge this gap by categorizing both known and unknown classes within unlabeled data. To reflect the hierarchical structure of brain tumor taxonomies, we introduce Hierarchical Generalized Category Discovery for Brain Tumor Classification (HGCD-BT), a novel approach that integrates hierarchical clustering with contrastive learning. Our method extends contrastive-learning-based GCD by incorporating a novel semi-supervised hierarchical clustering loss. We evaluate HGCD-BT on OpenSRH, a dataset of stimulated Raman histology brain tumor images, achieving a +28% improvement in accuracy over state-of-the-art GCD methods for patch-level classification, particularly in identifying previously unseen tumor categories. Furthermore, we demonstrate the generalizability of HGCD-BT on slide-level classification of hematoxylin and eosin (H&E) stained whole-slide images from the Digital Brain Tumor Atlas, confirming its utility across imaging modalities.
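
The hierarchical clustering component is described as a soft binary decision tree. As a rough illustration of that general idea only, and not the HGCD-BT implementation, the sketch below routes one sample through a complete binary tree in which every internal node has a sigmoid gate, so that each leaf (a candidate cluster) receives a probability. The function names, the heap-ordered node layout, and the use of one scalar logit per node are assumptions made for this example.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a gate logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def leaf_probabilities(node_logits: List[float], depth: int) -> List[float]:
    """Soft routing through a complete binary tree stored in heap order.

    node_logits[i] is a hypothetical gate logit for internal node i;
    sigmoid(logit) is taken as the probability of going to the right child.
    Returns one probability per leaf, ordered left to right.
    """
    n_internal = 2 ** depth - 1
    if len(node_logits) != n_internal:
        raise ValueError(f"expected {n_internal} logits, got {len(node_logits)}")

    # reach[i] is the probability that the sample reaches node i (root = 1).
    reach = [0.0] * (2 ** (depth + 1) - 1)
    reach[0] = 1.0
    for i in range(n_internal):
        p_right = sigmoid(node_logits[i])
        reach[2 * i + 1] = reach[i] * (1.0 - p_right)  # left child
        reach[2 * i + 2] = reach[i] * p_right          # right child

    # The last 2**depth entries are the leaves.
    return reach[n_internal:]


if __name__ == "__main__":
    # Depth-2 tree with undecided gates: all four leaves are equally likely.
    print(leaf_probabilities([0.0, 0.0, 0.0], depth=2))
```

Because each gate splits its incoming probability between two children, the leaf probabilities always sum to one, and summing the leaves under any internal node gives a coarser assignment at that level of the hierarchy.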
