HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis
Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo
TL;DR
The paper tackles panoramic segmentation of kidney pathology by introducing Hierarchical Adaptive Taxonomy Segmentation (HATs), which encodes anatomical relationships across regions, functional units, and cells using a hierarchical taxonomy matrix $M_t \in \mathbb{R}^{n\times n}$ and a hierarchical scale matrix informed by area-rate knowledge. A token-based EfficientSAM backbone uses a dynamic token bank that stores class-aware tokens $T_c \in \mathbb{R}^{n\times d}$ and scale tokens $T_s \in \mathbb{R}^{4\times d}$, enabling weak-prompt semantic segmentation and cross-scale awareness. Training is guided by a taxonomy loss $L_{hats}$ that encodes the inter-class relationships and by a scale-weighted loss $S$. The method is validated on a 15-class kidney dataset covering regions, functional units, and cells, compiled from NEPTUNE, HuBMAP, and nephrectomy sources, using a two-phase training regime and 512×512 patches. Results show higher Dice scores than multiple baselines and demonstrate the benefits of the hierarchical matrices and the token-based dynamic design. The work provides an open-source implementation and points to a future enhancement: hybridizing CNN and transformer backbones to further improve segmentation across scales.
Abstract
Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel Hierarchical Adaptive Taxonomy Segmentation (HATs) method, which is designed to thoroughly segment panoramic views of kidney structures by leveraging detailed anatomical insights. Our approach entails (1) the innovative HATs technique, which translates spatial relationships among 15 distinct object classes into a versatile "plug-and-play" loss function that spans regions, functional units, and cells; (2) the incorporation of anatomical hierarchies and scale considerations into a unified, simple matrix representation for all panoramic entities; and (3) the adoption of the latest AI foundation model (EfficientSAM) as a feature extraction tool to boost the model's adaptability, while eliminating the need for the manual prompt generation required by the conventional Segment Anything Model (SAM). Experimental findings demonstrate that the HATs method offers an efficient and effective strategy for integrating clinical insights and imaging precedents into a unified segmentation model across more than 15 categories. The official implementation is publicly available at https://github.com/hrlblab/HATs.
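To make the idea of turning spatial relationships into a matrix-driven loss more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy 4-class taxonomy rather than the 15 classes used in the paper, and an encoding chosen purely for illustration: an entry of +1 means the row class lies inside the column class, -1 means the two classes must not overlap, and 0 means no constraint. The names `build_taxonomy_matrix` and `taxonomy_penalty`, as well as the penalty form itself, are hypothetical and do not reproduce $L_{hats}$.

```python
import numpy as np

# Hypothetical toy taxonomy (the paper uses 15 classes; these 4 are illustrative).
CLASSES = ["cortex", "medulla", "glomerulus", "podocyte"]

def build_taxonomy_matrix(n, subset_pairs, exclusive_pairs):
    """Assumed encoding: m[i, j] = +1 if class i lies inside class j,
    -1 if classes i and j must not overlap, 0 if unconstrained."""
    m = np.zeros((n, n), dtype=np.float32)
    for child, parent in subset_pairs:
        m[child, parent] = 1.0
    for a, b in exclusive_pairs:
        m[a, b] = m[b, a] = -1.0
    return m

def taxonomy_penalty(probs, m):
    """Illustrative penalty for predictions that violate the taxonomy.

    probs: array of shape (n, H, W) with per-class probabilities in [0, 1].
    """
    n = probs.shape[0]
    flat = probs.reshape(n, -1)
    # overlap[i, j] = mean over pixels of p_i * p_j
    overlap = flat @ flat.T / flat.shape[1]
    # outside[i, j] = mean over pixels of p_i * (1 - p_j)
    outside = flat.mean(axis=1, keepdims=True) - overlap
    subset_term = np.sum((m == 1) * outside)       # child mass outside its parent
    exclusive_term = np.sum((m == -1) * overlap) / 2.0  # each symmetric pair once
    return float(subset_term + exclusive_term)

# Glomerulus inside cortex, podocyte inside glomerulus, cortex vs. medulla exclusive.
M_t = build_taxonomy_matrix(4, subset_pairs=[(2, 0), (3, 2)], exclusive_pairs=[(0, 1)])

def toy_prediction(glomerulus_cols):
    """Build a hard 8x8 prediction; glomerulus_cols selects where the glomerulus sits."""
    p = np.zeros((4, 8, 8), dtype=np.float32)
    p[0, :, :4] = 1.0              # cortex: left half
    p[1, :, 4:] = 1.0              # medulla: right half
    p[2, 2:4, glomerulus_cols] = 1.0
    p[3, 2, 1] = 1.0               # podocyte: fixed pixel in the cortex
    return p

consistent = taxonomy_penalty(toy_prediction(slice(1, 3)), M_t)  # glomerulus in cortex
violating = taxonomy_penalty(toy_prediction(slice(5, 7)), M_t)   # glomerulus in medulla
print(consistent, violating)
```

In this sketch the anatomically consistent layout incurs no penalty, while placing the glomerulus in the medulla is penalized. Because the constraints live entirely in the matrix, the same function could in principle be attached to any segmentation backbone, which is the sense in which such a loss is "plug-and-play".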
