SALT: Introducing a Framework for Hierarchical Segmentations in Medical Imaging using Softmax for Arbitrary Label Trees

Sven Koitka, Giulia Baldini, Cynthia S. Schmidt, Olivia B. Pollok, Obioma Pelka, Judith Kohnke, Katarzyna Borys, Christoph M. Friedrich, Benedikt M. Schaarschmidt, Michael Forsting, Lale Umutlu, Johannes Haubold, Felix Nensa, René Hosch

TL;DR

Conventional CT segmentation often treats anatomical structures independently, neglecting their hierarchical relationships. SALT (Softmax for Arbitrary Label Trees) models conditional probabilities along a body-wide label tree, enabling a single DynUNet-based network to predict 113 leaf regions via a chain-rule-like probability formulation $P(y=c \mid x)=\prod_{i\in R_c} \frac{e^{-x_i}}{\sum_{k\in S_i} e^{-x_k}}$, where $R_c$ is the set of nodes on the path from the root to node $c$ and $S_i$ is the set of siblings of node $i$. Key contributions include dataset fusion to create a hierarchical label set, explicit tree matrices (A, R, S), a tree-aware encoding of labels for loss computation, and multi-dataset evaluation showing competitive Dice and NSD with fast whole-body inference. The SALT framework demonstrates potential for real-time clinical deployment, standardized hierarchical annotations, and applicability to other domains with hierarchical label structures, marking a step toward scalable, interpretable, and efficient medical image analysis.
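The chain-rule formulation above can be illustrated with a minimal NumPy sketch. The five-node tree, the node names, the logit values, and the helper functions `path_to_root`, `siblings`, and `salt_probability` are all hypothetical and chosen only for illustration; this is not the authors' implementation. The sketch assumes that the sibling set $S_i$ includes node $i$ itself, and it keeps the sign of the exponent exactly as printed in the formula.

```python
import numpy as np

# Hypothetical toy tree: parent[i] is the parent of node i, -1 marks the root.
# 0: root, 1: background, 2: body, 3: bone, 4: muscle
parent = [-1, 0, 0, 2, 2]
n = len(parent)

def path_to_root(c):
    """R_c: all nodes on the path between the root and c (inclusive)."""
    path = []
    while c != -1:
        path.append(c)
        c = parent[c]
    return path

def siblings(i):
    """S_i: nodes sharing i's parent (assumed to include i itself)."""
    return [k for k in range(n) if parent[k] == parent[i]]

def salt_probability(x, c):
    """P(y=c|x): product of sibling-wise softmax terms along R_c."""
    p = 1.0
    for i in path_to_root(c):
        s = siblings(i)
        # Exponent sign as printed in the formula; subtracting the minimum
        # leaves the ratio unchanged and keeps exp() from overflowing.
        w = np.exp(-(x[s] - x[s].min()))
        p *= w[s.index(i)] / w.sum()
    return p

# One voxel's N node outputs (made-up values).
x = np.array([0.3, -1.2, 0.8, 2.0, -0.5])
leaves = [1, 3, 4]
probs = np.array([salt_probability(x, c) for c in leaves])
```

Because every factor is a softmax over one sibling group, the leaf probabilities sum to one, and the probability of an inner node such as "body" equals the sum over its leaves. The root has no siblings, so its factor is always 1.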

Abstract

Traditional segmentation networks approach anatomical structures as standalone elements, overlooking the intrinsic hierarchical connections among them. This study introduces Softmax for Arbitrary Label Trees (SALT), a novel approach designed to leverage the hierarchical relationships between labels, improving the efficiency and interpretability of the segmentations. The technique targets CT imaging and uses conditional probabilities to model the hierarchical structure of anatomical landmarks, such as the spine's division into lumbar, thoracic, and cervical regions and further into individual vertebrae. The model was developed using the SAROS dataset from The Cancer Imaging Archive (TCIA), comprising 900 body region segmentations from 883 patients. The dataset was further enhanced by generating additional segmentations with the TotalSegmentator, for a total of 113 labels. The model was trained on 600 scans, while validation and testing were conducted on 150 CT scans each. Performance was assessed using the Dice score across various datasets, including SAROS, CT-ORG, FLARE22, LCTSC, LUNA16, and WORD. Among the evaluated datasets, SALT achieved its best results on LUNA16 and SAROS, with Dice scores of 0.93 and 0.929, respectively. It scored 0.908 on LCTSC, 0.891 on CT-ORG, 0.849 on FLARE22, and 0.844 on WORD. SALT used the hierarchical structures inherent in the human body to achieve whole-body segmentations in an average of 35 seconds per 100 slices. This rapid processing underscores its potential for integration into clinical workflows, facilitating the automatic and efficient computation of full-body segmentations with each CT scan, thus enhancing diagnostic processes and patient care.

Paper Structure (18 sections, 8 equations, 10 figures, 10 tables)


Figures (10)

  • Figure 1: Example of a hierarchical segmentation. A) In the thorax, the biggest region is the thoracic cavity. B) The thoracic cavity can then be subdivided into lungs and mediastinum. C) The mediastinum has a further subregion, the pericardium, and the lungs can be subdivided into lower/middle/upper left/right lobes. D) The pericardium contains the heart, which can be subdivided into myocardium and left/right ventricle/atrium. The full segmentations originate from the TotalSegmentator, while the sparse segmentations are from SAROS.
  • Figure 2: Hierarchical labeling for the segmentation of anatomical landmarks. The blue labels were generated with the TotalSegmentator, while the pink labels come from the SAROS dataset. The gray labels were also generated: "body" is the sum of all annotated voxels, while "background" is all non-annotated voxels. The "other" classes were created as the parts of the parents which were not annotated. The light pink labels were generated by splitting an existing TotalSegmentator label. Some vertebrae and ribs have been removed from this visualization for a better overview. For some bones and muscles with only left and right children, the two nodes were fused for better visualization.
  • Figure 3: Different representations of the tree class hierarchy. From left to right: (1) The adjacency matrix encodes directed edges from parent to child nodes. (2) The reachability matrix encodes all nodes in the path from the root node to a specific node. (3) The sibling matrix encodes all sibling nodes for a specific node.
  • Figure 4: Overview of the SALT architecture. A DynUNet model is used to output feature maps containing $N$ channels, where $N$ is the number of nodes of the tree. The SALT layer builds conditional probabilities from the nodes of the tree and can be used analogously to a softmax function to create a segmentation.
  • Figure 5: Comparison between the speed of SALT and the TotalSegmentator. Versions 1 and 2 of the TotalSegmentator were run on the same set of 472 CT scans listed in the datasets table. The TotalSegmentator models and SALT were trained on 1.5 mm isotropic spacing.
  • ...and 5 more figures
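
The three tree representations named in the Figure 3 caption can be sketched for a small example. The five-node tree below follows the thorax hierarchy from the Figure 1 caption; the parent-array encoding, the variable names, and the choice to include each node in its own sibling row are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np

# Hypothetical encoding: parent[i] is the parent index of node i, -1 = root.
names = ["thoracic cavity", "lungs", "mediastinum", "pericardium", "heart"]
parent = np.array([-1, 0, 0, 2, 3])
n = len(parent)

# Adjacency matrix A: A[p, c] = 1 for a directed edge from parent p to child c.
A = np.zeros((n, n), dtype=int)
for c, p in enumerate(parent):
    if p >= 0:
        A[p, c] = 1

# Reachability matrix R: R[c, i] = 1 if node i lies on the path
# from the root to node c (c itself included).
R = np.zeros((n, n), dtype=int)
for c in range(n):
    i = c
    while i != -1:
        R[c, i] = 1
        i = parent[i]

# Sibling matrix S: S[i, k] = 1 if nodes i and k share the same parent.
# Assumption: a node counts as its own sibling, so each row can serve
# directly as a softmax normalization group.
S = (parent[:, None] == parent[None, :]).astype(int)
```

In this sketch, row `R[4]` marks the path thoracic cavity, mediastinum, pericardium, heart, and row `S[1]` groups lungs with mediastinum. A tree with `n` nodes always has `n - 1` edges, which gives a quick sanity check on `A`.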