Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration

Xuewei Li, Yaqiao Zhu, Jie Gao, Xi Wei, Ruixuan Zhang, Yuan Tian, ZhiQiang Liu

TL;DR

This paper addresses the poor cross-device generalization of thyroid nodule segmentation in ultrasound by introducing ASTN, a co-registration–based framework that leverages latent semantic features via an atlas dictionary and a Half-STN. It pairs an atlas selection mechanism (Regional Correlation Score) with a dictionary system that couples semantic extraction and deformation fusion, culminating in a robust warped-label fusion strategy. On multi-device thyroid ultrasound data, ASTN achieves strong cross-domain performance, with a pre-fusion co-registration DSC around 88.6% and IoU improvements of about $1.34\%$ (known domain) and $6.52\%$ (unseen domain), and overall segmentation gains of several percentage points across metrics. These results indicate that focusing registration on lesion semantics and carefully constructed atlases can substantially improve both registration fidelity and segmentation accuracy in clinically realistic, device-heterogeneous settings.
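The two overlap metrics quoted above, DSC (Dice similarity coefficient) and IoU (intersection over union), follow their standard definitions for binary masks. The snippet below is a minimal sketch of those definitions, not the authors' evaluation code; the function name `dice_and_iou` and the toy masks are hypothetical, and treating two empty masks as a perfect match is an assumed convention.

```python
import numpy as np


def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    dsc = 2.0 * intersection / total if total > 0 else 1.0
    iou = intersection / union if union > 0 else 1.0
    return float(dsc), float(iou)


# Toy example (hypothetical data): 3 predicted pixels, 4 ground-truth pixels,
# 3 of which overlap.
pred = np.array([[0, 1, 1],
                 [0, 1, 0],
                 [0, 0, 0]])
target = np.array([[0, 1, 1],
                   [0, 1, 1],
                   [0, 0, 0]])
dsc, iou = dice_and_iou(pred, target)
print(f"DSC = {dsc:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two metrics are related by DSC = 2·IoU / (1 + IoU), so DSC is never lower than IoU on the same prediction.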

Abstract

Segmentation of nodules in thyroid ultrasound imaging plays a crucial role in the detection and treatment of thyroid cancer. However, because scanner vendors and imaging protocols differ across hospitals, automatic segmentation models that have demonstrated expert-level accuracy in medical image segmentation lose accuracy in clinically realistic environments owing to their weak generalization. To address this issue, this paper proposes ASTN, a framework for thyroid nodule segmentation built on a new type of co-registration network. By extracting latent semantic information from the atlas and target images and using deep features to co-register the nodules in thyroid ultrasound images, the framework preserves the integrity of anatomical structure and reduces the impact on segmentation of global image differences caused by different devices. In addition, the paper provides an atlas selection algorithm to mitigate the difficulty of co-registration. Evaluation on datasets collected from different devices shows that the proposed method greatly improves model generalization while maintaining a high level of segmentation accuracy.

Paper Structure (18 sections, 12 equations, 9 figures, 4 tables, 1 algorithm)

This paper contains 18 sections, 12 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison of Thyroid Nodule with Other Body Parts
  • Figure 2: Regional Correlation Score. The figure illustrates the RCS computation process when $M = 9$. The red dots represent the centroids $C_m$ of the individual regions $m$, and the black dot represents the centroid $C^{\prime}$ of the entire nodule. $N_w$ is the number of nodule pixels in the current region, and $N_b$ is the number of background pixels.
  • Figure 3: Dictionary System. Half-STN is denoted as HS; WA stands for warped atlas.
  • Figure 4: Overview of our ASTN. ASTN comprises three components: Atlas Selection, Semantic Extraction, and Deformation Fusion. During training, a fixed selected atlas is fed into the network along with the target ultrasound image (target US). The Deformation Fusion module uses the semantic features and the initial segmentation result provided by Semantic Extraction to generate the final segmentation. The Feature Combination is depicted in Fig. 5, and the Weighted Label Fusion is depicted in a separate figure.
  • Figure 5: Feature Combination. The upper blue encoder receives $I_A$ and generates features of dimensions $M\times N$. The lower green encoder receives $I_T$ and produces features of dimensions $1\times N$. The two are combined into an $M\times N$ feature, which is used as the input to the red Half-STN.
  • ...and 4 more figures
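
The shape bookkeeping in the Figure 5 caption (atlas features of size $M\times N$ and target features of size $1\times N$ combined into an $M\times N$ feature) can be sketched with array broadcasting. The caption does not say which operator combines the two, so the element-wise addition below is purely an assumption for illustration; the function name `combine_features` and the toy values are hypothetical as well.

```python
import numpy as np


def combine_features(atlas_feats, target_feat):
    """Combine (M, N) atlas features with a (1, N) target feature into (M, N).

    ASSUMPTION: element-wise addition with broadcasting. The source only
    states the input and output shapes, not the actual operator.
    """
    atlas_feats = np.asarray(atlas_feats, dtype=float)
    target_feat = np.asarray(target_feat, dtype=float)
    if atlas_feats.ndim != 2:
        raise ValueError("atlas_feats must have shape (M, N)")
    if target_feat.shape != (1, atlas_feats.shape[1]):
        raise ValueError("target_feat must have shape (1, N)")
    # The single target row is broadcast across all M atlas rows.
    return atlas_feats + target_feat


# Toy example (hypothetical values): M = 3 atlas entries, N = 4 feature dims.
atlas_feats = np.arange(12, dtype=float).reshape(3, 4)
target_feat = np.ones((1, 4))
combined = combine_features(atlas_feats, target_feat)
print(combined.shape)
```

Whatever operator is actually used, the point of the sketch is that the one target feature is paired with each of the $M$ atlas features, so the Half-STN receives one combined feature per atlas entry.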