CT Liver Segmentation via PVT-based Encoding and Refined Decoding

Debesh Jha; Nikhil Kumar Tomar; Koushik Biswas; Gorkem Durak; Alpay Medetalibeyoglu; Matthew Antalek; Yury Velichko; Daniela Ladner; Amir Borhani; Ulas Bagci

TL;DR

This work addresses accurate CT liver segmentation by introducing PVTFormer, an encoder-decoder architecture that pairs a pretrained Pyramid Vision Transformer v2 (PVT v2) backbone with residual upsampling and a hierarchical decoding strategy to improve multi-scale feature fusion. On the LiTS dataset, the model reports state-of-the-art results: a Dice coefficient of 86.78%, a mean IoU of 78.46%, and a Hausdorff distance of 3.50, while handling complex liver boundaries. These results indicate that transformer-based encoders combined with specialized upsampling and decoding blocks are effective for precise organ delineation, and they suggest potential for multi-organ abdominal CT segmentation in clinical workflows. Overall, PVTFormer advances liver segmentation accuracy and provides a scalable framework that can be extended to additional abdominal imaging tasks.

Abstract

Accurate liver segmentation from CT scans is essential for effective diagnosis and treatment planning. Computer-aided diagnosis systems promise to improve the precision of liver disease diagnosis, the assessment of disease progression, and treatment planning. In response to this need, we propose a novel deep learning approach, PVTFormer, built upon a pretrained pyramid vision transformer (PVT v2) combined with advanced residual upsampling and decoder blocks. By integrating a refined feature channel approach with a hierarchical decoding strategy, PVTFormer generates high-quality segmentation masks by enhancing semantic features. Rigorous evaluation of the proposed method on the Liver Tumor Segmentation Benchmark (LiTS) 2017 demonstrates that our architecture achieves a high Dice coefficient of 86.78% and an mIoU of 78.46%, and also obtains a low Hausdorff distance (HD) of 3.50. The results underscore PVTFormer's efficacy in setting a new benchmark for state-of-the-art liver segmentation methods. The source code of the proposed PVTFormer is available at https://github.com/DebeshJha/PVTFormer.
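
For readers unfamiliar with the overlap metrics quoted in the abstract, the following is a minimal sketch of how a Dice coefficient and an IoU can be computed for a single pair of binary masks. It is not taken from the PVTFormer repository: the function name `dice_and_iou`, the list-of-lists mask format, and the convention that two empty masks score 1.0 are all assumptions made for illustration, and the paper's exact evaluation protocol (thresholding, per-slice averaging) may differ.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as equal-sized 2D lists of 0/1.

    Hypothetical helper for illustration only; not the authors' evaluation code.
    """
    # Flatten both masks so they can be compared pixel by pixel.
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(p)
    target_area = sum(t)
    union = pred_area + target_area - intersection

    # Assumption: two empty masks are treated as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two scores are linked by Dice = 2·IoU / (1 + IoU). Dataset-level figures such as the reported 86.78% and 78.46% are typically averages over many images, so they need not satisfy that identity exactly.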
