LV-UNet: A Lightweight and Vanilla Model for Medical Image Segmentation

Juntao Jiang; Mengmeng Wang; Huizhong Tian; Lingbo Cheng; Yong Liu

TL;DR

LV-UNet addresses the need for lightweight, robust medical image segmentation suitable for point-of-care and mobile devices. It combines a pre-trained MobileNetv3-Large encoder with fusible expansion modules and a deep training strategy, then re-parametrizes the network into a deployment mode to reduce parameters and FLOPs. On five diverse datasets (ISIC 2016, BUSI, CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG), it achieves competitive segmentation accuracy at significantly lower computational cost than state-of-the-art and vanilla baselines. The study demonstrates a practical design pattern (merging pre-trained backbones with fusible modules and re-parametrization) that could guide future lightweight medical image segmentation research.

Abstract

While large models have achieved significant progress in computer vision, challenges such as optimization complexity, the intricacy of transformer architectures, computational constraints, and practical application demands highlight the importance of simpler model designs in medical image segmentation. This need is particularly pronounced in mobile medical devices, which require lightweight, deployable models with real-time performance. However, existing lightweight models often suffer from poor robustness across datasets, limiting their widespread adoption. To address these challenges, this paper introduces LV-UNet, a lightweight and vanilla model that leverages pre-trained MobileNetv3-Large backbones and incorporates fusible modules. LV-UNet employs an enhanced deep training strategy and switches to a deployment mode during inference through re-parametrization, significantly reducing parameter count and computational overhead. Experimental results on the ISIC 2016, BUSI, CVC-ClinicDB, CVC-ColonDB, and Kvasir-SEG datasets demonstrate a better trade-off between performance and computational load. The code will be released at https://github.com/juntaoJianggavin/LV-UNet.
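The abstract does not spell out how re-parametrization shrinks the network, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard folding of batch normalization into the preceding convolution, followed by a VanillaNet-style merge of two consecutive convolutions once no nonlinearity remains between them. To stay dependency-free it is restricted to 1x1 convolutions acting on a single pixel, represented as plain Python lists; all function names are hypothetical.

```python
import math


def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution to one pixel (a list of in_channels floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization with fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


def fold_bn_into_conv1x1(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into the conv weights and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b


def merge_conv1x1(w1, b1, w2, b2):
    """Merge conv2(conv1(x)) into one 1x1 conv.

    Valid only when nothing nonlinear sits between the two layers.
    """
    mid, n_in = len(w1), len(w1[0])
    merged_w = [[sum(row2[k] * w1[k][i] for k in range(mid))
                 for i in range(n_in)] for row2 in w2]
    merged_b = [sum(row2[k] * b1[k] for k in range(mid)) + b
                for row2, b in zip(w2, b2)]
    return merged_w, merged_b
```

In this sketch the training-time stack conv, BN, conv, BN collapses to a single 1x1 convolution: fold each BN into its convolution first, then merge the two fused layers. The merged layer stores one weight matrix instead of two plus BN statistics, which is the kind of parameter and FLOP saving the deployment mode aims for.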

Paper Structure (26 sections, 6 equations, 3 figures, 7 tables)

Introduction
Related Works
Lightweight medical image segmentation models
VanillaNet
MobileNetv3
Motivation
The Role of Pre-training
Efficient Model Deployment through Re-parametrization
Methods
Overall Architecture
Pre-trained Modules in Encoder
Fusible Modules
Architecture
Nonlinear Activation Layer
Re-parametrization and Deployment Mode
...and 11 more sections

Figures (3)

Figure 1: The architecture of LV-UNet. The basic modules include pre-trained MobileNetv3-Large blocks (the initial convolution stage and groups i to iii, covering the first through ninth inverted residual blocks), fusible encoder blocks, fusible decoder blocks, skip connections, and the output block.
Figure 2: The architecture of the fusible blocks in the training and deployment modes.
Figure 3: Example visualizations of segmentation results of different models.
