Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation
Kaili Qi, Zhongyi Huang, Wenli Yang
TL;DR
This work tackles the segmentation of noisy images with blurred boundaries by combining variational PDE priors with deep learning in a robust version of the Variational Model Based Tailored UNet (VM_TUNet). The method integrates an edge detector and a mean curvature term into a modified Cahn-Hilliard framework, and couples Fourier-domain preprocessing (the F module) with a tailored finite point-based local computation (the T module). Two UNet-like subnetworks learn boundary-adaptive constants to guide the solver, yielding a final operator that balances interpretability, boundary fidelity, and efficiency. Empirical results on three datasets under Gaussian noise show competitive segmentation quality, often surpassing pure CNNs and approaching transformer-based methods at reasonable computational cost. The work highlights a principled hybrid pathway for robust segmentation in noise-heavy scenarios.
Abstract
To address the challenge of segmenting noisy images with blurred or fragmented boundaries, this paper presents a robust version of the Variational Model Based Tailored UNet (VM_TUNet), a hybrid framework that integrates variational methods with deep learning. The proposed approach incorporates two physical priors (an edge detector and a mean curvature term) into a modified Cahn-Hilliard equation, aiming to combine the interpretability and boundary-smoothing advantages of variational partial differential equations (PDEs) with the strong representational ability of deep neural networks. The architecture consists of two collaborative modules: an F module, which performs efficient frequency-domain preprocessing to alleviate the risk of poor local minima, and a T module, which ensures accurate and stable local computations, backed by a stability estimate. Extensive experiments on three benchmark datasets indicate that the proposed method achieves a balanced trade-off between performance and computational efficiency: it yields competitive quantitative results and improved visual quality compared to pure convolutional neural network (CNN) based models, while performing close to transformer-based methods at reasonable computational expense.
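For orientation, the ingredients named above have standard textbook forms, sketched below. These are commonly used definitions and not necessarily the exact ones adopted in the paper. The symbols f (input image), u (phase-field segmentation function), G_σ (Gaussian smoothing kernel), β, ε, and the double-well potential W are assumptions introduced here for illustration.

```latex
% Edge detector: close to 0 near strong edges, close to 1 in flat regions
g(|\nabla f|) = \frac{1}{1 + \beta\,|\nabla (G_\sigma * f)|^{2}}, \qquad \beta > 0

% Mean curvature of the level sets of u
\kappa(u) = \nabla \cdot \left( \frac{\nabla u}{|\nabla u|} \right)

% Classical Cahn-Hilliard equation with a double-well potential W
\frac{\partial u}{\partial t} = \Delta\left( W'(u) - \varepsilon^{2} \Delta u \right),
\qquad W(u) = \tfrac{1}{4}\,(u^{2} - 1)^{2}
```

The abstract does not state how g and κ enter the modified equation, so no combined form is given here.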
