UMSPU: Universal Multi-Size Phase Unwrapping via Mutual Self-Distillation and Adaptive Boosting Ensemble Segmenters

Lintong Du, Huazhen Liu, Yijia Zhang, ShuXin Liu, Yuan Qu, Zenghui Zhang, Jiamiao Yang

TL;DR

UMSPU addresses high-resolution phase unwrapping with two core innovations: Mutual Self-Distillation (MSD) for cross-layer, cross-resolution semantic learning, and an adaptive boosting ensemble of segmenters that covers a broad range of spatial frequencies. A curl loss complements both by encouraging irrotational gradient fields. Phase reconstruction uses a Discrete Cosine Transform (DCT) to recover wrap counts from the predicted wrap count gradients, enabling accurate unwrapping from 256×256 to 2048×2048 images (an 8-fold range in side length, 64-fold in pixel count) at high throughput (approximately $22.66$ ms per high-resolution image, corresponding to over 40 FPS). Empirical results show that UMSPU outperforms six baselines across resolutions and fringe densities, generalizes well to structured light and InSAR data, and remains robust under translation and rotation. The approach offers a practical, scalable solution for industrial-scale phase measurement and moves deep-learning-based phase unwrapping toward real-world deployment.
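The reconstruction step summarized above (wrap count gradients in, wrap counts out, then adding $2\pi$ times the wrap count to the wrapped phase) can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes the standard unweighted least-squares formulation solved as a Poisson problem with a DCT, it uses ground-truth wrap count gradients where the network's predictions would go, and the names `dct_matrix`, `wrap_count_from_gradients`, and `demo` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, C.T inverts it."""
    idx = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * idx[:, None] * (2 * idx[None, :] + 1) / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def wrap_count_from_gradients(gx, gy):
    """Least-squares wrap count (relative to pixel (0, 0)) from its gradients.

    gx[i, j] ~ k[i, j+1] - k[i, j], shape (M, N-1)
    gy[i, j] ~ k[i+1, j] - k[i, j], shape (M-1, N)
    """
    m, n = gx.shape[0], gy.shape[1]
    # Divergence of the gradient field with zero-flux (Neumann) boundaries.
    rho = np.zeros((m, n))
    rho[:, :-1] += gx
    rho[:, 1:] -= gx
    rho[:-1, :] += gy
    rho[1:, :] -= gy
    # The DCT-II basis diagonalizes the Neumann Laplacian, so the Poisson
    # equation becomes a pointwise division in the DCT domain.
    cm, cn = dct_matrix(m), dct_matrix(n)
    spec = cm @ rho @ cn.T
    denom = 2.0 * (np.cos(np.pi * np.arange(m) / m)[:, None]
                   + np.cos(np.pi * np.arange(n) / n)[None, :] - 2.0)
    denom[0, 0] = 1.0          # avoid 0/0; the constant term is set below
    spec = spec / denom
    spec[0, 0] = 0.0           # the solution is only defined up to a constant
    k = cm.T @ spec @ cn
    return np.rint(k - k[0, 0])

def demo(size=64):
    """Wrap a smooth synthetic phase and unwrap it again."""
    y, x = np.mgrid[0:size, 0:size] / size
    phase = 20.0 * np.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / 0.08) + 6.0 * x
    wrapped = np.angle(np.exp(1j * phase))
    k_true = np.rint((phase - wrapped) / (2 * np.pi))
    # Stand-in for the network output: exact wrap count gradients.
    gx = np.diff(k_true, axis=1)
    gy = np.diff(k_true, axis=0)
    k_hat = wrap_count_from_gradients(gx, gy)
    unwrapped = wrapped + 2 * np.pi * k_hat
    return phase, unwrapped
```

Because the solve only fixes the wrap count up to a constant, the sketch anchors it at pixel (0, 0); the recovered phase therefore matches the true phase up to a global multiple of $2\pi$. The explicit DCT matrices keep the example dependency-free but scale poorly; a fast DCT routine would be the natural choice at 2048×2048.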

Abstract

Spatial phase unwrapping is a key technique for extracting phase information to obtain 3D morphology and other features. Modern industrial measurement scenarios demand high precision, large image sizes, and high speed. However, conventional methods struggle with noise resistance and processing speed. Current deep learning methods are limited by receptive field size and sparse semantic information, making them ineffective for large images. To address this issue, we propose a mutual self-distillation (MSD) mechanism and adaptive boosting ensemble segmenters to construct a universal multi-size phase unwrapping network (UMSPU). MSD performs hierarchical attention refinement and achieves cross-layer collaborative learning through bidirectional distillation, ensuring fine-grained semantic representation across image sizes. The adaptive boosting ensemble segmenters combine weak segmenters with different receptive fields into a strong one, ensuring stable segmentation across spatial frequencies. Experimental results show that UMSPU overcomes image size limitations, achieving high precision across image sizes ranging from 256×256 to 2048×2048 (an 8-fold increase in side length). It also outperforms existing methods in speed, robustness, and generalization. Its practicality is further validated in structured light imaging and InSAR. We believe that UMSPU offers a universal solution for phase unwrapping, with broad potential for industrial applications.
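To make the idea of combining weak segmenters into a strong one concrete, the following sketch applies the generic multi-class AdaBoost (SAMME) rule to per-pixel label maps. This is an assumption chosen for illustration: the abstract does not state the exact weighting rule UMSPU uses, the differing receptive fields are not modelled, and `samme_round`, `strong_segmenter`, and `demo` are hypothetical names.

```python
import numpy as np

NUM_CLASSES = 3  # wrap count gradient classes 0, +1, -1, encoded as labels 0, 1, 2

def samme_round(pred, target, weights):
    """One boosting round: segmenter weight alpha and updated pixel weights."""
    miss = pred != target
    err = np.sum(weights * miss) / np.sum(weights)
    err = np.clip(err, 1e-10, 1 - 1e-10)
    # SAMME weight: positive whenever the segmenter beats random guessing.
    alpha = np.log((1 - err) / err) + np.log(NUM_CLASSES - 1)
    # Misclassified pixels gain weight so the next segmenter focuses on them.
    new_w = weights * np.exp(alpha * miss)
    return alpha, new_w / new_w.sum()

def strong_segmenter(preds, alphas):
    """Weighted vote over the weak segmenters' per-pixel labels."""
    votes = np.zeros((NUM_CLASSES,) + preds[0].shape)
    for pred, alpha in zip(preds, alphas):
        for c in range(NUM_CLASSES):
            votes[c] += alpha * (pred == c)
    return votes.argmax(axis=0)

def demo():
    """Three weak segmenters, each wrong on a different third of the image."""
    rng = np.random.default_rng(0)
    target = rng.integers(0, NUM_CLASSES, size=(30, 30))
    preds = []
    for s in range(3):
        pred = target.copy()
        rows = slice(10 * s, 10 * (s + 1))
        pred[rows] = (target[rows] + 1) % NUM_CLASSES  # deliberately wrong band
        preds.append(pred)
    weights = np.full(target.shape, 1.0 / target.size)
    alphas = []
    for pred in preds:
        alpha, weights = samme_round(pred, target, weights)
        alphas.append(alpha)
    fused = strong_segmenter(preds, alphas)
    return target, preds, fused, alphas
```

In this toy setup each weak segmenter labels only two thirds of the pixels correctly, yet the weighted vote recovers every pixel because the errors fall in different regions. That is the intuition behind pairing segmenters that fail at different spatial frequencies.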

Paper Structure

This paper contains 20 sections, 21 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: In the wrap count gradient method, the neural network classifies the wrap count gradient (0, +1, or -1) at each point. The wrap count is then restored by the least squares method; multiplying it by $2\pi$ and adding the result to the wrapped phase yields the unwrapped phase.
  • Figure 2: UMSPU comprises two components: (a) Semantic Information Extraction via Mutual Self-Distillation: This mechanism leverages mutual self-distillation (MSD) to perform attention distillation on feature maps from the encoder and decoder at the same size, enabling mutual representation learning. As shown in (c), MSD extracts attention maps from the encoder and decoder feature maps, applies a softmax operation along the spatial dimension to generate attention soft labels, and computes bidirectional KL divergence for mutual distillation. (b) Adaptive Boosting Ensemble Segmenters: This component employs an adaptive boosting strategy to integrate three weak segmenters with different dilation rates into a strong segmenter, accommodating semantic features of varying spatial frequencies. The intermediate feature maps generated by (a) are passed to (b) to produce the final gradient segmentation result.
  • Figure 3: (a) Comparison of the attention maps of E1, E2, D2 and D1 before and after adding MSD with a 256×256 (small) input; (b) the same comparison with a 1024×1024 (large) input.
  • Figure 4: (a) In Adaptive Boosting, weak segmenter and sample weights are updated in every training round. (b) Three weak segmenters are paired into three training groups, with two sub-segmenters alternately selected for each training batch.
  • Figure 5: The curl estimation method uses two fixed convolution kernels to find curl points. After the convolution operation with these two fixed-weight convolutions, the points with values of 2 or -2 are identified as curl points.
  • ...and 6 more figures
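
The MSD computation described in the Figure 2 caption (attention maps from same-size encoder and decoder features, a softmax along the spatial dimension, then bidirectional KL divergence) can be sketched as follows. This is a forward-pass illustration only, not the authors' code: the caption does not say how attention maps are extracted, so the channel-wise mean of squared activations used here is an assumption, and `attention_map`, `spatial_softmax`, `kl`, and `msd_loss` are hypothetical names.

```python
import numpy as np

def attention_map(feat):
    """Collapse a (C, H, W) feature map to an (H, W) attention map.

    Assumption: channel-wise mean of squared activations.
    """
    return np.mean(feat ** 2, axis=0)

def spatial_softmax(att):
    """Softmax over all spatial positions, returned as a flat distribution."""
    flat = att.reshape(-1)
    flat = flat - flat.max()   # subtract the max for numerical stability
    e = np.exp(flat)
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def msd_loss(enc_feat, dec_feat):
    """Bidirectional KL between encoder and decoder attention soft labels."""
    p = spatial_softmax(attention_map(enc_feat))
    q = spatial_softmax(attention_map(dec_feat))
    return kl(p, q) + kl(q, p)
```

Collapsing the channel axis first means the encoder and decoder features only need to share spatial size, not channel count. In training, a loss of this form would be computed in an autograd framework for each encoder/decoder pair of equal size, for example E1 with D1 and E2 with D2.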