A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning
Yuelin Zhang, Pengyu Zheng, Wanquan Yan, Chengyu Fang, Shing Shin Cheng
TL;DR
This paper tackles defocus blur in microscopy by addressing two core challenges: the need for long-range cross-scale attention and the shortage of training data. It introduces a unified framework that combines a Multi-Pyramid Transformer (MPT), which integrates cross-scale window attention (CSWA), intra-scale channel attention (ISCA), and a feature-enhancing feed-forward network (FEFN), with Extended Frequency Contrastive Regularization (EFCR), which learns from different frequency bands and enables cross-domain knowledge transfer. The method is validated on diverse cell and surgical microscopy datasets, including the new CaDISBlur and CataBlur datasets, where it achieves state-of-the-art performance in supervised and unsupervised settings and improves downstream tasks such as cell detection and surgical scene segmentation. The results indicate that explicit multi-scale pyramids and frequency-domain contrastive learning are practical tools for robust microscopy deblurring.
Abstract
Defocus blur is a persistent problem in microscope imaging that harms pathology interpretation and medical intervention in cell microscopy and microscope surgery. To address this problem, a unified framework comprising the multi-pyramid transformer (MPT) and extended frequency contrastive regularization (EFCR) is proposed to tackle two outstanding challenges in microscopy deblur: the need for a longer attention span and data deficiency. The MPT employs an explicit pyramid structure at each network stage that integrates the cross-scale window attention (CSWA), the intra-scale channel attention (ISCA), and the feature-enhancing feed-forward network (FEFN) to capture long-range cross-scale spatial interaction and global channel context. The EFCR addresses the data deficiency problem by exploring latent deblur signals from different frequency bands. It also enables deblur knowledge transfer to learn cross-domain information from extra data, improving deblur performance for labeled and unlabeled data. Extensive experiments and downstream task validation show that the framework achieves state-of-the-art performance across multiple datasets. Project page: https://github.com/PieceZhang/MPT-CataBlur.
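
The abstract does not specify how EFCR forms its frequency bands, so the following is only a minimal illustrative sketch of what splitting an image into frequency bands can look like, not the authors' method. It uses NumPy's FFT with radial masks; the function name `split_frequency_bands` and the cutoff values are assumptions chosen for illustration.

```python
import numpy as np


def split_frequency_bands(image, cutoffs=(0.1, 0.3)):
    """Split a 2-D grayscale image into radial frequency bands.

    `cutoffs` are hypothetical radii in cycles per pixel. With the default
    values the bands cover [0, 0.1), [0.1, 0.3), and [0.3, inf).
    """
    h, w = image.shape
    # Frequency coordinate of every FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    spectrum = np.fft.fft2(image)
    edges = (0.0, *cutoffs, np.inf)

    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Each bin belongs to exactly one band, so the bands sum to the image.
        mask = (radius >= lo) & (radius < hi)
        bands.append(np.fft.ifft2(spectrum * mask).real)
    return bands


# Usage on a synthetic image (random values stand in for a micrograph).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
low, mid, high = split_frequency_bands(img)
print(np.allclose(low + mid + high, img))
```

Because the masks partition the frequency plane, the bands add back up to the input. Defocus blur mainly attenuates the higher bands, which is one intuition for why band-wise signals may be informative for deblurring.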
