Distilled Pooling Transformer Encoder for Efficient Realistic Image Dehazing

Le-Anh Tran; Dong-Chul Park

TL;DR

Experiments on various benchmark datasets show that the proposed DPTE-Net achieves dehazing performance competitive with state-of-the-art methods while maintaining low computational complexity, making it a promising solution for resource-limited applications.

Abstract

This paper proposes DPTE-Net, a lightweight neural network for realistic image dehazing built on a Distilled Pooling Transformer Encoder. Vision transformers (ViTs) have achieved great success in various vision tasks, but the complexity of their self-attention (SA) module scales quadratically with image resolution, which hinders their use on resource-constrained devices. To overcome this, DPTE-Net replaces traditional SA modules with efficient pooling mechanisms, significantly reducing computational demands while preserving the learning capabilities of ViTs. To further enhance semantic feature learning, a distillation-based training process transfers rich knowledge from a larger teacher network to DPTE-Net. Additionally, DPTE-Net is trained within a generative adversarial network (GAN) framework, leveraging the strong generalization of GANs in image restoration, and employs a transmission-aware loss function to adapt dynamically to varying haze densities. Experimental results on various benchmark datasets show that DPTE-Net achieves competitive dehazing performance compared with state-of-the-art methods while maintaining low computational complexity, making it a promising solution for resource-limited applications. The code for this work is available at https://github.com/tranleanh/dpte-net.
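
To make the idea of replacing self-attention with pooling concrete, the following is a minimal NumPy sketch of a pooling-based token mixer. It is an illustration only, not the DPTE-Net implementation: the function name `pooling_token_mixer`, the 3x3 window, the border handling, and the subtraction of the input (a convention used by some pooling-based transformer blocks) are all assumptions that the abstract does not specify.

```python
import numpy as np

def pooling_token_mixer(x, pool_size=3):
    """Mix spatial tokens with local average pooling (hypothetical sketch).

    x: feature map of shape (H, W, C), floating point.
    pool_size: odd window size (assumed value, not from the paper).
    Returns the average-pooled map minus the input, same shape as x.
    """
    pad = pool_size // 2
    h, w, _ = x.shape
    # Zero-pad the spatial dimensions so the output keeps the input size.
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    # Padded mask of ones, used to count valid (non-padded) entries per window.
    ones = np.pad(np.ones((h, w, 1)), ((pad, pad), (pad, pad), (0, 0)))
    total = np.zeros_like(x, dtype=float)
    count = np.zeros((h, w, 1))
    # Sum shifted copies of the map: one pass per window offset.
    for dy in range(pool_size):
        for dx in range(pool_size):
            total += padded[dy:dy + h, dx:dx + w]
            count += ones[dy:dy + h, dx:dx + w]
    # Average over valid entries only, then subtract the input (assumed convention).
    return total / count - x
```

Each output position depends only on a fixed-size neighborhood, so the cost grows linearly with the number of spatial positions, whereas self-attention compares every position with every other and therefore grows quadratically.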
