Vision Transformer with Key-select Routing Attention for Single Image Dehazing
Lihan Tong, Weijia Li, Qingxia Yang, Liyuan Chen, Peng Chen
TL;DR
Ksformer tackles single-image dehazing by integrating a content-aware Multi-scale Key-select Routing Attention (MKRA) with a Lightweight Frequency Processing Module (LFPM). The MKRA selects the most informative regions via a top-k mechanism across multi-scale windows, enabling long-range dependencies with reduced computation, while the LFPM enhances high-frequency details using minimal parameters. In experiments on synthetic and real hazy datasets, Ksformer delivers state-of-the-art PSNR/SSIM with a compact 5.8M-parameter model, and ablation studies confirm the contribution of MKRA and LFPM. The approach demonstrates that combining content-aware spatial attention with spectral feature emphasis can achieve high dehazing quality efficiently, though further work is needed to reduce GFLOPs for embedded deployment.
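To make the top-k idea concrete, here is a minimal NumPy sketch of attention in which each query keeps only its k highest-scoring keys. It is a generic illustration, not the paper's MKRA: the function name `topk_attention`, the single-head setup, and the absence of windows or multiple scales are all assumptions made for brevity.

```python
import numpy as np

def topk_attention(q, k, v, top_k):
    """Hypothetical single-head attention: each query attends only to its top_k keys.

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v). Not the authors' MKRA implementation.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (n_q, n_k) similarity scores
    # Indices of the top_k largest scores in each row (order within the k is arbitrary).
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]
    sel = np.take_along_axis(scores, idx, axis=-1)     # (n_q, top_k) selected scores
    sel = sel - sel.max(axis=-1, keepdims=True)        # stabilize the softmax
    w = np.exp(sel)
    w /= w.sum(axis=-1, keepdims=True)                 # softmax over selected keys only
    # v[idx] gathers the selected values: (n_q, top_k, d_v)
    return np.einsum("qk,qkd->qd", w, v[idx])
```

With `top_k` equal to the number of keys this reduces to ordinary dense attention; smaller values shrink the softmax and the weighted sum to k entries per query. Note that this sketch still computes the full score matrix before selecting, so any compute saving in a real design would have to come from how candidates are routed, which the summary does not detail.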
Abstract
We present Ksformer, which combines two components: Multi-scale Key-select Routing Attention (MKRA), which intelligently selects key areas through multi-channel, multi-scale windows with a top-k operator, and a Lightweight Frequency Processing Module (LFPM), which enhances high-frequency features. In our tests, Ksformer outperforms other dehazing methods.
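As a rough illustration of what "enhancing high-frequency features" can mean, the sketch below scales the Fourier components of a 2D array above a radial cutoff. It is a fixed, hand-written filter, not the LFPM itself: the function name `boost_high_freq` and the `cutoff` and `gain` parameters are hypothetical, and a learned module would presumably replace the fixed mask with trainable weights.

```python
import numpy as np

def boost_high_freq(img, cutoff=0.1, gain=1.5):
    """Hypothetical high-frequency emphasis on a 2D float array (not the paper's LFPM).

    Fourier components whose normalized radial frequency exceeds `cutoff`
    are multiplied by `gain`; lower frequencies (including DC) are unchanged.
    """
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]        # vertical frequencies in cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]        # horizontal frequencies in cycles/pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)    # radial frequency of each FFT bin
    mask = np.where(radius > cutoff, gain, 1.0)
    # The mask is symmetric in frequency, so the inverse transform of a real
    # input is real up to rounding error; .real drops that residue.
    return np.fft.ifft2(np.fft.fft2(img) * mask).real
```

Because the DC component is left untouched, the mean intensity of the input is preserved while edges and fine texture are amplified.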
