Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
Long Tang, Guoquan Zhen, Jie Hao, Jianbo Zhang, Huiyu Duan, Liang Yuan, Guangtao Zhai
TL;DR
Life-IQA tackles blind image quality assessment by showing that deep features dominate quality prediction and by designing a decoder that exploits this finding with a GCN-enhanced layer interaction module and a MoE-based feature decoupling head. The method couples Stage4-derived queries with Stage3-derived keys and values through cross-attention, then partitions the fused representation into distortion-specific cues via a sparse MoE head to produce robust quality scores. Extensive experiments on seven BIQA benchmarks demonstrate state-of-the-art performance with favorable accuracy-efficiency tradeoffs and strong cross-dataset generalization, supported by comprehensive ablations and visual analyses. The approach provides practical insights into deep-feature contributions for BIQA and offers a data-efficient decoding paradigm for distortion-aware quality estimation.
Abstract
Blind image quality assessment (BIQA) plays a crucial role in evaluating and optimizing visual experience. Most existing BIQA approaches fuse shallow and deep features extracted from backbone networks while overlooking their unequal contributions to quality prediction. Moreover, although various vision encoder backbones are widely adopted in BIQA, effective quality decoding architectures remain underexplored. To address these limitations, this paper investigates the contributions of shallow and deep features to BIQA and proposes an effective quality feature decoding framework via GCN-enhanced \underline{l}ayer \underline{i}nteraction and MoE-based \underline{f}eature d\underline{e}coupling, termed \textbf{Life-IQA}. Specifically, the GCN-enhanced layer interaction module uses the GCN-enhanced deepest-layer features as the query and the penultimate-layer features as the key and value, then performs cross-attention to achieve feature interaction. In addition, a MoE-based feature decoupling module decouples the fused representation through different experts specialized for specific distortion types or quality dimensions. Extensive experiments demonstrate that Life-IQA achieves a more favorable balance between accuracy and cost than a vanilla Transformer decoder and attains state-of-the-art performance on multiple BIQA benchmarks. The code is available at: \href{https://github.com/TANGLONG2/Life-IQA/tree/main}{\texttt{Life-IQA}}.
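To make the decoding pipeline described above more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes single-head attention, one graph-convolution step over a cosine-similarity token graph, mean pooling of the fused tokens, linear experts, and top-k gating. All function names, dimensions, and random weights (`gcn_layer`, `cross_attention`, `sparse_moe_score`, `d = 16`, four experts) are hypothetical choices for illustration; the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gcn_layer(x, w):
    """One graph-convolution step: relu(A_hat @ x @ w).
    Assumption: the graph is built from cosine similarity between tokens."""
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    adj = np.maximum(xn @ xn.T, 0.0)            # non-negative edges; diagonal acts as self-loop
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1) + 1e-8)
    a_hat = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(a_hat @ x @ w, 0.0)

def cross_attention(q_tokens, kv_tokens, wq, wk, wv):
    """Single-head cross-attention: queries from one layer, keys/values from another."""
    q, k, v = q_tokens @ wq, kv_tokens @ wk, kv_tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v, attn

def sparse_moe_score(x, w_gate, w_experts, top_k=2):
    """Route a pooled feature vector to the top_k experts and mix their scalar outputs."""
    logits = x @ w_gate                          # one gating logit per expert
    top = np.argsort(logits)[-top_k:]            # indices of the k largest logits
    gates = np.zeros_like(logits)
    gates[top] = softmax(logits[top])            # renormalize over selected experts only
    expert_out = w_experts[top] @ x              # evaluate only the selected (linear) experts
    return float(gates[top] @ expert_out), gates

rng = np.random.default_rng(0)
d, n4, n3, n_experts = 16, 49, 196, 4            # hypothetical sizes
stage4 = rng.standard_normal((n4, d))            # deepest-layer tokens
stage3 = rng.standard_normal((n3, d))            # penultimate-layer tokens, assumed projected to d
w_gcn, wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
w_gate = rng.standard_normal((d, n_experts))
w_experts = rng.standard_normal((n_experts, d))

queries = gcn_layer(stage4, w_gcn)               # GCN-enhanced deepest-layer features
fused, attn = cross_attention(queries, stage3, wq, wk, wv)
score, gates = sparse_moe_score(fused.mean(axis=0), w_gate, w_experts, top_k=2)
```

In this sketch each of the 49 query tokens attends over all 196 penultimate-layer tokens, and only two of the four experts contribute to the final scalar. A trained model would learn the weights and might use multi-head attention, nonlinear experts, or a different pooling step.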
