Lightweight Adaptive Feature De-drifting for Compressed Image Classification

Long Peng, Yang Cao, Yuejin Sun, Yang Wang

TL;DR

JPEG compression causes block-wise artifacts that induce spatially varying feature drift in deep classifiers. The authors propose a lightweight Adaptive Feature De-drifting (AFD) module with two parts: a Feature Drifting Estimation Network (FDE-Net), which operates in the DCT domain and produces a per-block Feature Drifting Map (FDM), and a Feature Enhancement Network (FE-Net), which uses a RepConv-based architecture and the FDM to map degraded features to high-quality representations. The module can be trained on limited data and plugged into pre-trained networks, where it yields substantial accuracy gains on JPEG-compressed images while remaining efficient enough for mobile use. It is also robust to multiple compressions and to other lossy formats such as AVIF. Across the ImageNet-C, DIV2K-C, and AVIF experiments, AFD outperforms traditional JPEG artifact removal methods and task-driven baselines at lower computational cost and with better generalization, which makes high-level vision practical on constrained devices.

Abstract

JPEG is a widely used compression scheme that efficiently reduces the volume of transmitted images. The information loss produces artifacts across blocks, which not only degrade image quality but also harm subsequent high-level tasks by causing feature drifting. High-level vision models trained on high-quality images suffer performance degradation on compressed images, especially on mobile devices. Numerous learning-based JPEG artifact removal methods have been proposed to handle visual artifacts. However, these methods are not an ideal pre-processing step for compressed image classification, for two reasons: 1. they are designed for human vision rather than for high-level vision models; 2. they are not efficient enough to serve as pre-processing on resource-constrained devices. To address these issues, this paper proposes a novel lightweight Adaptive Feature De-drifting (AFD) module that boosts the performance of pre-trained image classification models on compressed images. First, a Feature Drifting Estimation Network (FDE-Net) is devised to generate the spatial-wise Feature Drifting Map (FDM) in the DCT domain. Next, the estimated FDM is passed to the Feature Enhancement Network (FE-Net), which learns the mapping between degraded features and the corresponding high-quality features. FE-Net uses a simple but effective RepConv block with structural re-parameterization, which enriches the feature representation during training while keeping deployment efficient. After training on a limited number of compressed images, the AFD module can serve as a "plug-and-play" component that improves the performance of pre-trained classification models on compressed images. Experiments demonstrate that the proposed AFD module consistently improves the accuracy of pre-trained classification models and significantly outperforms existing methods.
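
The structural re-parameterization mentioned in the abstract can be illustrated with a minimal single-channel sketch. The abstract does not spell out the branches of the RepConv block, so the layout below (a 3x3 branch, a 1x1 branch, and an identity branch, with no batch normalization) is an assumption borrowed from RepVGG-style blocks, and the function names `conv2d_same`, `train_time_forward`, and `merge_branches` are hypothetical. The point is only that several parallel linear branches used during training can be folded into one 3x3 kernel for deployment.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(x: Matrix, k: Matrix) -> Matrix:
    """Single-channel 2D cross-correlation with zero padding (output size == input size)."""
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, z = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= z < w:  # positions outside the image count as zero
                        acc += x[y][z] * k[di][dj]
            out[i][j] = acc
    return out


def train_time_forward(x: Matrix, k3: Matrix, k1: float) -> Matrix:
    """Assumed training-time block: 3x3 branch + 1x1 branch + identity branch, summed."""
    y3 = conv2d_same(x, k3)
    y1 = conv2d_same(x, [[k1]])
    return [[y3[i][j] + y1[i][j] + x[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]


def merge_branches(k3: Matrix, k1: float) -> Matrix:
    """Fold the 1x1 and identity branches into the centre tap of the 3x3 kernel."""
    merged = [row[:] for row in k3]  # copy so the training-time kernel is left untouched
    merged[1][1] += k1 + 1.0         # 1x1 weight plus the identity's implicit weight of 1
    return merged
```

At deployment, `conv2d_same(x, merge_branches(k3, k1))` gives the same output as `train_time_forward(x, k3, k1)` with a single convolution. A real block would also be multi-channel and would fold batch-normalization statistics into the merged kernel and bias, which this sketch leaves out.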

Paper Structure

This paper contains 26 sections, 8 equations, 11 figures, 10 tables.

Figures (11)

  • Figure 1: The degree of quality degradation varies greatly from region to region and correlates strongly with the richness of detail. Regions with more structural detail lose more high-frequency information, which leads to more severe feature drifting for image recognition.
  • Figure 2: The artifacts induced by JPEG compression lead to feature drifting in pre-trained models. (a) We extract the features of high-quality blocks and compressed blocks from the first convolution layer of a pre-trained RepVGG-A2, then calculate the angle between the degraded features and the corresponding high-quality features for each pair of blocks. Because JPEG compression applies the DCT and quantization to each block individually, the correlation between adjacent blocks is ignored, which results in spatially varying feature drifting. As the degree of compression increases, the inconsistency of feature drifting across regions gradually increases. (b) The feature angle map between the high-quality features and the features enhanced by MemNet [tai2017memnet]. MemNet alleviates feature drifting to some extent for images with a high compression ratio (red box) but has the opposite effect for images with a low compression ratio (black box).
  • Figure 3: Degraded images from DIV2K-C [agustsson2017ntire]. The first row shows single compression at different quality factors (QF). The second row shows multiple compressions, where the QF is randomly selected from 25, 18, 15, 10, and 7.
  • Figure 4: Illustration of the statistical observations. We evaluate four JPEG compression levels (QF = 10, 25, 50, and 60). Each colored line shows the relationship between the DCT coefficient angle and the feature response angle for images at one compression level.
  • Figure 5: The architecture of the proposed Adaptive Feature De-drifting Module (AFD-Module). The network contains two sub-networks: the Feature Enhancement Network (FE-Net) and the Feature Drifting Estimation Network (FDE-Net). The FE-Net is a lightweight U-Net-style network built from RepConv blocks. The FDE-Net takes the rearranged DCT coefficients as input and produces an estimation map used for spatial-wise de-drifting of the degraded features. After training, the AFD-Module is plugged into the corresponding pre-trained model to boost its performance on compressed images.
  • ...and 6 more figures
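
The per-block angle measurement described in the captions of Figures 2 and 4 can be sketched as follows. This is a minimal illustration and not the authors' code: it treats each input as a single-channel 2D grid (pixel values, DCT coefficients, or one feature channel), whereas the paper works with features from the first convolution layer of a pre-trained network. The function names `angle_deg` and `block_angle_map`, the default block size of 8, and the convention of returning 0 for an all-zero block are assumptions made for the example.

```python
import math
from typing import List, Sequence

Grid = List[List[float]]


def angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in degrees between two vectors, computed from their cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0  # assumed convention: an all-zero block is treated as "no drift"
    cos = max(-1.0, min(1.0, dot / (nu * nv)))  # clamp against floating-point overshoot
    return math.degrees(math.acos(cos))


def block_angle_map(hq: Grid, lq: Grid, block: int = 8) -> Grid:
    """One angle per non-overlapping block between a high-quality and a degraded grid."""
    h, w = len(hq), len(hq[0])
    angles: Grid = []
    for top in range(0, h - block + 1, block):       # incomplete border blocks are skipped
        row = []
        for left in range(0, w - block + 1, block):
            u = [hq[top + i][left + j] for i in range(block) for j in range(block)]
            v = [lq[top + i][left + j] for i in range(block) for j in range(block)]
            row.append(angle_deg(u, v))
        angles.append(row)
    return angles
```

Calling `block_angle_map(hq, lq)` on a high-quality grid and its compressed counterpart returns one angle per block. Larger angles mark blocks whose content points in a more different direction after compression, which is the kind of spatial variation the feature drifting map is meant to capture.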