An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques

Chunxiao Li; Xiaoxiao Wang; Boming Miao; Chuanlong Xie; Zizhe Wang; Yao Zhu

TL;DR

This paper addresses the gap between fast, accurate discriminative classifiers and slower, often less accurate diffusion-based zero-shot classifiers by proposing DBMEF, a training-free framework that augments a discriminative model with a diffusion-based rethinking module. A Confidence Protector decides when a prediction needs re-evaluation. For those cases, a diffusion classifier performs conditional denoising under both positive and negative text conditions, combines the two through a negative control factor, and stabilizes the result by voting. The approach yields consistent gains on ImageNet across 17 backbones (CNNs and Transformers), improves robustness to distribution shift and low-resolution data, and substantially reduces diffusion-sampling time relative to prior diffusion classifiers. These results suggest a practical way to use diffusion models to improve discriminative performance without retraining or heavy computation.

Abstract

Image classification serves as the cornerstone of computer vision, traditionally achieved through discriminative models based on deep neural networks. Recent advancements have introduced classification methods derived from generative models, which offer the advantage of zero-shot classification. However, these methods suffer from two main drawbacks: high computational overhead and inferior performance compared to discriminative models. Inspired by the coordinated cognitive processes of rapid-slow pathway interactions in the human brain during visual signal recognition, we propose the Diffusion-Based Discriminative Model Enhancement Framework (DBMEF). This framework seamlessly integrates discriminative and generative models in a training-free manner, leveraging discriminative models for initial predictions and endowing deep neural networks with rethinking capabilities via diffusion models. Consequently, DBMEF can effectively enhance the classification accuracy and generalization capability of discriminative models in a plug-and-play manner. We have conducted extensive experiments across 17 prevalent deep model architectures with different training methods, including both CNN-based models such as ResNet and Transformer-based models like ViT, to demonstrate the effectiveness of the proposed DBMEF. Specifically, the framework yields a 1.51% performance improvement for ResNet-50 on the ImageNet dataset and 3.02% on the ImageNet-A dataset. In conclusion, our research introduces a novel paradigm for image classification, demonstrating stable improvements across different datasets and neural networks. The code is available at https://github.com/ChunXiaostudy/DBMEF.
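
The two-stage flow summarized in the TL;DR (a confidence gate followed by diffusion-based re-evaluation with voting) can be sketched roughly as follows. This is an illustrative sketch only, not the code released in the repository. The function names, the threshold value, the top-k candidate selection, and in particular the scoring rule (positive-condition error minus a weighted negative-condition error) are assumptions made for the example. The `denoise_error` callable is a hypothetical stand-in for a diffusion model's noise-prediction error under a text condition.

```python
import math
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_rethinking(
    logits: Sequence[float],
    denoise_error: Callable[[int, bool], float],
    threshold: float = 0.9,   # assumed gate value, not taken from the paper
    top_k: int = 3,           # assumed number of candidates to re-evaluate
    neg_factor: float = 0.5,  # assumed weight for the negative condition
    num_votes: int = 5,       # assumed number of voting rounds
) -> int:
    """Return a class index, re-evaluating only low-confidence predictions.

    denoise_error(c, positive) is a hypothetical callable returning the
    denoising error for class c under a positive (positive=True) or
    negative (positive=False) text condition.
    """
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    fast_pred = ranked[0]

    # Confidence gate: keep the fast prediction when it is confident enough.
    if probs[fast_pred] >= threshold:
        return fast_pred

    # Slow path: score the top candidates over several rounds and vote.
    candidates = ranked[:top_k]
    votes = {c: 0 for c in candidates}
    for _ in range(num_votes):
        # Assumed rule: a low error under the positive condition and a high
        # error under the negative condition both favor class c.
        scores = {
            c: denoise_error(c, True) - neg_factor * denoise_error(c, False)
            for c in candidates
        }
        votes[min(candidates, key=lambda c: scores[c])] += 1

    # Ties fall back to the discriminative model's ranking order.
    return max(candidates, key=lambda c: votes[c])
```

In this sketch the diffusion model is invoked only when the discriminative model is unsure, which is the property the TL;DR credits for the reduced sampling time. A real `denoise_error` would be stochastic (random timesteps and noise), which is why the example repeats the scoring and takes a vote.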
