
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs

Yunling Zheng, Zeyi Xu, Fanghui Xue, Biao Yang, Jiancheng Lyu, Shuai Zhang, Yingyong Qi, Jack Xin

TL;DR

We propose and demonstrate feature extraction by alternating Fourier-domain and image-domain filtering, an efficient way to build a vision backbone without computationally intensive attention.

Abstract

We propose and demonstrate an alternating Fourier and image domain filtering approach to feature extraction, an efficient way to build a vision backbone without computationally intensive attention. Among lightweight models, it reaches state-of-the-art performance on ImageNet-1K classification and consistently improves results on the downstream tasks of object detection and segmentation. The approach also serves as a new tool for compressing vision transformers (ViTs).
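
To make the idea of alternating between the two domains concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the filter shapes, the use of fixed (non-adaptive) weights, the depthwise 3x3 kernel, and the order of the two steps are all assumptions made for illustration. It only shows one Fourier-domain filter (an elementwise product with per-frequency weights) followed by one image-domain filter (a small spatial convolution).

```python
import numpy as np

def fourier_filter(x, w):
    """Filter a feature map x of shape (H, W, C) in the Fourier domain.

    w holds hypothetical per-frequency weights of shape (H, W // 2 + 1, C).
    """
    X = np.fft.rfft2(x, axes=(0, 1))  # real 2-D FFT over the spatial axes
    return np.fft.irfft2(X * w, s=x.shape[:2], axes=(0, 1))

def image_filter(x, k):
    """Depthwise 3x3 cross-correlation with zero padding.

    k is a hypothetical kernel of shape (3, 3, C), one 3x3 filter per channel.
    """
    H, W, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += padded[i:i + H, j:j + W, :] * k[i, j, :]
    return out

def alternating_block(x, w, k):
    """Assumed ordering: Fourier-domain filter first, then image-domain filter."""
    return image_filter(fourier_filter(x, w), k)

# Sanity check: all-ones frequency weights and a centered delta kernel
# leave the input unchanged (up to floating-point error).
x = np.random.default_rng(0).standard_normal((8, 8, 3))
w = np.ones((8, 8 // 2 + 1, 3))
k = np.zeros((3, 3, 3))
k[1, 1, :] = 1.0
y = alternating_block(x, w, k)
```

The Fourier-domain product acts as a global (circular) convolution over the whole feature map at FFT cost, while the small spatial kernel mixes only local neighbors. In this sketch the weights are fixed arrays; an adaptive variant would compute them from the input.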

Paper Structure (28 sections, 4 equations, 3 figures, 6 tables)


Figures (3)

  • Figure 1: Comparative illustration of the AFFNet block's theoretical framework (a) and its practical application (b), highlighting the discrepancy between the conceptual design and the actual implementation.
  • Figure 2: Illustration of AFIDAF in block and stage views.
  • Figure 3: Overview of HAFIDAF acting on Swin [Swin_2021] and the resulting compressed hierarchical architecture.