
GateAttentionPose: Enhancing Pose Estimation with Agent Attention and Improved Gated Convolutions

Liang Feng, Zhixuan Shen, Lihua Wen, Shiyao Li, Ming Xu

TL;DR

The Agent Attention module replaces large kernel convolutions, improving computational efficiency while preserving global context modeling. The Gate-Enhanced Feedforward Block (GEFB) augments feature extraction and processing, particularly in complex scenes.

Abstract

This paper introduces GateAttentionPose, an approach that enhances the UniRepLKNet architecture for pose estimation. We present two key contributions: the Agent Attention module and the Gate-Enhanced Feedforward Block (GEFB). The Agent Attention module replaces large kernel convolutions, improving computational efficiency while preserving global context modeling. The GEFB augments feature extraction and processing, particularly in complex scenes. Extensive evaluations on the COCO and MPII datasets show that GateAttentionPose matches or outperforms existing state-of-the-art methods, including the original UniRepLKNet, with improved efficiency. Our approach offers a robust solution for pose estimation across diverse applications, including autonomous driving, human motion capture, and virtual reality.
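
To make the efficiency claim concrete, the following is a minimal sketch of the general agent-attention idea, not the authors' implementation. It assumes a small set of agent tokens obtained by average-pooling the queries; the agents first aggregate information from all keys and values, then broadcast it back to every query. The function names (`pool_agents`, `agent_attention`), the pooling scheme, and the single-head, bias-free formulation are all illustrative assumptions; the module described in the paper may include additional terms or branches.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(mat):
    return [list(col) for col in zip(*mat)]

def matmul(a, b):
    # Plain (m x k) @ (k x n) matrix product on nested lists.
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def pool_agents(q, num_agents):
    # Assumption: agent tokens are averages of consecutive query groups.
    n = len(q)
    if n % num_agents != 0:
        raise ValueError("token count must be divisible by num_agents")
    group = n // num_agents
    agents = []
    for i in range(num_agents):
        chunk = q[i * group:(i + 1) * group]
        agents.append([sum(col) / group for col in zip(*chunk)])
    return agents

def agent_attention(q, k, v, num_agents):
    # q, k: N x d lists; v: N x d_v list. Returns an N x d_v list.
    scale = 1.0 / math.sqrt(len(q[0]))
    agents = pool_agents(q, num_agents)  # n x d

    # Step 1 (aggregation): each agent attends over all N keys.
    agg = [[s * scale for s in row] for row in matmul(agents, transpose(k))]
    agent_values = matmul([softmax(r) for r in agg], v)  # n x d_v

    # Step 2 (broadcast): each query attends over the n agents only.
    bc = [[s * scale for s in row] for row in matmul(q, transpose(agents))]
    return matmul([softmax(r) for r in bc], agent_values)  # N x d_v

# Toy usage: 4 tokens, feature dimension 2, 2 agent tokens.
q = [[0.1, 0.2], [0.3, -0.4], [1.0, 0.5], [-0.7, 0.9]]
k = [[0.5, 0.1], [-0.2, 0.3], [0.4, 0.4], [0.0, -1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 3.0]]
out = agent_attention(q, k, v, num_agents=2)
print(len(out), len(out[0]))
```

With N tokens and n agents, both steps cost on the order of N·n·d operations rather than the N²·d of full softmax attention, while every query still receives information from every key through the agents. This is the sense in which a module of this kind can keep a global receptive field at lower cost.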

Paper Structure

This paper contains 17 sections, 5 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Comparison of GateAttentionPose with advanced methods on the COCO test-dev2017 set in terms of model size and precision. The size of each bubble represents the model's input size.
  • Figure 2: The overall network architecture of GateAttentionPose, together with (a) the Downsample Block (DSB), (b) the SENet Block (SEBlock), and (c) the Gate-Enhanced Feedforward Block (GEFB).