HyLiFormer: Hyperbolic Linear Attention for Skeleton-based Human Action Recognition

Yue Li; Haoxuan Qu; Mengyuan Liu; Jun Liu; Yujun Cai

TL;DR

HyLiFormer targets the quadratic computational cost of Transformer-based skeleton human action recognition (HAR) with a hyperbolic linear attention framework. A Hyperbolic Transformation with Curvatures (HTC) module maps Euclidean skeleton data into the Poincaré model, and a Hyperbolic Linear Attention (HLA) module performs attention there, achieving $O(N)$ complexity in both theory and practice while modeling hierarchical joint structure. Experiments on NTU RGB+D and NTU RGB+D 120 show competitive accuracy (e.g., ~87.5% on X-Sub120) with substantially reduced training time. Ablations identify $\kappa=-1$ as the best-performing curvature and highlight the limitations of directly applying Euclidean linear attention to skeleton HAR. The result is a scalable, geometry-aware Transformer for real-world HAR applications that demonstrates the benefit of combining hyperbolic geometry with linear attention for hierarchical sequence data.
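To make the Euclidean-to-Poincaré step concrete, the sketch below uses the standard exponential map at the origin of a Poincaré ball with curvature $\kappa<0$, $\exp_0(v) = \tanh(\sqrt{c}\,\|v\|)\, v / (\sqrt{c}\,\|v\|)$ with $c=-\kappa$, together with its inverse. This is an assumption for illustration: the summary does not spell out how HTC is defined, and the function names `exp_map_origin` and `log_map_origin` are hypothetical.

```python
import math


def exp_map_origin(v, kappa=-1.0):
    """Map a Euclidean vector v onto the Poincare ball of curvature kappa < 0.

    Standard exponential map at the origin; NOT necessarily the paper's HTC.
    The result always has norm below the ball radius 1 / sqrt(-kappa).
    """
    if kappa >= 0:
        raise ValueError("kappa must be negative for hyperbolic space")
    c = -kappa
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]  # the origin maps to itself
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def log_map_origin(y, kappa=-1.0):
    """Inverse of exp_map_origin: pull a ball point back to Euclidean space."""
    if kappa >= 0:
        raise ValueError("kappa must be negative for hyperbolic space")
    c = -kappa
    norm = math.sqrt(sum(x * x for x in y))
    if norm == 0.0:
        return [0.0 for _ in y]
    scale = math.atanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in y]


# Hypothetical 3D joint coordinate, used only to exercise the functions.
joint = [0.3, -1.2, 0.8]
on_ball = exp_map_origin(joint, kappa=-1.0)
recovered = log_map_origin(on_ball, kappa=-1.0)
```

With $\kappa=-1$ the mapped point lies strictly inside the unit ball, and a more negative $\kappa$ shrinks the ball radius $1/\sqrt{-\kappa}$.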

Abstract

Transformers have demonstrated remarkable performance in skeleton-based human action recognition, yet their quadratic computational complexity remains a bottleneck for real-world applications. To mitigate this, linear attention mechanisms have been explored, but they struggle to capture the hierarchical structure of skeleton data. Meanwhile, the Poincaré model, a typical model of hyperbolic geometry, offers a powerful framework for modeling hierarchical structures but lacks well-defined operations for existing mainstream linear attention. In this paper, we propose HyLiFormer, a novel hyperbolic linear attention Transformer tailored for skeleton-based action recognition. Our approach incorporates a Hyperbolic Transformation with Curvatures (HTC) module to map skeleton data into hyperbolic space and a Hyperbolic Linear Attention (HLA) module for efficient long-range dependency modeling. Theoretical analysis and extensive experiments on the NTU RGB+D and NTU RGB+D 120 datasets demonstrate that HyLiFormer significantly reduces computational complexity while preserving model accuracy, making it a promising solution for efficiency-critical applications.
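The gap between quadratic and linear attention can be illustrated with ordinary Euclidean kernelized attention. The sketch below is not the paper's HLA module: it assumes the common feature map $\phi(x)=\mathrm{elu}(x)+1$ and plain Python lists, and all function names are hypothetical. It shows that summing $\phi(k_j)v_j^\top$ over the keys once gives the same output as building all $N \times N$ scores, at a cost linear in $N$.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1, applied elementwise (an assumption)."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """Reference form: every query scores every key, O(N^2) in sequence length."""
    fk = [phi(k) for k in K]
    out = []
    for q in Q:
        fq = phi(q)
        w = [dot(fq, k) for k in fk]  # N scores for this query
        total = sum(w)
        out.append([sum(wj * v[c] for wj, v in zip(w, V)) / total
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Same output, O(N): accumulate key/value summaries once, then reuse them."""
    d, dv = len(K[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # sum_j phi(k_j) v_j^T
    z = [0.0] * d                       # sum_j phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for c in range(dv):
                S[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        denom = dot(fq, z)
        out.append([sum(fq[a] * S[a][c] for a in range(d)) / denom
                    for c in range(dv)])
    return out
```

The reordering relies on the scores factoring as an inner product of feature maps in Euclidean space, which is the kind of operation the abstract says is not well defined in the Poincaré model for mainstream linear attention.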
