SkateboardAI: The Coolest Video Action Recognition for Skateboarding

Hanxiao Chen

TL;DR

The paper tackles automatic recognition of skateboarding tricks from real-world videos by introducing SkateboardAI, a dataset with 15 trick classes collected from diverse sources. It systematically compares uni-modal CNN-LSTM variants, attention-enhanced and Transformer-based pipelines, and a multi-modal I3D architecture for trick classification. The key finding is that a ResNet50-Attention-BiLSTM pipeline achieves about $84\%$ validation accuracy, while the Transformer and I3D approaches underperform and require longer training times. The work demonstrates the feasibility of an AI sports referee for skateboarding and sets the stage for future work on dataset expansion and semi-supervised or unsupervised learning.

Abstract

Impressed by the skateboarding program at the 2021 Tokyo Olympic Games, we are the first to curate an original real-world, in-the-wild video dataset, "SkateboardAI", and we design and implement diverse uni-modal and multi-modal video action recognition approaches to recognize different tricks accurately. For the uni-modal methods, we separately apply (1) CNN and LSTM; (2) CNN and BiLSTM; (3) CNN and BiLSTM with effective attention mechanisms; and (4) a Transformer-based action recognition pipeline. For the multi-modal setting, we investigate the two-stream Inflated-3D (I3D) architecture on the "SkateboardAI" dataset and compare its performance with the uni-modal cases. In sum, our objective is to develop an excellent AI sports referee for the coolest skateboarding competitions.
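
To make the idea of attention over video frames concrete, the following is a minimal sketch of attention pooling, assuming each frame has already been turned into a feature vector by a CNN. It is not the paper's implementation: the dot-product scoring, the `query` vector, and the function names are hypothetical choices made for illustration, and the BiLSTM stage is omitted.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse per-frame feature vectors into one clip-level vector.

    Each frame is scored by its dot product with `query` (a hypothetical
    stand-in for a learned parameter), the scores are normalized with
    softmax, and the frames are averaged using those weights.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


# Toy input: 3 frames with 2-dimensional features
# (real CNN features would be far larger).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # a zero query scores every frame equally
pooled = attention_pool(frames, query)
```

With a zero query every frame receives the same weight, so the result is the plain mean of the frames; a non-zero query shifts weight toward the frames whose features align with it, which is how attention lets a classifier focus on the frames where a trick actually happens.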

Paper Structure (11 sections, 12 figures, 4 tables)

This paper contains 11 sections, 12 figures, 4 tables.

Introduction
SkateboardAI Datasets
Diverse Approaches
CNN-LSTM
CNN-BiLSTM
CNN-BiLSTM-Attention
CNN-Attention-BiLSTM
Transformer-based Method
I3D Multi-modal Method
Experiments
Conclusions

Figures (12)

Figure 1: 15 different classes in "SkateboardAI".
Figure 2: Mean video duration for "SkateboardAI".
Figure 3: Video frame number distribution for "SkateboardAI".
Figure 4: CNN-LSTM action recognition pipeline.
Figure 5: CNN-BiLSTM action recognition pipeline.
...and 7 more figures
