Combining Boundary Supervision and Segment-Level Regularization for Fine-Grained Action Segmentation

Hinako Mitsuoka, Kazuhiro Hotta

Abstract

Recent progress in Temporal Action Segmentation (TAS) has increasingly relied on complex architectures, which can hinder practical deployment. We present a lightweight dual-loss training framework that improves fine-grained segmentation quality with only one additional output channel and two auxiliary loss terms, requiring minimal architectural modification. Our approach combines a boundary-regression loss, which promotes accurate temporal localization via a single-channel boundary prediction, with a CDF-based segment-level regularization loss, which encourages coherent within-segment structure by matching cumulative distributions over predicted and ground-truth segments. The framework is architecture-agnostic and can be integrated into existing TAS models (e.g., MS-TCN, C2F-TCN, FACT) as a training-time loss function. Across three benchmark datasets and three backbone models, the proposed method improves segment-level consistency and boundary quality, yielding higher F1 and Edit scores. Frame-wise accuracy remains largely unchanged, highlighting that precise segmentation can be achieved through simple loss design rather than heavier architectures or inference-time refinements.
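To make the boundary-supervision idea concrete, the following is a minimal sketch of a boundary-regression loss on the extra output channel. The paper's exact formulation is not reproduced here; the Gaussian-smoothed boundary target and the MSE objective are illustrative assumptions (soft boundary targets are a common choice in boundary-aware TAS work), and the function name is hypothetical.

```python
import numpy as np

def boundary_regression_loss(pred_boundary, gt_labels, sigma=2.0):
    """Illustrative boundary-regression loss (hypothetical formulation).

    pred_boundary: (T,) predicted boundary probabilities from the extra channel.
    gt_labels:     (T,) integer ground-truth class label per frame.

    The regression target is a soft boundary mask: a Gaussian bump (width
    `sigma`, in frames) centered at every frame where the class label changes.
    The loss is the frame-wise MSE between the predicted boundary curve and
    this target, encouraging high responses near transitions and low values
    elsewhere.
    """
    T = len(gt_labels)
    # Frames where the ground-truth class changes (segment boundaries).
    boundaries = np.flatnonzero(np.diff(gt_labels)) + 1
    t = np.arange(T)
    target = np.zeros(T)
    for b in boundaries:
        # Take the pointwise max so overlapping bumps do not exceed 1.
        target = np.maximum(target, np.exp(-((t - b) ** 2) / (2 * sigma**2)))
    return float(np.mean((pred_boundary - target) ** 2))
```

In a real training loop this term would be weighted and added to the backbone's frame-classification loss, leaving the rest of the model unchanged.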
Paper Structure

This paper contains 24 sections, 5 equations, 7 figures, 7 tables.

Figures (7)

  • Figure 1: Overview of the proposed method and its performance improvement. The left shows that the proposed method adds a single boundary output channel and introduces two auxiliary training losses that can be integrated into conventional TAS models with minimal modification. The right shows the relationship between the F1 score and the number of parameters on GTEA, illustrating improved performance with minimal parameter overhead.
  • Figure 2: Overview of the proposed training framework. In addition to the model-specific loss, our method introduces two complementary auxiliary losses: a boundary-regression loss using an additional boundary output channel and a CDF-based segment shape regularization loss. The losses are applied selectively to different temporal regions (boundary vs. non-boundary) to reduce optimization conflicts, and can be combined with existing TAS models with minimal architectural modification.
  • Figure 3: Visualization of Boundary-Regression Loss $\mathcal{L}_B$. The additional output channel predicts a boundary probability curve (orange), which is supervised by a binary ground-truth boundary mask (top). The loss encourages high responses around class transitions while maintaining low values elsewhere. Segment labels (bottom) are color-coded for clarity.
  • Figure 4: Visualization of the CDF-based segment shape regularization loss $\mathcal{L}_{\text{S}}$. Each ground-truth segment (top) is compared with its corresponding predicted region (bottom) using cumulative distributions. The loss penalizes structural mismatches such as over-segmentation or fragmented predictions by measuring discrepancies between cumulative probability distributions within each segment.
  • Figure 5: Qualitative comparison on GTEA, 50Salads, and Breakfast using MS-TCN as the backbone. For each dataset, we show the ground-truth (GT), baseline MS-TCN predictions, and MS-TCN trained with the proposed losses.
  • ...and 2 more figures
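The segment-level comparison described in the Figure 4 caption can be sketched as follows. This is a hypothetical instantiation, not the paper's exact loss: within each ground-truth segment, the predicted probability mass for the segment's class is normalized into a distribution over the segment's frames, and its CDF is compared (in L1) against the CDF of the uniform distribution, which is what a uniformly confident, coherent prediction would induce. Fragmented or over-segmented predictions concentrate mass unevenly and are penalized.

```python
import numpy as np

def segment_cdf_loss(pred_probs, gt_labels):
    """Illustrative CDF-based segment regularization (hypothetical formulation).

    pred_probs: (T, C) per-frame class probabilities.
    gt_labels:  (T,) integer ground-truth class label per frame.
    """
    T = len(gt_labels)
    # Segment start indices: frame 0 plus every frame where the class changes.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(gt_labels)) + 1))
    ends = np.concatenate((starts[1:], [T]))
    total = 0.0
    for s, e in zip(starts, ends):
        c = gt_labels[s]                      # the segment's ground-truth class
        p = pred_probs[s:e, c]
        p = p / (p.sum() + 1e-8)              # distribution over segment frames
        cdf_pred = np.cumsum(p)
        cdf_unif = np.arange(1, e - s + 1) / (e - s)  # uniform-distribution CDF
        total += np.mean(np.abs(cdf_pred - cdf_unif))
    return float(total / len(starts))
```

A prediction whose confidence for the correct class is flat across a segment yields a near-zero penalty, while one whose confidence collapses onto a few frames (a typical over-segmentation symptom) does not, which matches the structural mismatches the caption describes.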