Unsupervised Discovery of Long-Term Spatiotemporal Periodic Workflows in Human Activities

Fan Yang, Quanting Xie, Atsunori Moteki, Shoichi Masui, Shan Jiang, Kanji Uchino, Yonatan Bisk, Graham Neubig

TL;DR

The paper tackles the challenge of discovering long-term periodic spatiotemporal workflows in human activities, where periods are extended and low-contrast. It introduces a public benchmark of 580 multimodal sequences spanning diverse domains and three evaluation tasks: unsupervised period detection, task completion tracking, and anomaly localization. A training-free baseline combines spatiotemporal tokenization into $K$ clusters, a 2D FFT with context marginalization for initial period estimation, and Multiple Transcript Alignment to obtain precise period boundaries and a unified workflow; it outperforms both unsupervised methods and LLM-based baselines across tasks. The work demonstrates real-world applicability through factory deployment and discusses deployment-cost advantages, laying a foundation for future research in long-term periodic human activity analysis.

Abstract

Periodic human activities with implicit workflows are common in manufacturing, sports, and daily life. While short-term periodic activities -- characterized by simple structures and high-contrast patterns -- have been widely studied, long-term periodic workflows with low-contrast patterns remain largely underexplored. To bridge this gap, we introduce the first benchmark comprising 580 multimodal human activity sequences featuring long-term periodic workflows. The benchmark supports three evaluation tasks aligned with real-world applications: unsupervised periodic workflow detection, task completion tracking, and procedural anomaly detection. We also propose a lightweight, training-free baseline for modeling diverse periodic workflow patterns. Experiments show that: (i) our benchmark presents significant challenges to both unsupervised periodic detection methods and zero-shot approaches based on powerful large language models (LLMs); (ii) our baseline outperforms competing methods by a substantial margin in all evaluation tasks; and (iii) in real-world applications, our baseline demonstrates deployment advantages on par with traditional supervised workflow detection approaches, eliminating the need for annotation and retraining. Our project page is https://sites.google.com/view/periodicworkflow.

Paper Structure

This paper contains 9 sections, 9 equations, 7 figures, 6 tables, 2 algorithms.

Figures (7)

  • Figure 1: Periodic spatiotemporal activity in a yoga example. While existing studies focus on short-term periods with simple structures and high-contrast patterns, we investigate a novel direction on long-term periods that involve complex workflows with low-contrast patterns.
  • Figure 2: Left: Illustration of our benchmark. Top right: Statistics of our dataset. Bottom right: Annotation of our benchmark. We collected 580 long-term workflows of periodic human activities. The dataset is compact yet covers a wide variety of real-world periodic tasks, including factory production, exercise training, and shuttle routes.
  • Figure 3: Our baseline method. In Step 1, we construct an activity transcript using soft tokens to determine the initial period window size ($w$). In Step 2, we apply our sequential mining algorithm to extract the workflow and identify the period boundaries.
  • Figure 4: Ablation studies with various $K$. Larger values improve performance at the expense of higher computational cost.
  • Figure 5: An example of workflows generated for the same activity sequence with different values of $K$. The symbol '_' represents a skipped token. As $K$ increases, the workflows become more detailed and extended.
  • ...and 2 more figures
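
As a rough illustration of Step 1 in Figure 3 (estimating the initial period window size $w$ from a tokenized activity transcript), the sketch below picks the dominant frequency of a token sequence. It is a simplified stand-in and not the paper's implementation: it assumes hard one-hot tokens rather than soft tokens, and it replaces the 2D FFT with context marginalization by a 1D FFT per token channel whose power spectra are summed across channels. The function name `estimate_period` and its parameters are hypothetical.

```python
import numpy as np


def estimate_period(tokens: np.ndarray, num_clusters: int) -> int:
    """Estimate the dominant period (in frames) of an integer token sequence.

    Simplified illustration only: hard one-hot tokens and a per-channel
    1D FFT are assumptions made here, not the method from the paper.
    """
    tokens = np.asarray(tokens, dtype=int)
    T = len(tokens)
    if T < 2:
        raise ValueError("need at least two frames to estimate a period")

    # One-hot encode the transcript: shape (T, K).
    one_hot = np.eye(num_clusters)[tokens]

    # Remove the per-channel mean so the zero-frequency term does not dominate.
    one_hot = one_hot - one_hot.mean(axis=0, keepdims=True)

    # FFT along the time axis for every token channel, then sum the power
    # over channels to get one spectrum for the whole transcript.
    spectrum = np.fft.rfft(one_hot, axis=0)
    power = (np.abs(spectrum) ** 2).sum(axis=1)

    # Skip frequency bin 0 and take the strongest remaining bin.
    # For a constant sequence all bins are zero and this falls back to k = 1.
    k = int(np.argmax(power[1:])) + 1

    # Frequency bin k corresponds to k repetitions over T frames.
    return round(T / k)


# Usage: a transcript that repeats an 8-frame pattern 12 times.
tokens = np.tile([0, 0, 1, 2, 2, 2, 3, 1], 12)
w = estimate_period(tokens, num_clusters=4)
```

The resulting `w` would only serve as an initial window size; refining the period boundaries and extracting the workflow is the job of Step 2, which this sketch does not cover.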