Distributed Seasonal Temporal Pattern Mining

Van Ho-Long, Nguyen Ho, Anh-Vu Dinh-Duc, Ha Manh Tran, Ky Trung Nguyen, Tran Dung Pham, Quoc Viet Hung Nguyen

TL;DR

This work addresses mining seasonal temporal patterns (STPs) from large time-series collections, where traditional measures such as support and confidence cannot capture seasonality and the lack of anti-monotonicity leaves an exponentially large search space. It introduces DSTPM, the first distributed framework for STP mining, which uses distributed hierarchical lookup hash structures (DHLH) and a maxSeason-based pruning criterion to enable scalable, memory-efficient discovery of frequent seasonal patterns. The method operates in two steps, Seasonal Single Event Mining and Seasonal k-event Pattern Mining, using Spark-based processing and multi-level lookup structures to manage candidate events and patterns. Experiments on real-world and synthetic data show significant runtime and memory advantages over sequential baselines and strong scalability with increasing data sizes and cluster resources, underscoring the practical impact for IoT-driven time-series analysis.

Abstract

The explosive growth of IoT-enabled sensors is producing enormous amounts of time series data across many domains, offering valuable opportunities to extract insights through temporal pattern mining. Among these patterns, an important class exhibits periodic occurrences, referred to as seasonal temporal patterns (STPs). However, mining STPs poses challenges, as traditional measures such as support and confidence cannot capture seasonality, and the lack of the anti-monotonicity property results in an exponentially large search space. Existing STP mining methods operate sequentially and therefore do not scale to large datasets. In this paper, we propose Distributed Seasonal Temporal Pattern Mining (DSTPM), the first distributed framework for mining seasonal temporal patterns from time series. DSTPM leverages efficient data structures, specifically distributed hierarchical lookup hash structures, to enable efficient computation. Extensive experimental evaluations demonstrate that DSTPM significantly outperforms sequential baselines in runtime and memory usage, while scaling effectively to very large datasets.

Paper Structure

This paper contains 16 sections, 2 theorems, 1 equation, 9 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Let $P$ and $P'$ be two temporal patterns such that $P' \subseteq P$. Then $\textit{maxSeason}(P') \geq \textit{maxSeason}(P)$.
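
Lemma 1 makes maxSeason usable as a pruning bound: if a sub-pattern's maxSeason falls below the required number of seasons, every pattern that contains it does too. The Python sketch below illustrates that idea on a toy setup and is not the DSTPM implementation. It rests on several assumptions not stated on this page: maxSeason is taken to be the support size divided by a minimum density, a pattern is reduced to a set of event names, and its support is approximated by intersecting the supports of its events. The names `max_season`, `prune_candidates`, `min_density`, and `min_season` are hypothetical.

```python
from itertools import combinations


def max_season(support, min_density):
    # ASSUMPTION: upper bound on the number of seasons, taken here as
    # |support| / min_density. The page does not define maxSeason.
    return len(support) / min_density


def prune_candidates(event_support, min_density, min_season, k):
    """Return k-event candidates whose maxSeason bound meets min_season.

    event_support maps an event name to the set of time granules in which
    the event occurs (a toy stand-in for the real support structure).
    """
    # Step 1: drop single events whose bound is already too low.
    # By Lemma 1, no pattern containing such an event can qualify.
    survivors = {
        event
        for event, support in event_support.items()
        if max_season(support, min_density) >= min_season
    }

    # Step 2: build k-event candidates from surviving events only.
    kept = {}
    for combo in combinations(sorted(survivors), k):
        # Toy support: granules shared by every event in the candidate.
        support = set.intersection(*(event_support[e] for e in combo))
        if max_season(support, min_density) >= min_season:
            kept[combo] = support
    return kept


if __name__ == "__main__":
    events = {"A": {1, 2, 3, 4, 5, 6}, "B": {2, 3, 4, 5}, "C": {9}}
    print(prune_candidates(events, min_density=2, min_season=2, k=2))
```

In this toy run, event `C` is discarded in step 1, so the pairs `(A, C)` and `(B, C)` are never generated; only `(A, B)` is evaluated and kept.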

Figures (9)

  • Figure 1:
  • Figure 3: A hierarchical lookup hash table $DHLH_1$
  • Figure 4: A hierarchical lookup hash table $DHLH_2$
  • Figure 5: Runtime Comparison on RE (real-world)
  • Figure 6: Runtime Comparison on SC (real-world)
  • ...and 4 more figures

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2