Mining Weighted Sequential Patterns in Incremental Uncertain Databases

Kashob Kumar Roy, Md Hasibul Haque Moon, Md Mahmudur Rahman, Chowdhury Farhan Ahmed, Carson Kai-Sang Leung

TL;DR

This work addresses mining weighted sequential patterns in uncertain, incrementally growing databases by introducing a framework (FUWS) that uses tight upper bounds ($expSup^{cap}$, $wgt^{cap}$) and a hierarchical USeq-Trie to prune candidates. It also presents two incremental strategies, uWSInc and uWSInc+, that reuse prior mined results and maintain additional sets (SFS, PFS) to improve completeness under large or evolving increments. The approach is validated with extensive experiments showing reduced candidate generation, improved runtime, and higher completeness than baselines, across diverse datasets and uncertainty distributions. The proposed methods have practical impact for real-time analytics in domains like medicine, social networks, and transportation where data arrive with uncertainty and evolve over time, enabling scalable, accurate pattern discovery.

Abstract

Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases has drawn the attention of researchers. Moreover, frequent sequences of items in these databases need to be discovered to extract meaningful, high-impact knowledge. In many real cases, weights of items and patterns are introduced as a measure of importance to find interesting sequences. Hence, a weight constraint needs to be handled while mining sequential patterns. Besides, due to the dynamic nature of databases, mining important information has become more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms utilize previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to mining precise databases. Therefore, in this work we have developed an algorithm to mine frequent sequences in an uncertain database. Furthermore, we have proposed two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation. The analysis showed the efficiency of our proposed framework.

Paper Structure

This paper contains 34 sections, 6 theorems, 10 equations, 19 figures, 6 tables, 4 algorithms.

Key Result

Lemma 1

The $expSup^{cap}$ of a sequential pattern is always greater than or equal to the actual expected support of that pattern.
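
To make the lemma concrete, here is a minimal Python sketch of an upper bound that never falls below the expected support. It rests on assumptions that are not taken from this summary: an uncertain sequence is modelled as a list of events mapping items to existence probabilities, items are treated as independent, and the support of a pattern in one sequence is the maximum probability over all of its occurrences. The paper's exact definition of $expSup^{cap}$ is not reproduced here, so `seq_support_cap` uses a simpler, looser stand-in (the product of each item's maximum probability in the sequence). All function names and the toy database are hypothetical.

```python
from math import prod

def match_prob(event, itemset):
    """Probability that every item of `itemset` occurs in `event` (independence assumed)."""
    if not all(item in event for item in itemset):
        return 0.0
    return prod(event[item] for item in itemset)

def seq_support(sequence, pattern):
    """Maximum probability over all occurrences of `pattern` in one uncertain sequence."""
    n, m = len(sequence), len(pattern)
    # best[i][j]: best probability of matching pattern[i:] inside sequence[j:]
    best = [[0.0] * (n + 1) for _ in range(m + 1)]
    best[m] = [1.0] * (n + 1)  # the empty pattern always matches
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            use = match_prob(sequence[j], pattern[i]) * best[i + 1][j + 1]
            best[i][j] = max(best[i][j + 1], use)
    return best[0][0]

def seq_support_cap(sequence, pattern):
    """Simplified upper bound (an assumption, not the paper's formula):
    multiply the largest probability of each pattern item anywhere in the sequence."""
    cap = 1.0
    for itemset in pattern:
        for item in itemset:
            cap *= max((event.get(item, 0.0) for event in sequence), default=0.0)
    return cap

def expected_support(db, pattern):
    return sum(seq_support(s, pattern) for s in db)

def expected_support_cap(db, pattern):
    return sum(seq_support_cap(s, pattern) for s in db)

# Hypothetical toy database: three uncertain sequences.
db = [
    [{"a": 0.9, "b": 0.6}, {"c": 0.7}, {"a": 0.4, "c": 0.8}],
    [{"a": 0.5}, {"b": 0.3, "c": 0.9}],
    [{"c": 0.9}, {"a": 0.8}, {"c": 0.2}],
]
pattern = [("a",), ("c",)]  # the pattern <(a)(c)>

print(expected_support(db, pattern), expected_support_cap(db, pattern))
```

The bound holds because every occurrence multiplies item probabilities that are each at most that item's maximum in the sequence. In the third sequence the bound is strictly larger than the true support, since the most probable `c` appears before `a` and so cannot be used in any occurrence.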

Figures (19)

  • Figure 1: Storing frequent sequences in a USeq-Trie
  • Figure 2: After insertion of patterns
  • Figure 3: After deleting the $\langle (ab)(c) \rangle$ pattern
  • Figure 4: Pattern Maintenance and WES calculation using USeq-Trie
  • Figure 5: Determination of sequences in our proposed $uWSInc+$ architecture.
  • ...and 14 more figures
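
Figures 1 to 3 concern storing, inserting, and deleting patterns in a USeq-Trie. The sketch below shows one way such a trie could behave; it is an illustration, not the paper's data structure. The edge encoding (an item paired with `"s"` for starting a new event or `"i"` for joining the current event), the per-node support field, and the names `PatternTrie`, `TrieNode`, and `edges` are all assumptions made for this example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}       # (item, kind) -> TrieNode
        self.is_pattern = False  # True if the path from the root is a stored pattern
        self.support = 0.0       # value kept for the stored pattern

def edges(pattern):
    """Flatten a pattern such as [("a", "b"), ("c",)] into edge labels.
    kind "s": the item starts a new event; kind "i": it joins the current event."""
    for itemset in pattern:
        for k, item in enumerate(sorted(itemset)):
            yield (item, "s" if k == 0 else "i")

class PatternTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, pattern, support):
        node = self.root
        for label in edges(pattern):
            node = node.children.setdefault(label, TrieNode())
        node.is_pattern = True
        node.support = support

    def lookup(self, pattern):
        node = self.root
        for label in edges(pattern):
            node = node.children.get(label)
            if node is None:
                return None
        return node.support if node.is_pattern else None

    def delete(self, pattern):
        labels = list(edges(pattern))
        path = [self.root]
        for label in labels:
            nxt = path[-1].children.get(label)
            if nxt is None:
                return False
            path.append(nxt)
        if not path[-1].is_pattern:
            return False
        path[-1].is_pattern = False
        path[-1].support = 0.0
        # Prune nodes that no longer lead to any stored pattern, bottom-up.
        for depth in range(len(labels), 0, -1):
            node = path[depth]
            if node.children or node.is_pattern:
                break
            del path[depth - 1].children[labels[depth - 1]]
        return True

trie = PatternTrie()
trie.insert([("a",)], 2.1)
trie.insert([("a", "b")], 1.4)
trie.insert([("a", "b"), ("c",)], 0.9)
trie.insert([("a",), ("c",)], 1.3)
deleted = trie.delete([("a", "b"), ("c",)])  # mirrors the deletion in Figure 3
```

Patterns that share a prefix share the nodes along that prefix, so deleting one pattern removes only the branch that no other stored pattern depends on.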

Theorems & Definitions (14)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Lemma 1
  • Lemma 2
  • Definition 5
  • Definition 6
  • Definition 7
  • Lemma 3
  • ...and 4 more