SEAnet: A Deep Learning Architecture for Data Series Similarity Search

Qitong Wang, Themis Palpanas

TL;DR

This work proposes Deep Embedding Approximation (DEA), a novel family of data series summarization techniques based on deep neural networks, and describes SEAnet, a novel architecture designed specifically for learning DEA, which introduces the Sum of Squares preservation property into the deep network design.

Abstract

Similarity search is a key operation in the analysis of massive data series collections. According to recent studies, SAX-based indexes offer state-of-the-art performance for similarity search tasks. However, their performance lags on datasets that are high-frequency, weakly correlated, or excessively noisy, or that have other dataset-specific properties. In this work, we propose Deep Embedding Approximation (DEA), a novel family of data series summarization techniques based on deep neural networks. Moreover, we describe SEAnet, a novel architecture designed specifically for learning DEA, which introduces the Sum of Squares preservation property into the deep network design. We further enhance SEAnet with the SEAtrans encoder. Finally, we propose novel sampling strategies, SEAsam and SEAsamE, that allow SEAnet to train effectively on massive datasets. Comprehensive experiments on 7 diverse synthetic and real datasets verify the advantages of DEA learned using SEAnet in providing high-quality data series summarizations and similarity search results.

Paper Structure

This paper contains 16 sections, 2 theorems, 10 equations, 19 figures, 3 tables, and 2 algorithms.

Key Result

Lemma 1

Given a z-normalized series dataset $\mathcal{S}$ of size $n$ and its DEAs $\mathcal{E}$, let $\mathcal{E}'$ be derived by z-normalizing each DEA $e^i$ in $\mathcal{E}$ with its mean $\overline{e^i}$ and standard deviation $\sigma_{e^i}$, and then multiplying it by $\frac{\sqrt{m}}{\sqrt{l}}$. Then the Sum of Squares (SoS) of $\mathcal{E}'$ is the same as that of $\mathcal{S}$ (without loss of generality, we assume $\sigma_{e^i}$ ...
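
The arithmetic behind the lemma can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes that $m$ is the series length and $l$ the DEA length (neither is defined in this excerpt), that z-normalization uses the population standard deviation, and it uses random vectors as stand-ins for DEAs rather than outputs of SEAnet. The helper names are hypothetical. Under those assumptions each rescaled DEA has a sum of squares of $l \cdot \frac{m}{l} = m$, the same as a z-normalized series of length $m$.

```python
import numpy as np


def z_normalize(x: np.ndarray) -> np.ndarray:
    """Z-normalize each row with the population standard deviation (ddof=0)."""
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)  # assumed non-zero for every row
    return (x - mean) / std


def rescale_embeddings(embeddings: np.ndarray, series_length: int) -> np.ndarray:
    """Z-normalize each embedding, then multiply by sqrt(m) / sqrt(l)."""
    embedding_length = embeddings.shape[1]
    return np.sqrt(series_length / embedding_length) * z_normalize(embeddings)


def sum_of_squares(x: np.ndarray) -> float:
    """Sum of squared values over the whole collection."""
    return float(np.sum(x ** 2))


rng = np.random.default_rng(0)
n, m, l = 100, 256, 16  # dataset size, series length, embedding length (assumed roles)

# Z-normalized random-walk series stand in for the dataset S.
series = z_normalize(rng.standard_normal((n, m)).cumsum(axis=1))

# Random vectors stand in for the DEAs E; a trained network would produce these.
embeddings = rng.standard_normal((n, l))
rescaled = rescale_embeddings(embeddings, m)

sos_series = sum_of_squares(series)
sos_rescaled = sum_of_squares(rescaled)
print(np.isclose(sos_series, sos_rescaled))
```

Both totals come out to $n \cdot m$ up to floating-point error, regardless of what the embeddings contain, because the rescaling fixes the sum of squares of every embedding individually.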

Figures (19)

  • Figure 1: Case studies where PAA and DFT work or fail to approximate and reconstruct series from RandWalk and Deep1B datasets. In both cases, DEA works to approximate and reconstruct series. All summarizations use the same memory budget.
  • Figure 2: Replace PAA by DEA for SAX symbolization.
  • Figure 3: Workflow of DEA-based approximate similarity search.
  • Figure 4: The SEAnet architecture and the details of a dilated full-preactivation ResBlock.
  • Figure 5: The SEAtrans encoder architecture.
  • ...and 14 more figures
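
To make the idea in the Figure 2 caption concrete, here is a minimal sketch of the classic PAA-then-SAX pipeline, written so that the summarization step is interchangeable. It is an illustration of the general technique, not the paper's implementation: the function names are hypothetical, the breakpoints are the usual equiprobable cuts of a standard normal distribution, and a DEA (which would come from a trained SEAnet, not shown here) is assumed to be passed to `sax_symbols` in place of the PAA vector.

```python
from statistics import NormalDist

import numpy as np


def paa(series: np.ndarray, segments: int) -> np.ndarray:
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    if series.size % segments != 0:
        raise ValueError("series length must be divisible by the number of segments")
    return series.reshape(segments, -1).mean(axis=1)


def sax_breakpoints(alphabet_size: int) -> np.ndarray:
    """Cut points that split N(0, 1) into `alphabet_size` equiprobable regions."""
    std_normal = NormalDist()
    return np.array(
        [std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    )


def sax_symbols(summary: np.ndarray, alphabet_size: int) -> np.ndarray:
    """Map each value of a real-valued summary to an integer symbol."""
    return np.digitize(summary, sax_breakpoints(alphabet_size))


# Example: a z-normalized ramp of length 16, summarized with 4 PAA segments.
raw = np.arange(16, dtype=float)
series = (raw - raw.mean()) / raw.std()
summary = paa(series, segments=4)
symbols = sax_symbols(summary, alphabet_size=4)
print(symbols.tolist())
```

Only `paa` is specific to PAA. Any fixed-length real-valued summary can be fed to `sax_symbols`, which is the substitution the caption describes.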

Theorems & Definitions (4)

  • Lemma 1
  • proof
  • Lemma 2
  • proof