Model Selection with a Shapelet-based Distance Measure for Multi-source Transfer Learning in Time Series Classification

Jiseok Lee, Brian Kenji Iwana

TL;DR

This work improves transfer learning for time series classification by enabling multi-source pre-training and by introducing a shapelet-based distance for selecting source datasets. It combines multiple sources into a single pre-training set, aligns sequence lengths, and ranks candidate source datasets with a shapelet similarity measure computed via the Matrix Profile. The two proposed schemes, Average Shapelet and Minimum Shapelet, outperform traditional transferability measures and random source selection on the 128 UCR datasets, with Minimum Shapelet achieving the highest gains. The method reduces computation and improves robustness across architectures, making transfer learning for time series more scalable and effective.
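As a rough illustration of merging several sources into one pre-training set, the sketch below resamples every series to a common length and offsets the class labels so that classes from different sources stay distinct. The summary does not say how lengths are aligned or how labels are merged, so linear interpolation, the label offsetting, and the names `resample_to_length` and `combine_sources` are all assumptions made for this example.

```python
import numpy as np


def resample_to_length(series, target_len):
    """Linearly interpolate a 1-D series onto target_len evenly spaced points.

    Assumption: interpolation is only one possible way to align lengths.
    """
    series = np.asarray(series, dtype=float)
    old_grid = np.linspace(0.0, 1.0, num=len(series))
    new_grid = np.linspace(0.0, 1.0, num=target_len)
    return np.interp(new_grid, old_grid, series)


def combine_sources(sources, target_len):
    """Merge several (X, y) datasets into a single pre-training set.

    sources: list of (X, y) pairs, where X is a list of 1-D series and
             y holds one class label per series.
    Returns an array of shape (n_total, target_len) and integer labels.
    """
    X_all, y_all, offset = [], [], 0
    for X, y in sources:
        # Map this source's labels to 0..k-1, then shift them past the
        # labels already used by earlier sources (hypothetical scheme).
        classes, y_local = np.unique(np.asarray(y), return_inverse=True)
        X_all.extend(resample_to_length(x, target_len) for x in X)
        y_all.extend(y_local + offset)
        offset += len(classes)
    return np.stack(X_all), np.asarray(y_all)


if __name__ == "__main__":
    source_a = ([[0, 1, 2, 3], [3, 2, 1, 0]], [0, 1])
    source_b = ([[1, 1, 2, 2, 3, 3]] * 3, [5, 5, 7])
    X, y = combine_sources([source_a, source_b], target_len=8)
    print(X.shape, y)
```

The combined array can then be fed to any pre-training loop; the output layer would need one unit per merged class.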

Abstract

Transfer learning is a common practice that alleviates the need for extensive data to train neural networks. It is performed by pre-training a model using a source dataset and fine-tuning it for a target task. However, not every source dataset is appropriate for each target dataset, especially for time series. In this paper, we propose a novel method of selecting and using multiple datasets for transfer learning for time series classification. Specifically, our method combines multiple datasets as one source dataset for pre-training neural networks. Furthermore, for selecting multiple sources, our method measures the transferability of datasets based on shapelet discovery for effective source selection. While traditional transferability measures require considerable time for pre-training all the possible sources for source selection of each possible architecture, our method can be repeatedly used for every possible architecture with a single simple computation. Using the proposed method, we demonstrate that it is possible to increase the performance of temporal convolutional neural networks (CNN) on time series datasets.
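To make the idea of architecture-independent source selection concrete, here is a minimal sketch that ranks candidate sources by how closely their series match a set of target shapelets. It assumes the shapelets have already been discovered (the paper does this via the Matrix Profile, which is not reproduced here). The z-normalized sliding-window distance, the `"min"` and `"mean"` reductions, and every function name are assumptions for illustration; they are not the paper's exact definitions of Minimum Shapelet and Average Shapelet.

```python
import numpy as np


def znorm(x):
    """Z-normalize a 1-D array; constant arrays are only mean-centered."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else x - x.mean()


def shapelet_to_series_distance(shapelet, series):
    """Smallest z-normalized Euclidean distance between the shapelet and
    any window of the same length in the series.

    The series must be at least as long as the shapelet.
    """
    s = znorm(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series, dtype=float), len(s)
    )
    return min(np.linalg.norm(s - znorm(w)) for w in windows)


def dataset_distance(target_shapelets, source_series, reduce="min"):
    """Distance from a set of target shapelets to one source dataset.

    Each shapelet is scored by its best match over all source series;
    the per-shapelet scores are then reduced by min or mean (assumed
    stand-ins for the two schemes named in the abstract).
    """
    per_shapelet = [
        min(shapelet_to_series_distance(sh, x) for x in source_series)
        for sh in target_shapelets
    ]
    if reduce == "min":
        return float(np.min(per_shapelet))
    return float(np.mean(per_shapelet))


def rank_sources(target_shapelets, sources, reduce="min"):
    """Return source names ordered from closest to farthest."""
    scores = {
        name: dataset_distance(target_shapelets, X, reduce)
        for name, X in sources.items()
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    shapelets = [[0, 1, 0]]
    sources = {"ramp": [[1, 2, 3, 4, 5]], "bump": [[5, 5, 6, 5, 5]]}
    print(rank_sources(shapelets, sources))
```

Because the ranking depends only on the data, it would be computed once and reused for any network architecture, which is the practical advantage the abstract describes over measures that require pre-training on each candidate source.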

Paper Structure (23 sections, 5 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: An illustration of our multi-source transfer learning. Source datasets $\mathcal{S}_i$ are selected using a transferability measure, and the neural network is pre-trained. The trained weights are then fine-tuned using target dataset $\mathcal{T}$.
  • Figure 2: Examples of shapelets from the Arrowhead dataset. The left and right panels each show three time series patterns from the same class.
  • Figure 3: Overview of Average Shapelet and Minimum Shapelet.
  • Figure 4: Sample plots of datasets from the UCR Archive. Three random samples are plotted from two classes. The upper three datasets performed better with our proposed method, and the lower three datasets were adversely affected by it.
  • Figure 5: A Nemenyi post-hoc test diagram. The proposed methods are in green. The numbers indicate the average rank when tested on all 128 datasets.
  • ...and 2 more figures