Deep Reinforcement Learning based Triggering Function for Early Classifiers of Time Series

Aurélien Renault, Alexis Bondu, Antoine Cornuéjols, Vincent Lemaire

TL;DR

This work addresses Early Classification of Time Series (ECTS) by translating the triggering decision problem into a Reinforcement Learning (RL) framework for separable architectures and introducing Alert, a Deep Q-Network based triggering function. By using the same feature sets as hand-crafted rules, the authors enable a fair comparison between man-tailored and RL-based triggering and demonstrate that Alert$^{\star}$, a richer state-space variant, consistently outperforms state-of-the-art approaches across 31 datasets and a range of misclassification–delay costs. The results show that larger, well-chosen state spaces allow RL to learn more effective, non-linear triggering rules, though at the cost of interpretability. The study highlights the potential of RL to discover improved triggering policies for ECTS and suggests further work on explainability and broader state-space design to balance performance with transparency.

Abstract

Early Classification of Time Series (ECTS) has been recognized as an important problem in many areas where decisions have to be taken as soon as possible, before the full data are available, while time pressure increases. Numerous ECTS approaches have been proposed, based on different triggering functions, each taking into account various pieces of information related to the incoming time series and/or the output of a classifier. Although their performances have been empirically compared in the literature, no study has examined the optimality of these triggering functions, which rely on "man-tailored" decision rules. Based on the same information, could there be better triggering functions? This paper presents one way to investigate this question by first showing how to translate ECTS problems into Reinforcement Learning (RL) ones, where the very same information is used in the state space. A thorough comparison of the performance obtained by "handmade" approaches and their "RL-based" counterparts has been carried out. A second question investigated in this paper is whether a different combination of information, defining the state space in RL systems, can achieve even better performance. Experiments show that the system we describe, called Alert, significantly outperforms its state-of-the-art competitors on a large number of datasets.
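To make the ECTS-to-RL translation more concrete, here is a minimal Python sketch of a single episode under a separable architecture. It is an illustration under stated assumptions, not the authors' implementation. The state is built from the maximum class probability and the proportion of the series seen so far, the two inputs mentioned in the Figure 2 caption. The function names, the fixed-threshold policy, the linear delay cost and the cost weighting `alpha * misclassification + (1 - alpha) * delay` are all hypothetical choices made for this example. An RL agent such as a Deep Q-Network would replace `threshold_policy` with a learned policy and could use the negative cost as its reward.

```python
from typing import Callable, Sequence, Tuple

# State = (max class probability, proportion of the series seen so far).
State = Tuple[float, float]

WAIT, TRIGGER = 0, 1


def make_state(proba: Sequence[float], t: int, T: int) -> State:
    """Build the state from the classifier output at time step t (1-indexed)."""
    return (max(proba), t / T)


def threshold_policy(state: State, threshold: float = 0.8) -> int:
    """Hypothetical hand-made rule: trigger once the classifier is confident enough."""
    max_proba, _ = state
    return TRIGGER if max_proba >= threshold else WAIT


def run_episode(
    probas: Sequence[Sequence[float]],
    label: int,
    alpha: float,
    policy: Callable[[State], int],
) -> Tuple[int, float]:
    """Play one series; return (triggering time, cost paid at that time).

    probas[t - 1] is the class-probability vector predicted after seeing
    the first t measurements. The decision is forced at the last step.
    """
    T = len(probas)
    for t, proba in enumerate(probas, start=1):
        state = make_state(proba, t, T)
        action = TRIGGER if t == T else policy(state)
        if action == TRIGGER:
            predicted = max(range(len(proba)), key=lambda k: proba[k])
            misclassification = 0.0 if predicted == label else 1.0
            delay = t / T  # assumed linear delay cost, for simplicity
            cost = alpha * misclassification + (1 - alpha) * delay
            return t, cost
    raise ValueError("probas must contain at least one time step")


# Toy series of length 4 with two classes; the true label is class 0.
toy_probas = [[0.5, 0.5], [0.6, 0.4], [0.9, 0.1], [0.95, 0.05]]
trigger_time, cost = run_episode(toy_probas, label=0, alpha=0.5, policy=threshold_policy)
```

In this toy run the threshold rule waits for two steps and triggers at the third. Swapping in another `policy` callable lets different triggering rules be compared on the same state information.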

Paper Structure

This paper contains 28 sections, 10 equations, 12 figures, 3 tables.

Figures (12)

  • Figure 1: Different architectures for the ECTS problem. The top ones, separable and non-separable, involve a man-tailored decision rule, whereas the bottom one does not rely on one.
  • Figure 2: Heatmap representing decision rule $r$ on the ChilledWaterPredictor dataset, learned by Stopping Rule (\ref{r_sr}) and using RL (\ref{r_dqn_sr}), based on (i) the maximum probability estimated by $h$, on the $y$-axis, and (ii) the proportion of the time series seen, on the $x$-axis (see Section \ref{sec:expe}). Red lines delimit areas where the probability of triggering, estimated by a sigmoid function, is above 0.5.
  • Figure 3: Pairwise comparison of SOTA methods versus their RL counterparts using the same information as input. The blue curve, ranging from 0 to 1, represents the win rate of the man-tailored method over the full benchmark. The orange curve, ranging from -1 to 1, represents the difference in AvgCost between the base method and its RL counterpart, normalized by $\textit{AvgCost}^{\star}$, occurring at the best triggering time. In both cases, points above the horizontal line indicate that the man-tailored method is better than its RL-based counterpart.
  • Figure 4: The ranking plot \ref{bump_sota} shows that, across all $\alpha$, Alert$^{\star}$ dominates all competitors. This result is statistically significant, as shown in plot \ref{cd_sota} for $\alpha = 0.8$.
  • Figure 5: Pareto front displaying, for each $\alpha$, the normalized version of the AvgCost, decomposed into delay cost on the $x$-axis and misclassification cost on the $y$-axis. Best approaches are located in the top left corner. High $\alpha$ values are located on the right, low ones on the left. Due to the exponential shape of the delay cost, the $x$-axis is on a log scale.
  • ...and 7 more figures