SoftED: Metrics for Soft Evaluation of Time Series Event Detection

Rebecca Salles, Janio Lima, Michel Reis, Rafaelli Coutinho, Esther Pacitti, Florent Masseglia, Reza Akbarinia, Chao Chen, Jonathan Garibaldi, Fabio Porto, Eduardo Ogasawara

TL;DR

SoftED introduces a temporal-tolerance, fuzzy-time evaluation framework for time series event detection. It defines an event membership function $\mu_{e_j}(t)$ with a tolerance $k$ to quantify how closely a detection relates to an event, and it assigns detections via a single-entity attribution rule, yielding soft scores $ds(d_i)$ and soft metric counts $TP_s$, $FP_s$, $TN_s$, and $FN_s$. The approach preserves the interpretability of traditional hard metrics while rewarding near-misses and proximal detections, and it is complemented by a competency-question–based evaluation protocol. Quantitative and qualitative analyses show that SoftED incorporates temporal tolerance in over 36% of experiments relative to the usual classification metrics and commonly agrees with domain experts on detector suitability, which helps with method selection and deployment in real-world monitoring scenarios.
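The scoring idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the triangular shape of $\mu_{e_j}(t)$, the way the soft counts are derived from $ds(d_i)$, and the names `membership` and `soft_counts` are all assumptions made for illustration, since this summary does not state the exact formulas.

```python
def membership(t, event_t, k):
    """Assumed triangular membership: 1 at the event, falling to 0 at distance k."""
    return max(0.0, 1.0 - abs(t - event_t) / k)


def soft_counts(detections, events, n, k):
    """Return (ds, tp_s, fp_s, tn_s, fn_s) for a series of n observations.

    detections and events are lists of time indices; k > 0 is the tolerance.
    The count formulas below are illustrative assumptions.
    """
    best = {}  # event index -> (membership, detection index)
    if events:
        for i, d in enumerate(detections):
            # Attribute each detection to the single event it is closest to.
            j = max(range(len(events)), key=lambda e: membership(d, events[e], k))
            mu = membership(d, events[j], k)
            # Each event keeps only its best-matching detection.
            if mu > best.get(j, (0.0, None))[0]:
                best[j] = (mu, i)

    ds = [0.0] * len(detections)  # soft score per detection
    for mu, i in best.values():
        ds[i] = mu

    tp_s = sum(ds)                      # credit earned by detections
    fp_s = sum(1.0 - s for s in ds)     # credit each detection failed to earn
    fn_s = len(events) - tp_s           # event credit left unclaimed
    tn_s = (n - len(events)) - fp_s     # remaining non-event observations
    return ds, tp_s, fp_s, tn_s, fn_s


ds, tp_s, fp_s, tn_s, fn_s = soft_counts(
    detections=[12, 30], events=[10], n=100, k=4
)
print(ds, tp_s, fp_s, tn_s, fn_s)
```

In this example the detection at time 12 lies two steps from the event at time 10, so under the assumed membership it earns partial credit instead of counting as a plain false positive, while the detection at time 30 falls outside the tolerance and earns none. The four soft counts still sum to the series length, which mirrors how hard confusion-matrix counts behave.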

Abstract

Time series event detection methods are evaluated mainly by standard classification metrics that focus solely on detection accuracy. However, inaccuracy in detecting an event can often result from its preceding or delayed effects reflected in neighboring detections. These detections are valuable to trigger necessary actions or help mitigate unwelcome consequences. In this context, current metrics are inadequate for event detection. There is a demand for metrics that incorporate both the concept of time and temporal tolerance for neighboring detections. This paper introduces SoftED metrics, a new set of metrics designed for the soft evaluation of event detection methods. They enable the evaluation of both detection accuracy and the degree to which detections represent events. They improved event detection evaluation by associating events with their representative detections, incorporating temporal tolerance in over 36% of experiments compared to the usual classification metrics. SoftED metrics were validated by domain specialists, who indicated their contribution to detection evaluation and method selection.

Paper Structure (31 sections, 10 equations, 7 figures, 5 tables, 1 algorithm)

Figures (7)

  • Figure 1: Example regarding the problem of evaluating the detection of an event at time $t$. Detector A detects an event at time $t+k_1$, while Detector B detects an event at time $t-k_2$ ($k_2>k_1$).
  • Figure 2: The general idea behind the proposed approach, comparing the standard "hard" evaluation with the "soft" evaluation of event detection.
  • Figure 3: Auxiliary plots for comprehension of SoftED. (a) represents an event membership function $\mu_{e_j}(t)$. (b) represents $\mu_{e_j}(t)$ for detections $d_1$ and $d_2$. (c) depicts the example scenario containing one detection to many events, motivating the first constraint of SoftED. (d) depicts the example scenario containing many detections to a single event, motivating the second constraint of SoftED.
  • Figure 4: Temporal tolerance incorporated by the SoftED F1 metric in the evaluation of event detectors, compared to the hard F1 metric.
  • Figure 5: Changes in the ranking of top evaluated event detectors based on the SoftED F1 metric compared to the hard F1 metric.
  • ...and 2 more figures