Experimental Demonstration of Online Learning-Based Concept Drift Adaptation for Failure Detection in Optical Networks

Yousuf Moiz Ali; Jaroslaw E. Prilepsky; João Pedro; Antonio Napoli; Sasipim Srivallapanondh; Sergei K. Turitsyn; Pedro Freire

TL;DR

The paper addresses concept drift (CD) in ML-based failure detection for optical networks and proposes online learning as a dynamic adaptation mechanism. It implements online updates for Adaptive Random Forest, Logistic Regression, and Naive Bayes models, guided by Page-Hinkley drift detection, to handle hard-failure events in a streaming setting. The approach improves rolling accuracy by up to $70\%$ and reaches an AUC near $0.75$, with latency below $1$ ms, demonstrating practical viability for live networks. The findings indicate that online CD adaptation is model-agnostic and broadly applicable to evolving telemetry in optical networks.
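To make the drift-detection step concrete, the following is a minimal sketch of a Page-Hinkley test in plain Python. It is not the authors' implementation: the class name, the parameter values (`delta`, `threshold`, `min_samples`), and the synthetic stream are assumptions chosen for illustration, and the summary does not state which signal the detector monitors. This variant flags an upward shift in the stream mean; to flag a drop (for example, falling OSNR), one could feed it the negated signal.

```python
class PageHinkley:
    """Minimal Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change (assumed value)
        self.threshold = threshold      # alarm threshold, often written lambda (assumed value)
        self.min_samples = min_samples  # warm-up length before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value and return True if drift is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        drift = (
            self.n >= self.min_samples
            and (self.cum - self.cum_min) > self.threshold
        )
        if drift:
            self.reset()  # start fresh statistics for the new concept
        return drift


# Synthetic stream (illustrative only): the mean jumps from 0 to 1 at index 100.
stream = [0.0] * 100 + [1.0] * 100
detector = PageHinkley()
detections = [i for i, x in enumerate(stream) if detector.update(x)]
print(detections)
```

In a streaming pipeline, an alarm from such a detector could be used to trigger or prioritize a model update; how the paper couples the detector to the three models is not specified in this summary.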

Abstract

We present a novel online learning-based approach for concept drift adaptation in optical network failure detection, achieving up to a 70% improvement in performance over conventional static models while maintaining low latency.

Paper Structure (4 sections, 3 figures, 1 table)

Introduction
Methodology -- Online Training to Alleviate Concept Drift
Results and Discussion
Conclusion

Figures (3)

Figure 1: (a) Experimental testbed setup for failure detection, where the Wavelength Selective Switch is used to introduce attenuation in OA1 to simulate normal and failure conditions. (b) Methodology used to test the static and online models. Both models are first trained in a batch manner on the soft-failure dataset (SFD). On each new sample from the hard-failure dataset (HFD), the static model only predicts, whereas the online model predicts and then updates itself with that sample.
Figure 2: (a) Distribution of the OSNR_SPO2 feature over the combined soft- and hard-failure dataset, showing that, after a period of normal instances, the drop in OSNR is larger for the HFD than for the SFD. (b) Regions of the OSNR_SPO2 feature where drift has occurred; green lines mark drift in the normal class (without failure) and red lines mark drift in the failure class.
Figure 3: Rolling accuracy and AUC on the HFD. The shaded regions mark drift intervals in which the online models maintain or improve performance while the static models degrade. The arrow marks the point after which synthetic failure samples become available: with more failure samples, the online model recovers to 100% accuracy after the earlier dip, whereas the static model degrades further.
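The evaluation procedure described in Figure 1(b) and the rolling accuracy plotted in Figure 3 can be sketched as a predict-then-update loop. The example below is a sketch under stated assumptions, not the paper's code: the one-feature synthetic streams merely stand in for the SFD and HFD, the hand-written `OnlineLogReg` class, the learning rate, the decision boundaries, and the window of 100 samples are all hypothetical, and pretraining is done as a single pass of per-sample updates rather than true batch training.

```python
import copy
import math
import random
from collections import deque


class OnlineLogReg:
    """Logistic regression trained one sample at a time with SGD (illustrative)."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return int(self.predict_proba(x) >= 0.5)

    def learn_one(self, x, y):
        err = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def make_stream(n, boundary, rng):
    """Synthetic stand-in data: label 1 (failure) when the feature is below `boundary`."""
    stream = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        stream.append(([x], int(x < boundary)))
    return stream


def prequential(model, stream, update, window=100):
    """Predict on each sample first, then optionally learn from it."""
    recent = deque(maxlen=window)  # 1 for a correct prediction, 0 otherwise
    curve = []
    for x, y in stream:
        recent.append(int(model.predict(x) == y))
        if update:
            model.learn_one(x, y)
        curve.append(sum(recent) / len(recent))  # rolling accuracy
    return curve


rng = random.Random(0)
base = OnlineLogReg(n_features=1)
for x, y in make_stream(2000, boundary=-0.5, rng=rng):  # pretraining (SFD stand-in)
    base.learn_one(x, y)

# Drifted stream (HFD stand-in): the failure boundary has moved.
drifted = make_stream(2000, boundary=0.5, rng=rng)
static_curve = prequential(copy.deepcopy(base), drifted, update=False)
online_curve = prequential(copy.deepcopy(base), drifted, update=True)
print(f"final rolling accuracy: static={static_curve[-1]:.2f}, online={online_curve[-1]:.2f}")
```

Both curves start from the same pretrained weights, so any gap between them comes only from the per-sample updates, which is the comparison the figure is meant to isolate.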
