LEAD-Drift: Real-time and Explainable Intent Drift Detection by Learning a Data-Driven Risk Score

Md. Kamrul Hossain, Walid Aljoby

TL;DR

LEAD-Drift is a learned, real-time intent drift detector for proactive network assurance in IBN. It raises earlier and less noisy alerts than heuristic baselines, and adds two features for operational intelligence: multi-horizon time-to-failure estimation and SHAP-based per-alert explanations.

Abstract

Intent-Based Networking (IBN) simplifies network management, but its reliability is challenged by "intent drift", where the network's state gradually deviates from its intended goal, often leading to silent failures. Conventional approaches struggle to detect the subtle, early stages of intent drift and raise alarms only when degradation is significant and failure is imminent, which limits their effectiveness for proactive assurance. To address this, we propose LEAD-Drift, a framework that detects intent drift in real time to enable proactive failure prevention. LEAD-Drift's core contribution is reformulating intent failure detection as a supervised learning problem: a lightweight neural network is trained on fixed-horizon labels to predict a future risk score. The model's raw output is then smoothed with an Exponential Moving Average (EMA) and passed through a statistically tuned threshold to generate robust, real-time alerts. Furthermore, we enhance the framework with two key features for operational intelligence: a multi-horizon modeling technique for dynamic time-to-failure estimation, and per-alert explainability using SHAP to identify root-cause KPIs. Our evaluation on a time-series dataset shows that LEAD-Drift provides significantly earlier warnings, improving the average lead time by 7.3 minutes (+17.8%) compared to a distance-based baseline. It also reduces alert noise by 80.2% compared to a weighted-KPI heuristic, with only a minor trade-off in lead time. These results demonstrate that LEAD-Drift is a highly effective, interpretable, and operationally efficient solution for proactive network assurance in IBN.
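
The pipeline described in the abstract (fixed-horizon labels, EMA smoothing, thresholded alerts) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the smoothing factor `alpha`, the threshold value, and the rule of alerting only on upward threshold crossings are all assumptions made for illustration, and the raw risk scores are supplied by hand rather than produced by a trained neural network.

```python
def fixed_horizon_labels(failure_times, n_steps, horizon):
    """Label step t as 1 if a failure occurs within the next `horizon` steps.

    Assumed labeling rule: for a failure at step f, steps f-horizon .. f-1
    are positive; all other steps are 0.
    """
    labels = [0] * n_steps
    for f in failure_times:
        for t in range(max(0, f - horizon), min(f, n_steps)):
            labels[t] = 1
    return labels


def ema(scores, alpha):
    """Exponential moving average; the first value seeds the average."""
    smoothed = []
    prev = None
    for s in scores:
        prev = s if prev is None else alpha * s + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def alert_steps(smoothed, threshold):
    """Return the steps where the smoothed score crosses the threshold upward."""
    steps = []
    above = False
    for t, s in enumerate(smoothed):
        if s >= threshold and not above:
            steps.append(t)
        above = s >= threshold
    return steps


if __name__ == "__main__":
    # Hypothetical raw risk scores: one isolated spike, then a sustained rise.
    raw = [0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.9]
    smoothed = ema(raw, alpha=0.5)
    print(fixed_horizon_labels([5], n_steps=8, horizon=3))
    print(alert_steps(smoothed, threshold=0.6))
```

In this toy run the isolated spike at step 1 is damped below the threshold by the EMA, while the sustained rise triggers a single alert. That is the intuition behind smoothing before thresholding: fewer spurious alerts at the cost of a short delay.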

Paper Structure (25 sections, 5 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Illustration of a multi-KPI intent drift scenario across four failure cases (top to bottom: resource leak, slow degradation, sudden crash, and imbalance cases). The legend indicates net_conn (network health), serv_resp (service responsiveness), and cpu_pct (CPU utilization).
  • Figure 2: Example intent drift detection timeline. The upper panel is a bird’s-eye view of the whole timeline, while the zoomed panel shows detailed behavior around one event.
  • Figure 3: SHAP analysis of a healthy test instance. The serv_resp KPI provides a strong negative contribution, correctly driving the risk score down despite minor positive contributions from other KPIs.
  • Figure 4: Multi-horizon risk scores for a drift event. The models for H=120, H=60, and H=30 minutes sequentially cross their individually tuned thresholds (dotted lines) as the failure time (red line) approaches.
  • Figure 5: Performance of LEAD-Drift vs. baselines across varying dataset sizes, showing stable lead time and false positive rates.
  • ...and 1 more figures