Anomaly Detection Within Mission-Critical Call Processing
Sean Doris, Iosif Salem, Stefan Schmid
TL;DR
This work addresses anomaly detection for mission-critical call processing in telecom. Anomalies are labeled from the client-side round-trip time (RTT), and ML models are trained on server-side KPIs measured under SIPp-generated traffic. Seven models are evaluated; Random Forest generalizes best to unseen stressors, with an F1-score of at least 0.980 on every untrained scenario. The anomaly threshold is set at the third standard deviation of 20 hours of baseline RTT measurements, which gives consistent labels, and at detection time the models need only server-side KPIs rather than client-side instrumentation. The results demonstrate the viability of server-side KPI-driven anomaly detection for maintaining high availability in virtualized telecom environments and point to future work on multi-class anomaly classification and KPI selection.
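The RTT-based labeling summarized above can be illustrated with a minimal sketch. It assumes the threshold is the baseline mean plus three standard deviations, which is one common reading of a third-standard-deviation criterion and may differ from the exact rule used in this work. The function names and the RTT values are hypothetical and are not data from the experiments.

```python
from statistics import mean, pstdev


def rtt_threshold(baseline_rtts, k=3.0):
    """Return mean + k * standard deviation of the baseline RTT samples.

    Assumption: the threshold is placed k standard deviations above the
    baseline mean (k = 3 here); the paper's exact rule may differ.
    """
    return mean(baseline_rtts) + k * pstdev(baseline_rtts)


def label_samples(rtts, threshold):
    """Label each RTT sample: 1 = anomalous (above threshold), 0 = normal."""
    return [1 if rtt > threshold else 0 for rtt in rtts]


# Hypothetical baseline RTTs in milliseconds (illustrative values only).
baseline_ms = [20.0, 21.0, 19.0, 20.0, 22.0, 18.0, 20.0, 20.0]
threshold = rtt_threshold(baseline_ms)

# Hypothetical RTTs observed while a stressor is applied.
labels = label_samples([20.5, 35.0, 19.0], threshold)
```

In this sketch the labels would then be attached to the server-side KPI rows recorded over the same time window, so that a model never sees the RTT itself as an input.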
Abstract
With telecommunication networks growing larger and more complex, there is a need for improved monitoring and reliability. The requirements are stricter still for mission-critical systems, which must operate stably to meet precise design and client requirements while maintaining high availability. This paper proposes a novel methodology for developing a machine learning model that helps maintain availability, through anomaly detection, for client-server communications in mission-critical systems. To that end, we validate our methodology for training models on data labeled according to client performance. The methodology evaluates machine learning for anomaly detection on a single virtualized server loaded with simulated network traffic (generated with SIPp) that includes media calls. The collected data are labeled according to the round-trip time experienced on the client side, to determine whether the trained models can detect anomalous client-side performance using only key performance indicators available on the server. We compared seven machine learning models on test scenarios whose stressors were either included in training (trained) or not (untrained). Five models achieved an F1-score above 0.99 on the trained test scenarios. Random Forest was the only model to attain an F1-score above 0.9 on all untrained test scenarios, with a lowest score of 0.980. The results suggest that models trained on server-side data can accurately detect degraded client-side performance.
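For reference, the F1-score used in the comparison above is the harmonic mean of precision and recall on the anomalous class. The sketch below computes it in plain Python for a binary labeling; the label vectors are made up for illustration and are not results from this work.

```python
def f1_score(y_true, y_pred):
    """Binary F1-score with 1 = anomalous as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground-truth labels (from client RTT) and model predictions
# (from server-side KPIs); illustrative values only.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```

Because F1 ignores true negatives, it stays informative when anomalous samples are much rarer than normal ones, which is typical for availability monitoring.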
