FALCON: A Framework for Fault Prediction in Open RAN Using Multi-Level Telemetry

Yaswanth Kumar LS, Somya Jain, Bheemarjuna Reddy Tamma, Koteswararao Kondepu

TL;DR

This work addresses fault management in Open RAN by proposing FALCON, a framework that fuses telemetry from the infrastructure, platform, and RAN layers. It extracts a compact representation via PCA, forecasts KPI trajectories with an LSTM, and classifies imminent faults with a Random Forest model, enabling proactive mitigation. In a virtualized O-RAN testbed with fault-injection experiments, FALCON achieves an average accuracy of $98.73\%$ and an F1-score of $98.71\%$ across CPU stress, memory stress, and packet-loss faults. The results support combining multi-level telemetry with sequential forecasting and ensemble classification for proactive fault management, and the approach gives operators timely, actionable insights across the full stack.
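The three stages named above (PCA compression, forecasting, Random Forest classification) can be sketched on synthetic data as follows. This is a minimal illustration, not the authors' implementation: the telemetry matrix, the number of metrics and components, and the window length are all assumptions, and a windowed-mean forecaster is swapped in for the LSTM to keep the example free of a deep-learning dependency.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical sizes; the summary does not list the real metric set.
n_steps, n_metrics, n_components, window = 600, 12, 4, 5

# Synthetic stand-in for fused telemetry. Labels are illustrative:
# 0 = CPU stress, 1 = memory stress, 2 = packet loss.
labels = np.repeat([0, 1, 2], n_steps // 3)
telemetry = rng.normal(size=(n_steps, n_metrics))
telemetry[:, :3] += 3.0 * labels[:, None]  # each fault class shifts a few metrics

# Stage 1: compress the fused metrics into a compact representation.
compact = PCA(n_components=n_components, random_state=0).fit_transform(telemetry)


def forecast_next(series, window):
    """Placeholder forecaster (NOT an LSTM): predict step t as the mean
    of the previous `window` steps."""
    return np.stack(
        [series[t - window:t].mean(axis=0) for t in range(window, len(series))]
    )


# Stage 2: forecast the compact telemetry one step ahead.
predicted = forecast_next(compact, window)
targets = labels[window:]  # row i of `predicted` corresponds to step window + i

# Stage 3: classify the fault type from the *predicted* telemetry.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(predicted[::2], targets[::2])                 # even rows for training
accuracy = clf.score(predicted[1::2], targets[1::2])  # odd rows held out
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The interleaved train/test split is only a convenience for the sketch; adjacent time steps are correlated, so a real evaluation would need a split that respects time order.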

Abstract

O-RAN gives mobile operators deployment flexibility and intelligent RAN control through a disaggregated, modular architecture built on open interfaces. However, this disaggregation complicates system integration and network management, because components are often sourced from different vendors. In addition, operators that rely on open-source, virtualized components deployed on commodity hardware need additional resilience mechanisms, since O-RAN deployments can fail at the infrastructure, platform, and RAN levels. To address these challenges, this paper proposes FALCON, a fault prediction framework for O-RAN that uses infrastructure-, platform-, and RAN-level telemetry to predict faults in virtualized O-RAN deployments. By aggregating and analyzing metrics from components at different levels with AI/ML models, FALCON enables proactive fault management and gives operators actionable insights for taking timely preventive measures. With a Random Forest classifier, FALCON outperforms two other classifiers on the predicted telemetry, achieving an average accuracy and F1-score of more than 98%.

Paper Structure

This paper contains 10 sections, 5 figures, 5 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Solution Architecture of FALCON Framework.
  • Figure 2: Detailed ML pipeline used in FALCON.
  • Figure 3: Experimental Testbed.
  • Figure 4: Traffic distribution generated by ping messages.
  • Figure 5: Average Performance of Different Classifiers in FALCON Framework Across All Folds.
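
As a rough illustration of how fold-averaged accuracy and F1-score (the quantities in the Figure 5 caption) can be computed, the sketch below derives both from per-fold confusion matrices. The confusion-matrix counts are hypothetical and are not results from the paper, and macro-averaging of F1 over classes is an assumption, since this summary does not state which averaging the authors used.

```python
def accuracy_and_macro_f1(cm):
    """Return (accuracy, macro F1) for a square confusion matrix `cm`,
    where cm[i][j] counts samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return correct / total, sum(f1_scores) / n


def average_over_folds(fold_cms):
    """Average accuracy and macro F1 across cross-validation folds."""
    per_fold = [accuracy_and_macro_f1(cm) for cm in fold_cms]
    mean_acc = sum(acc for acc, _ in per_fold) / len(per_fold)
    mean_f1 = sum(f1 for _, f1 in per_fold) / len(per_fold)
    return mean_acc, mean_f1


# Hypothetical counts for two folds and three fault classes
# (CPU stress, memory stress, packet loss); illustrative only.
folds = [
    [[50, 0, 0], [1, 48, 1], [0, 2, 48]],
    [[49, 1, 0], [0, 50, 0], [1, 0, 49]],
]
mean_acc, mean_f1 = average_over_folds(folds)
print(f"mean accuracy = {mean_acc:.4f}, mean macro F1 = {mean_f1:.4f}")
```

Averaging per-fold scores, rather than pooling all predictions into one matrix, weights each fold equally regardless of its size.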