An Explainable Failure Prediction Framework for Neural Networks in Radio Access Networks

Khaleda Papry, Francesco Spinnato, Marco Fiore, Mirco Nanni, Israat Haque

TL;DR

This work introduces Prometheus, an explainable framework for radio link failure (RLF) prediction in 5G RANs that combines a model-agnostic SHAP-based explainer with an explainability-guided simplification loop. By applying local and global aggregation of SHAP contributions, Prometheus identifies the most informative features and prunes both the input features and the network architecture, producing lightweight predictors. On real rural and urban Turkcell datasets, the framework yields substantial parameter reductions (e.g., from $13.4\mathrm{K}$ to $5.8\mathrm{K}$ for GenTrap-derived models, and from $154.7\mathrm{K}$ to $11\mathrm{K}$ for LSTM+) while achieving equal or improved F1-scores, with weather context contributing minimally to predictions. The results show that high predictive performance can be achieved with a reduced feature set focused on radio link (RL) KPIs, improving interpretability, scalability, and deployability for 5G RAN operations. Future work will explore longer time series, counterfactual explanations, and broader applicability across network management tasks.
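
As a quick check on the size of these reductions, the relative savings can be computed from the parameter counts quoted above. The percentages below are derived here rather than reported figures, and they are approximate because the counts themselves are rounded:

$$
\frac{13.4\mathrm{K} - 5.8\mathrm{K}}{13.4\mathrm{K}} \approx 56.7\%, \qquad \frac{154.7\mathrm{K} - 11\mathrm{K}}{154.7\mathrm{K}} \approx 92.9\%.
$$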

Abstract

As 5G networks continue to evolve to deliver high-speed, low-latency, and reliable communications, ensuring uninterrupted service has become increasingly critical. While millimeter-wave (mmWave) frequencies enable gigabit data rates, they are highly susceptible to environmental factors, often leading to radio link failures (RLF). Predictive models leveraging radio and weather data have been proposed to address this issue; however, many operate as black boxes, offering limited transparency for operational deployment. This work bridges that gap by introducing a framework that combines explainability-based feature pruning with model refinement. Our framework can be integrated into state-of-the-art predictors such as GNN-Transformer and LSTM-based architectures for RLF prediction, enabling the development of accurate and explainability-guided models in 5G networks. It provides insights into the contribution of input features and the decision-making logic of neural networks, leading to lighter and more scalable models. When applied to RLF prediction, our framework reveals that weather data contributes minimally to the forecast in extensive real-world datasets, which informs the design of a leaner model with 50% fewer parameters and improved F1 scores with respect to the state-of-the-art solution. Ultimately, this work empowers network providers to evaluate and refine their neural-network-based prediction models for better interpretability, scalability, and performance.
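
The explainability-based feature pruning described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes per-sample, per-timestep feature contributions (such as SHAP values) are already available as an array, it assumes mean absolute value as the aggregation rule at both the local and global stage, and it assumes a cumulative-importance threshold as the pruning rule. The feature names, the `keep_fraction` parameter, and the synthetic contribution values are all hypothetical.

```python
import numpy as np

# Hypothetical feature names, for illustration only.
FEATURES = ["rsl", "rx_power", "temperature", "humidity"]


def local_aggregation(contribs: np.ndarray) -> np.ndarray:
    """Collapse the time axis: (samples, timesteps, features) -> (samples, features).

    Assumption: mean absolute contribution over time steps.
    """
    return np.abs(contribs).mean(axis=1)


def global_aggregation(local_scores: np.ndarray) -> np.ndarray:
    """Collapse the sample axis: (samples, features) -> (features,)."""
    return local_scores.mean(axis=0)


def select_features(global_scores, names, keep_fraction=0.9):
    """Keep the top-ranked features until they cover keep_fraction of total importance."""
    order = np.argsort(global_scores)[::-1]  # most important first
    share = np.cumsum(global_scores[order]) / global_scores.sum()
    n_keep = min(len(names), int(np.searchsorted(share, keep_fraction)) + 1)
    return [names[i] for i in order[:n_keep]]


# Synthetic stand-in for explainer output: 6 samples, 3 time steps, 4 features.
base = np.array([0.5, -0.4, 0.01, -0.02])
contribs = np.tile(base, (6, 3, 1))
contribs[::2] *= -1.0  # flip signs on alternate samples; absolute values prevent cancellation

global_scores = global_aggregation(local_aggregation(contribs))
kept = select_features(global_scores, FEATURES)
print(kept)
```

Taking absolute values before averaging is a deliberate choice in this sketch: contributions with opposite signs across samples would otherwise cancel and make an influential feature look unimportant. The retained feature list would then define the input of a smaller model that is retrained and compared against the original.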

Paper Structure (15 sections, 3 equations, 11 figures, 5 tables)

Figures (11)

  • Figure 1: Prometheus architecture with three components: a black-box DNN, an explainer, and an explainability-guided simplified DNN. The explainer includes four modules: model-agnostic explainer, Local Aggregation (LA), Global Aggregation (GA), and feature pruning. The output is a simplified DNN with reduced components.
  • Figure 2: Transformer-based architectures before and after Prometheus simplification.
  • Figure 3: LSTM-based architectures before and after simplification with Prometheus.
  • Figure 4: Average SHAP values across all failures for rural deployment.
  • Figure 5: Average SHAP values across all failures for urban deployments.
  • ...and 6 more figures

Theorems & Definitions (3)

  • Definition 2.1: Time Series Data
  • Definition 2.2: RLF Prediction
  • Definition 2.3: Saliency Map