An Explainable Failure Prediction Framework for Neural Networks in Radio Access Networks

Khaleda Papry; Francesco Spinnato; Marco Fiore; Mirco Nanni; Israat Haque

TL;DR

This work introduces Prometheus, an explainable framework for radio link failure (RLF) prediction in 5G radio access networks (RANs) that combines a model-agnostic SHAP-based explainer with an explainability-guided simplification loop. By aggregating SHAP contributions locally and globally, Prometheus identifies the most informative features and prunes both the input features and the network architecture, producing lightweight predictors. On real rural and urban Turkcell datasets, the framework substantially reduces parameter counts (e.g., from $13.4\mathrm{K}$ to $5.8\mathrm{K}$ for GenTrap-derived models, and from $154.7\mathrm{K}$ to $11\mathrm{K}$ for LSTM+) while achieving equal or improved F1-scores, with weather context contributing minimally to predictions. The results show that high predictive performance can be achieved with a reduced feature set focused on radio link (RL) KPIs, improving interpretability, scalability, and deployability for 5G RAN operations. Future work will explore longer time series, counterfactual explanations, and broader applicability across network management tasks.
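The aggregate-then-prune idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: this summary does not give the exact definitions of Local Aggregation (LA) and Global Aggregation (GA), so the sketch assumes LA is the mean absolute attribution over time steps and GA is the mean of those scores over samples. The feature names, the synthetic attribution values, and the `keep_fraction` threshold are all hypothetical.

```python
import numpy as np


def local_aggregation(attributions: np.ndarray) -> np.ndarray:
    """Assumed LA: mean absolute attribution over time steps.

    attributions has shape (n_samples, n_timesteps, n_features);
    the result has shape (n_samples, n_features).
    """
    return np.abs(attributions).mean(axis=1)


def global_aggregation(local_scores: np.ndarray) -> np.ndarray:
    """Assumed GA: average the per-sample scores into one score per feature."""
    return local_scores.mean(axis=0)


def select_features(global_scores, names, keep_fraction=0.9):
    """Keep the top-ranked features whose cumulative share of the total
    importance first reaches keep_fraction (a hypothetical pruning rule)."""
    order = np.argsort(global_scores)[::-1]            # most important first
    share = global_scores[order] / global_scores.sum()
    cumulative = np.cumsum(share)
    n_keep = min(int(np.searchsorted(cumulative, keep_fraction)) + 1, len(names))
    return [names[i] for i in order[:n_keep]]


# Synthetic stand-in for SHAP values: two strong radio features and two
# weak weather features (names and scales are illustrative only).
feature_names = ["rsl", "tx_power", "temperature", "humidity"]
rng = np.random.default_rng(0)
attributions = rng.normal(size=(200, 10, 4)) * np.array([1.0, 0.5, 0.01, 0.01])

global_scores = global_aggregation(local_aggregation(attributions))
kept = select_features(global_scores, feature_names, keep_fraction=0.9)
print(kept)
```

In this toy setup the two weather-like columns carry almost none of the total attribution mass, so they fall below the cumulative threshold and are pruned. A real pipeline would replace the synthetic array with attributions produced by an explainer on the trained predictor.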

Abstract

As 5G networks continue to evolve to deliver high-speed, low-latency, and reliable communications, ensuring uninterrupted service has become increasingly critical. While millimeter wave (mmWave) frequencies enable gigabit data rates, they are highly susceptible to environmental factors, often leading to radio link failures (RLF). Predictive models leveraging radio and weather data have been proposed to address this issue; however, many operate as black boxes, offering limited transparency for operational deployment. This work bridges that gap by introducing a framework that combines explainability-based feature pruning with model refinement. Our framework can be integrated into state-of-the-art predictors, such as GNN-Transformer and LSTM-based architectures for RLF prediction, enabling the development of accurate and explainability-guided models in 5G networks. It provides insights into the contribution of input features and the decision-making logic of neural networks, leading to lighter and more scalable models. When applied to RLF prediction, our framework reveals that weather data contributes minimally to the forecast in extensive real-world datasets, which informs the design of a leaner model with 50% fewer parameters and improved F1 scores compared with the state-of-the-art solution. Ultimately, this work empowers network providers to evaluate and refine their neural-network-based prediction models for better interpretability, scalability, and performance.

Paper Structure (15 sections, 3 equations, 11 figures, 5 tables)

Introduction
Background
Related Work
Workflow and Methodology
Model-agnostic Explainer Module
Local Aggregation (LA) and Global Aggregation (GA)
Feature Pruning
Informed simplification of DNNs
Transformer-based Model
LSTM-based Model
Results and Discussion
Does RLF prediction require categorical features and weather context? (RQ1)
Are lightweight models comparable to the black-box models? (RQ2)
Which features contribute to the RLF prediction, and why? (RQ3)
Conclusion

Figures (11)

Figure 1: Prometheus architecture with three components: a black-box DNN, an explainer, and an explainability-guided simplified DNN. The explainer includes four modules: model-agnostic explainer, Local Aggregation (LA), Global Aggregation (GA), and feature pruning. The output is a simplified DNN with reduced components.
Figure 2: Transformer-based architectures before and after Prometheus simplification.
Figure 3: LSTM-based architectures before and after simplification with Prometheus.
Figure 4: Average SHAP values across all failures for rural deployment.
Figure 5: Average SHAP values across all failures for urban deployments.
...and 6 more figures

Theorems & Definitions (3)

Definition 2.1: Time Series Data
Definition 2.2: RLF Prediction
Definition 2.3: Saliency Map
