Rivaling Transformers: Multi-Scale Structured State-Space Mixtures for Agentic 6G O-RAN
Farhad Rezazadeh, Hatim Chergui, Merouane Debbah, Houbing Song, Dusit Niyato, Lingjia Liu
TL;DR
This work tackles the challenge of delivering prediction services for proactive, agentic control in 6G O-RAN under near-real-time latency constraints. It introduces MS$^{3}$M, a strictly causal forecaster that mixes HiPPO-LegS state-space kernels across multiple time scales via depthwise convolution, SE gating, and a compact GLU mixer to forecast next-step KPI values with high efficiency. The model delivers Transformer-competitive accuracy (RMSE ≈ 0.292 dB, MAE ≈ 0.170 dB, $R^2 \approx 0.993$) while achieving substantially lower latency (≈0.057 s per inference) and a smaller footprint (≈0.70M parameters) on an O-RAN KPI dataset. The authors provide leakage-safe training and evaluation, a comprehensive complexity analysis, and an open-source implementation to facilitate deployment in Near-RT RIC xApps for anticipatory network control, highlighting MS$^{3}$M's favorable accuracy–efficiency trade-off for real-time, edge-enabled KPI forecasting.
Abstract
In sixth-generation (6G) Open Radio Access Networks (O-RAN), proactive control is preferable to reactive control. A key open challenge is delivering control-grade predictions within Near-Real-Time (Near-RT) latency and computational constraints under multi-timescale dynamics. We therefore cast RAN Intelligent Controller (RIC) analytics as an agentic perceive-predict xApp that turns noisy, multivariate RAN telemetry into short-horizon per-User Equipment (UE) key performance indicator (KPI) forecasts to drive anticipatory control. Transformers are powerful for sequence learning and time-series forecasting, but they are memory-intensive, which limits their use in the Near-RT RIC. We therefore need models that maintain accuracy while reducing latency and data movement. To this end, we propose a lightweight Multi-Scale Structured State-Space Mixtures (MS3M) forecaster that mixes HiPPO-LegS kernels to capture multi-timescale radio dynamics. We develop stable discrete state-space models (SSMs) via bilinear (Tustin) discretization and apply their causal impulse responses as per-feature depthwise convolutions. Squeeze-and-Excitation gating dynamically reweights KPI channels as conditions change, and a compact gated channel-mixing layer models cross-feature nonlinearities without Transformer-level cost. The model is KPI-agnostic, with Reference Signal Received Power (RSRP) serving as a canonical use case, and is trained on sliding windows to predict the immediate next step. Empirical evaluations are conducted on our bespoke O-RAN testbed KPI time-series dataset (59,441 windows across 13 KPIs). Crucially for O-RAN constraints, MS3M achieves a 0.057 s per-inference latency with 0.70M parameters, yielding 3-10x lower latency than the Transformer baselines evaluated on the same hardware, while maintaining competitive accuracy.
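
To make the kernel construction in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds the standard HiPPO-LegS matrices, discretizes them with the bilinear (Tustin) rule at several step sizes, and applies the resulting causal impulse responses as per-feature depthwise convolutions. The state size `N`, kernel length `L`, step sizes `dts`, the random readout `C`, and all function names are illustrative assumptions; only the 13-KPI feature count comes from the text.

```python
import numpy as np

def hippo_legs(n):
    """Standard HiPPO-LegS pair: A is lower triangular, B has entries sqrt(2k+1)."""
    q = np.sqrt(2 * np.arange(n) + 1.0)
    # A[i, k] = -sqrt(2i+1) * sqrt(2k+1) for i > k, and -(i+1) on the diagonal.
    A = -np.tril(np.outer(q, q), k=-1) - np.diag(np.arange(n) + 1.0)
    return A, q.copy()

def bilinear_discretize(A, B, dt):
    """Tustin rule: Ab = (I - dt/2 A)^-1 (I + dt/2 A), Bb = (I - dt/2 A)^-1 dt B."""
    I = np.eye(A.shape[0])
    left = I - 0.5 * dt * A
    Ab = np.linalg.solve(left, I + 0.5 * dt * A)
    Bb = np.linalg.solve(left, dt * B)
    return Ab, Bb

def ssm_kernel(Ab, Bb, C, length):
    """Causal impulse response k[l] = C @ Ab^l @ Bb for l = 0..length-1."""
    k = np.empty(length)
    x = Bb.copy()
    for l in range(length):
        k[l] = C @ x
        x = Ab @ x
    return k

def causal_depthwise_conv(u, kernels):
    """u: (T, F), kernels: (F, L). y[t, f] = sum_l kernels[f, l] * u[t - l, f]."""
    T, F = u.shape
    # Truncating the full convolution to the first T samples keeps it causal.
    return np.stack(
        [np.convolve(u[:, f], kernels[f])[:T] for f in range(F)], axis=1
    )

rng = np.random.default_rng(0)
N, L, T, F = 8, 32, 64, 13           # N, L, T are assumed sizes; F = 13 KPIs
A, B = hippo_legs(N)
C = rng.standard_normal(N) / N       # hypothetical readout (learned in practice)
dts = (0.02, 0.1, 0.5)               # assumed step sizes, one per time scale

# One kernel per time scale, shared across features here for simplicity.
kernels = np.stack(
    [ssm_kernel(*bilinear_discretize(A, B, dt), C, L) for dt in dts]
)                                     # shape (S, L)

u = rng.standard_normal((T, F))       # stand-in for a window of KPI telemetry
y = np.stack(
    [causal_depthwise_conv(u, np.tile(k, (F, 1))) for k in kernels], axis=-1
)                                     # shape (T, F, S): one response per scale
```

Because the HiPPO-LegS eigenvalues are strictly negative, the bilinear map places every discrete eigenvalue inside the unit circle for any positive step size, which is what keeps the kernels stable. How the per-scale responses in `y` are then combined (for example by the gating and channel-mixing layers) is left out of this sketch.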
