Rethinking Telemetry Design for Fine-Grained Anomaly Detection in 5G User Planes
Niloy Saha, Noura Limam, Yang Xiao, Raouf Boutaba
TL;DR
This paper addresses the visibility gap in User Plane Function (UPF) telemetry between coarse counters and expensive per-packet postcards. The authors present Kestrel, which extends Count-Min Sketch with histogram-augmented buckets and per-queue partitioning to capture latency tails and inter-arrival distributions without maintaining per-flow state. They derive formal detectability guarantees that account for sketch collisions and drift, and provide practical sizing rules (e.g., $w=512$, $d=3$) and binning strategies. Evaluations on a 5G UPF testbed show that Kestrel delivers high detection accuracy with sub-second responsiveness and a bounded export cost, achieving roughly 10x lower export bandwidth and about 10% better detection accuracy than selective postcard schemes.
Abstract
Detecting QoS anomalies in 5G user planes requires fine-grained per-flow visibility, but existing telemetry approaches face a fundamental trade-off. Coarse per-class counters are lightweight but mask transient and per-flow anomalies, while per-packet telemetry postcards provide full visibility at prohibitive cost that grows linearly with line rate. Selective postcard schemes reduce overhead but miss anomalies that fall below configured thresholds or occur during brief intervals. We present Kestrel, a sketch-based telemetry system for 5G user planes that provides fine-grained visibility into key metric distributions such as latency tails and inter-arrival times at a fraction of the cost of per-packet postcards. Kestrel extends Count-Min Sketch with histogram-augmented buckets and per-queue partitioning, which compress per-packet measurements into compact summaries while preserving anomaly-relevant signals. We develop formal detectability guarantees that account for sketch collisions, yielding principled sizing rules and binning strategies that maximize anomaly separability. Our evaluations on a 5G testbed with Intel Tofino switches show that Kestrel achieves 10% better detection accuracy than existing selective postcard schemes while reducing export bandwidth by 10x.
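To make the idea of histogram-augmented buckets concrete, the following is a minimal Python sketch of a Count-Min Sketch whose cells hold small histograms instead of single counters. It is an illustration under stated assumptions, not Kestrel's data-plane implementation: the class name `HistogramSketch`, the bin edges, the BLAKE2-based row hashing, and the per-bin minimum query rule are all hypothetical choices made here. Only the sizing $w=512$, $d=3$ comes from the text above.

```python
import bisect
import hashlib


class HistogramSketch:
    """Count-Min Sketch whose cells are small histograms (illustrative only)."""

    def __init__(self, width=512, depth=3, bin_edges=(1.0, 2.0, 5.0, 10.0, 50.0)):
        # width/depth follow the example sizing in the text; bin_edges are
        # assumed latency boundaries (e.g. milliseconds), not taken from the paper.
        self.width = width
        self.depth = depth
        self.bin_edges = list(bin_edges)
        self.n_bins = len(self.bin_edges) + 1  # last bin is the overflow bin
        self.rows = [
            [[0] * self.n_bins for _ in range(width)] for _ in range(depth)
        ]

    def _index(self, row, key):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key, value):
        # Map the measurement to a bin, then increment that bin in one cell per row.
        b = bisect.bisect_right(self.bin_edges, value)
        for r in range(self.depth):
            self.rows[r][self._index(r, key)][b] += 1

    def query(self, key):
        # Per-bin minimum across rows: collisions can only inflate a count,
        # so the minimum is an upper bound on the key's true per-bin count.
        cells = [self.rows[r][self._index(r, key)] for r in range(self.depth)]
        return [min(cell[b] for cell in cells) for b in range(self.n_bins)]


# Per-queue partitioning, sketched as one independent sketch per queue id
# (the queue ids here are hypothetical).
sketches = {queue_id: HistogramSketch() for queue_id in range(4)}
sketches[0].update("flow-a", 0.5)   # falls in the first bin
sketches[0].update("flow-a", 20.0)  # falls in the (10, 50] tail bin
tail_estimate = sketches[0].query("flow-a")
```

In this sketch, memory is fixed at `depth * width * n_bins` counters per queue regardless of the number of flows, which is the sense in which no per-flow state is kept. A detector could compare the tail bins of `query()` across export intervals; how Kestrel actually exports and scores these summaries is not specified in the abstract.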
