Packet-Level Traffic Modeling with Heavy-Tailed Payload and Inter-Arrival Distributions for Digital Twins

Enes Koktas, Peter Rost

TL;DR

The study addresses packet-level traffic generation for digital twins with a compact hybrid generator: a small hidden Markov model (HMM) captures state dynamics, and a conditional mixture density network (MDN) with diagonal Student-t mixtures models the joint distribution of payload length and inter-arrival time (IAT) within each state. A tail-aware idle state handles long idle gaps, and the MDN is trained with flow-length conditioning, yielding realistic distributions and temporal structure with a footprint of around 0.16 MB. Across four traces, the method closely matches marginal distributions, autocorrelation, and flow-descriptor diversity, outperforming several neural and HMM baselines in most cases. The approach enables accurate, low-overhead traffic generation suited to edge deployment and rapid recalibration in dynamic radio access network (RAN) digital twins.

Abstract

Digital twins of radio access networks require packet-level traffic generators that reproduce the size and timing of packets while remaining compact and easy to recalibrate as traffic changes. We address this need with a hybrid generator that combines a small hidden Markov model, which captures buffering, streaming, and idle states, with a mixture density network that models the joint distribution of payload length and inter-arrival time (IAT) in each state using Student-t mixtures. The state space and emission family are designed to handle heavy-tailed IAT by anchoring an explicit idle state in the tail and allowing each component to adapt its tail thickness. We evaluate the model on public traces of web, smart home, and encrypted media traffic and compare it with recent neural-network and transformer-based generators as well as hidden Markov baselines. Across datasets and metrics, including average per-flow cumulative distribution functions, autocorrelation-based measures of temporal structure, and Wasserstein distances between flow descriptors, the proposed generator matches the real traffic most closely in the majority of cases while using orders of magnitude fewer parameters. The full model occupies around 0.2 MB in our experiments, which makes it suitable for deployment inside digital twins where memory footprint and low-overhead adaptation are critical.
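The generation side of such a hybrid model can be illustrated with a minimal sketch. It is not the authors' implementation: the transition matrix, mixture weights, locations, scales, and degrees of freedom below are made-up values, the fixed `EMISSIONS` table stands in for what a trained mixture density network would output, and modeling log-payload and log-IAT with independent per-dimension Student-t draws is an assumption about what "diagonal" means here. The names `sample_flow`, `component`, and `EMISSIONS` are hypothetical.

```python
import numpy as np

STATES = ["buffering", "streaming", "idle"]

# Hypothetical HMM parameters (not taken from the paper); each row sums to 1.
START = np.array([0.6, 0.3, 0.1])
TRANS = np.array([
    [0.80, 0.15, 0.05],  # from buffering
    [0.10, 0.85, 0.05],  # from streaming
    [0.30, 0.30, 0.40],  # from idle
])


def component(loc, scale, df):
    """One diagonal Student-t component over (log payload bytes, log IAT seconds)."""
    return {"loc": np.array(loc), "scale": np.array(scale), "df": np.array(df)}


# Assumed per-state mixtures. A trained MDN would produce these parameters;
# here they are fixed, illustrative constants. The idle state uses a low
# degrees-of-freedom value on the IAT dimension to give it a heavy tail.
EMISSIONS = [
    {"weights": np.array([0.7, 0.3]),
     "components": [component([7.2, -7.0], [0.1, 0.6], [8.0, 4.0]),
                    component([6.0, -6.0], [0.5, 0.8], [8.0, 4.0])]},
    {"weights": np.array([1.0]),
     "components": [component([6.8, -4.0], [0.3, 0.5], [8.0, 5.0])]},
    {"weights": np.array([1.0]),
     "components": [component([4.5, 1.0], [0.4, 1.0], [8.0, 2.5])]},
]


def sample_flow(n_packets, rng):
    """Sample hidden states, payload lengths (bytes), and IATs (seconds)."""
    states = np.empty(n_packets, dtype=int)
    payload = np.empty(n_packets, dtype=int)
    iat = np.empty(n_packets)
    state = int(rng.choice(len(STATES), p=START))
    for i in range(n_packets):
        emission = EMISSIONS[state]
        # Pick a mixture component, then draw one Student-t value per dimension.
        k = int(rng.choice(len(emission["weights"]), p=emission["weights"]))
        comp = emission["components"][k]
        z = comp["loc"] + comp["scale"] * rng.standard_t(comp["df"])
        states[i] = state
        # Clip in log space (assumed bounds) so exp() stays in a sane range.
        payload[i] = int(round(np.exp(np.clip(z[0], 0.0, np.log(1500.0)))))
        iat[i] = np.exp(np.clip(z[1], -14.0, 7.0))
        # Move the Markov chain forward for the next packet.
        state = int(rng.choice(len(STATES), p=TRANS[state]))
    return states, payload, iat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    states, payload, iat = sample_flow(10, rng)
    for s, p, t in zip(states, payload, iat):
        print(f"{STATES[s]:<10} payload={p:5d} B  iat={t:.6f} s")
```

Keeping the state chain separate from the emission parameters is what makes this kind of model small: only the transition matrix and a handful of mixture parameters per state need to be stored or refitted.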

Paper Structure

This paper contains 11 sections, 34 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Training and generation pipeline of the proposed traffic generator.
  • Figure 2: Average per-flow CDF comparison of payload length and IAT for HTTP traffic.
  • Figure 3: Average per-flow CDF comparison of payload length and IAT for UDP traffic.
  • Figure 4: Average per-flow CDF comparison of payload length and IAT for Facebook audio traffic.
  • Figure 5: Average per-flow CDF comparison of payload length and IAT for Facebook video traffic.