CALF: Communication-Aware Learning Framework for Distributed Reinforcement Learning

Carlos Purves; Pietro Lio'

Abstract

Distributed reinforcement learning policies face network delays, jitter, and packet loss when deployed across edge devices and cloud servers. Standard RL training assumes zero-latency interaction, so policies trained this way degrade severely under realistic network conditions. We introduce CALF (Communication-Aware Learning Framework), which trains policies under realistic network models during simulation. Systematic experiments demonstrate that network-aware training substantially reduces the deployment performance gap relative to network-agnostic baselines. Distributed policy deployments across heterogeneous hardware validate that explicitly modelling communication constraints during training enables robust real-world execution. These findings establish network conditions as a major axis of sim-to-real transfer for Wi-Fi-like distributed deployments, complementing physics and visual domain randomisation.

Paper Structure (39 sections, 3 figures, 9 tables)

Introduction
Background and Positioning
Delays, Packet Loss, and What RL Typically Assumes
Sim-to-Real Transfer and Distributed RL: The Missing Network Axis
Edge, Multi-Agent, and Other Network-Aware ML Contexts
CALF: A Framework for Network-Aware Reinforcement Learning
Design Goals
Architecture Overview
NetworkShim: The Core Mechanism
Progressive Deployment Modes
Summary
Network-Aware Training Methodology
Problem Formulation: Delayed MDPs
Training Regimes: Comparing Network-Awareness
RL Algorithm: PPO
...and 24 more sections

Figures (3)

Figure 1: Real-world distributed systems employ hybrid communication strategies to maintain operation under varying network conditions. Unmanned aerial vehicles, for instance, choose between multiple communication channels (satellite, radio frequency, optical tether) based on signal quality and operational constraints, defaulting to autonomous operation when no reliable connection exists. This illustrates the challenge CALF addresses: policies must function across heterogeneous network conditions rather than assuming perfect connectivity. We focus on LAN-like scenarios (Wi-Fi, Ethernet); WAN/adversarial scenarios motivate the need for configurable impairments but are not evaluated here.
Figure 2: CALF's three progressive deployment modes enable incremental validation from pure simulation to distributed deployment. Mode 1 (Local Sim) provides a zero-latency baseline for rapid development with environment and policy co-located. Mode 2 (Sim + Simulated Network) introduces NetworkShim services that inject realistic latency, jitter, and packet loss for network-aware training. Mode 3 (Edge Sim) validates distributed deployment on real hardware (Raspberry Pi for environment, Desktop for policy) communicating over real Wi-Fi/Ethernet networks. This progressive approach ensures that network-aware policies trained in Mode 2 transfer successfully to distributed edge deployment in Mode 3, addressing the network axis of the sim-to-real gap.
Figure 3: CartPole: Performance comparison across deployment modes for each training regime. Full network-aware training maintains robust performance under real network conditions, while baseline training exhibits severe degradation. Delay-only training provides partial robustness, demonstrating the necessity of modelling jitter and packet loss in addition to latency. Error bars represent $\pm$1 std across 10 seeds.
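
The impairments that Figure 2 attributes to Mode 2 (latency, jitter, and packet loss) can be illustrated with a small step-based channel model. This is a minimal sketch under stated assumptions, not CALF's actual NetworkShim: the class name `ImpairedChannel`, its parameters, the discrete-step timing, and the uniform jitter model are all hypothetical choices made for illustration.

```python
import random


class ImpairedChannel:
    """Hypothetical stand-in for a network shim (not the CALF NetworkShim API).

    Time is counted in discrete environment steps. Each sent message is
    either dropped or delivered after a base delay plus uniform jitter.
    """

    def __init__(self, base_delay_steps=2, jitter_steps=1, loss_prob=0.1, seed=0):
        self.base_delay_steps = base_delay_steps  # assumed fixed latency, in steps
        self.jitter_steps = jitter_steps          # assumed max deviation, in steps
        self.loss_prob = loss_prob                # assumed independent drop probability
        self._rng = random.Random(seed)           # seeded for reproducible runs
        self._in_flight = []                      # (arrival_step, message) pairs
        self._now = 0                             # current step counter

    def send(self, message):
        """Queue a message; return False if it was dropped."""
        if self._rng.random() < self.loss_prob:
            return False
        jitter = self._rng.randint(-self.jitter_steps, self.jitter_steps)
        delay = max(0, self.base_delay_steps + jitter)
        self._in_flight.append((self._now + delay, message))
        return True

    def step(self):
        """Advance one step and return the messages that are due."""
        self._now += 1
        arrived = [m for t, m in self._in_flight if t <= self._now]
        self._in_flight = [(t, m) for t, m in self._in_flight if t > self._now]
        return arrived


# Usage: a lossless channel with a fixed two-step delay and no jitter.
channel = ImpairedChannel(base_delay_steps=2, jitter_steps=0, loss_prob=0.0)
channel.send("obs_0")
first = channel.step()   # nothing is due after one step
second = channel.step()  # the message is due after two steps
```

In a training loop, one such channel could sit on the observation path and another on the action path, so that the policy acts on stale or missing observations in the way a distributed deployment would require. Because jitter is drawn per message, later messages can overtake earlier ones in this sketch.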
