Table of Contents
Fetching ...

Carbon-aware decentralized dynamic task offloading in MIMO-MEC networks via multi-agent reinforcement learning

Mubshra Zulfiqar, Muhammad Ayzed Mirza, Basit Qureshi

TL;DR

Experimental results demonstrate CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and lyapunov-based baselines, the framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads.

Abstract

Massive internet of things microservices require integrating renewable energy harvesting into mobile edge computing (MEC) for sustainable eScience infrastructures. Spatiotemporal mismatches between stochastic task arrivals and intermittent green energy along with complex inter-user interference in multi-antenna (MIMO) uplinks complicate real-time resource management. Traditional centralized optimization and off-policy reinforcement learning struggle with scalability and signaling overhead in dense networks. This paper proposes CADDTO-PPO, a carbon-aware decentralized dynamic task offloading framework based on multi-agent proximal policy optimization. The multi-user MIMO-MEC system is modeled as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP) to jointly minimize carbon emissions and buffer latency and energy wastage. A scalable architecture utilizes decentralized execution with parameter sharing (DEPS), which enables autonomous IoT agents to make fine-grained power control and offloading decisions based solely on local observations. Additionally, a carbon-first reward structure adaptively prioritizes green time slots for data transmission to decouple system throughput from grid-dependent carbon footprints. Finally, experimental results demonstrate CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and lyapunov-based baselines. The framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads. Architectural profiling validates the framework to demonstrate a constant $O(1)$ inference complexity and theoretical lightweight feasibility for future generation sustainable IoT deployments.

Carbon-aware decentralized dynamic task offloading in MIMO-MEC networks via multi-agent reinforcement learning

TL;DR

Experimental results demonstrate CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and lyapunov-based baselines, the framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads.

Abstract

Massive internet of things microservices require integrating renewable energy harvesting into mobile edge computing (MEC) for sustainable eScience infrastructures. Spatiotemporal mismatches between stochastic task arrivals and intermittent green energy along with complex inter-user interference in multi-antenna (MIMO) uplinks complicate real-time resource management. Traditional centralized optimization and off-policy reinforcement learning struggle with scalability and signaling overhead in dense networks. This paper proposes CADDTO-PPO, a carbon-aware decentralized dynamic task offloading framework based on multi-agent proximal policy optimization. The multi-user MIMO-MEC system is modeled as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP) to jointly minimize carbon emissions and buffer latency and energy wastage. A scalable architecture utilizes decentralized execution with parameter sharing (DEPS), which enables autonomous IoT agents to make fine-grained power control and offloading decisions based solely on local observations. Additionally, a carbon-first reward structure adaptively prioritizes green time slots for data transmission to decouple system throughput from grid-dependent carbon footprints. Finally, experimental results demonstrate CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and lyapunov-based baselines. The framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads. Architectural profiling validates the framework to demonstrate a constant inference complexity and theoretical lightweight feasibility for future generation sustainable IoT deployments.
Paper Structure (28 sections, 27 equations, 12 figures, 4 tables, 1 algorithm)

This paper contains 28 sections, 27 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: Framework of the carbon-aware MIMO-MEC system
  • Figure 2: System architecture of the multi-user uplink MIMO-MEC system. The BS monitors SINR and provides feedback to decentralized agents for autonomous power control.
  • Figure 3: Multi-agent PPO architecture with centralized training decentralized execution using shared parameters
  • Figure 4: Schematic representation of the local observation and decentralized decision-making process of user agent at slot $t$
  • Figure 5: Training convergence of CADDTO-PPO and Centralized-PPO.
  • ...and 7 more figures