Carbon-aware decentralized dynamic task offloading in MIMO-MEC networks via multi-agent reinforcement learning

Mubshra Zulfiqar; Muhammad Ayzed Mirza; Basit Qureshi

TL;DR

Experimental results demonstrate that CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and Lyapunov-based baselines. The framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads.

Abstract

Massive Internet of Things (IoT) microservices require integrating renewable energy harvesting into mobile edge computing (MEC) for sustainable eScience infrastructures. Spatiotemporal mismatches between stochastic task arrivals and intermittent green energy, along with complex inter-user interference in multi-antenna (MIMO) uplinks, complicate real-time resource management. Traditional centralized optimization and off-policy reinforcement learning struggle with scalability and signaling overhead in dense networks. This paper proposes CADDTO-PPO, a carbon-aware decentralized dynamic task offloading framework based on multi-agent proximal policy optimization. The multi-user MIMO-MEC system is modeled as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP) to jointly minimize carbon emissions, buffer latency, and energy wastage. A scalable architecture utilizes decentralized execution with parameter sharing (DEPS), which enables autonomous IoT agents to make fine-grained power control and offloading decisions based solely on local observations. Additionally, a carbon-first reward structure adaptively prioritizes green time slots for data transmission to decouple system throughput from grid-dependent carbon footprints. Finally, experimental results demonstrate that CADDTO-PPO outperforms deep deterministic policy gradient (DDPG) and Lyapunov-based baselines. The framework achieves the lowest carbon intensity and maintains near-zero packet overflow rates under extreme traffic loads. Architectural profiling indicates constant $O(1)$ inference complexity and theoretical lightweight feasibility for future-generation sustainable IoT deployments.

Paper Structure (28 sections, 27 equations, 12 figures, 4 tables, 1 algorithm)

This paper contains 28 sections, 27 equations, 12 figures, 4 tables, 1 algorithm.

Introduction
Motivation and contribution
Related work
Carbon-aware computing and energy harvesting in MEC
MIMO-enabled edge computing and interference management
Decentralized DRL for task offloading
System model
Network architecture
Local computing
Edge computing model
Communication model
Computation model
Carbon emission model
Energy wastage model
Problem formulation
...and 13 more sections

Figures (12)

Figure 1: Framework of the carbon-aware MIMO-MEC system
Figure 2: System architecture of the multi-user uplink MIMO-MEC system. The BS monitors SINR and provides feedback to decentralized agents for autonomous power control.
Figure 3: Multi-agent PPO architecture with centralized training and decentralized execution using shared parameters
Figure 4: Schematic representation of the local observation and decentralized decision-making process of a user agent at slot $t$
Figure 5: Training convergence of CADDTO-PPO and Centralized-PPO.
...and 7 more figures
