Task Delay and Energy Consumption Minimization for Low-altitude MEC via Evolutionary Multi-objective Deep Reinforcement Learning

Geng Sun; Weilong Ma; Jiahui Li; Zemin Sun; Jiacheng Wang; Dusit Niyato; Shiwen Mao

TL;DR

The paper addresses the joint minimization of total task delay $f_1$ and UAV energy consumption $f_2$ in a UAV-assisted MEC system for the low-altitude economy (LAE). It models the problem as a multi-objective Markov decision process (MOMDP) and proposes an evolutionary multi-objective DRL approach (EMODRL) that combines a multi-objective target distribution learning (TDL) component with simulated-annealing-based (SA) scheduling to reduce the action space. In simulations, the resulting EMO-TDL-SA framework produces a set of non-dominated (Pareto) policies and shows better convergence and delay-energy trade-offs than the compared baselines. The approach supports Pareto-aware control of UAV trajectory and offloading decisions, so a policy can be chosen to match the requirements and conditions of a given LAE deployment.

Abstract

The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring. In the upcoming sixth-generation (6G) era, UAV-assisted mobile edge computing (MEC) is particularly crucial in challenging environments such as mountainous or disaster-stricken areas. The computation task offloading problem is one of the key issues in UAV-assisted MEC, primarily addressing the trade-off between minimizing the task delay and the energy consumption of the UAV. In this paper, we consider a UAV-assisted MEC system where the UAV carries the edge servers to facilitate task offloading for ground devices (GDs), and formulate a calculation delay and energy consumption multi-objective optimization problem (CDECMOP) to simultaneously improve the performance and reduce the cost of the system. Then, by modeling the formulated problem as a multi-objective Markov decision process (MOMDP), we propose a multi-objective deep reinforcement learning (DRL) algorithm within an evolutionary framework to dynamically adjust the weights and obtain non-dominated policies. Moreover, to ensure stable convergence and improve performance, we incorporate a target distribution learning (TDL) algorithm. Simulation results demonstrate that the proposed algorithm can better balance multiple optimization objectives and obtain superior non-dominated solutions compared to other methods.
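
The abstract refers to adjusting objective weights and to obtaining non-dominated policies. As a minimal sketch of those two ideas, assuming each candidate policy is summarized by a pair (total task delay, UAV energy consumption) with both values minimized, the Python below filters a set of candidates down to its non-dominated subset and shows a plain weighted-sum scalarization. The function names, the candidate values, and the weighted-sum form are illustrative assumptions; the paper's actual reward design and evolutionary update are not given in this summary.

```python
from typing import List, Tuple

# (total task delay f1, UAV energy consumption f2); both are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def scalarize(point: Objectives, weights: Tuple[float, float]) -> float:
    """Weighted-sum cost that collapses the two objectives into one scalar (assumed form)."""
    return sum(w * x for w, x in zip(weights, point))


# Hypothetical outcomes of four candidate policies; the numbers are illustrative only.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 6.0), (9.0, 9.0)]
front = non_dominated(candidates)
print(front)
```

Here `(12.0, 6.0)` is dropped because `(10.0, 5.0)` is better in both objectives, and `(9.0, 9.0)` is dropped because `(8.0, 7.0)` is. The two remaining points represent different delay-energy trade-offs, and different weight vectors passed to `scalarize` would favor one or the other.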

Paper Structure (25 sections, 29 equations, 8 figures, 2 tables, 5 algorithms)

Introduction
Related Work
Models and Preliminaries
System Overview
Task Model
Communication Model
Computational Model
UAV Movement Model
Problem Formulation
EMODRL-based Approach
MOMDP Simplification and Formulation
Action Simplification
MOMDP Formulation
The Proposed EMO-TDL-SA
Evolutionary Multi-objective Optimization Framework Overview
...and 10 more sections

Figures (8)

Figure 1: A schematic diagram of the system for task computation assisted by a UAV equipped with an edge server.
Figure 2: EMO-TDL-SA framework
Figure 3: TDL Framework
Figure 4: Convergence curves of various DRL algorithms. (a) Comparative convergence results of EMO-PPO-SA, EMO-TDL-SA, ETD3, and ESAC algorithms (highlighting the general convergence trends and performance differences). (b) Comparative convergence results of EMO-PPO-SA, EMO-TDL-SA, ETD3, and ESAC algorithms (highlighting the initial convergence in early training episodes).
Figure 5: Convergence trend for EMO-TDL with FCFS, SJF, SA, and PS scheduling strategies.
...and 3 more figures
