Non-iterative Optimization of Trajectory and Radio Resource for Aerial Network

Hyeonsu Lyu, Jonggyu Jang, Harim Lee, Hyun Jong Yang

TL;DR

This work tackles joint trajectory planning, user association, resource allocation, and power control for an aerial IoT network with the goal of proportional fairness under end-to-end QoS. It exposes the drawbacks of traditional coordinate optimization and the curse of initialization, then introduces a non-iterative framework that reformulates the problem as an MDP with temporal decoupling, enabling separate, per-slot RRM optimization. A generalized water-filling approach with Lagrangian/KKT techniques yields an efficient per-slot solver, while GA, DFS, and DQN-based trajectory planning solve the resulting MDP with strong empirical performance that nearly reaches a global optimum. The method improves fairness, increases the number of served devices, and remains robust across bandwidth, QoS, and network size, offering practical benefits for deployment of UAV-based aerial IoT systems.
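As a rough illustration of the MDP view with per-slot decoupling, the sketch below runs a depth-limited DFS over UAV positions. It is a toy under stated assumptions, not the authors' implementation: `MOVES`, `slot_rates`, and `dfs_plan` are hypothetical names, the grid moves are invented, and `slot_rates` is a stand-in for the per-slot radio resource solver (it simply lets the rate decay with squared distance). The reward is the per-slot gain in the sum of log cumulative rates, with cumulative rates starting at 1.

```python
import math

# Hypothetical action set: stay, or move one grid cell in any axis direction.
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def slot_rates(pos, users):
    """Stand-in for the per-slot radio resource solver (assumption):
    each user's rate decays with squared distance to the UAV position."""
    return [1.0 / (1.0 + (pos[0] - ux) ** 2 + (pos[1] - uy) ** 2)
            for ux, uy in users]


def dfs_plan(pos, cum, users, depth):
    """Depth-limited DFS. Returns (total log-utility gain, list of positions).

    State  : current position and cumulative rates `cum`.
    Action : the next position.
    Reward : increase of sum(log(cumulative rate)) in that slot.
    """
    if depth == 0:
        return 0.0, []
    best_gain, best_path = -math.inf, []
    for dx, dy in MOVES:
        nxt = (pos[0] + dx, pos[1] + dy)
        # Radio resources are solved for this slot only, given the position.
        rates = slot_rates(nxt, users)
        new_cum = [c + r for c, r in zip(cum, rates)]
        reward = sum(math.log(n) - math.log(c) for n, c in zip(new_cum, cum))
        gain, path = dfs_plan(nxt, new_cum, users, depth - 1)
        if reward + gain > best_gain:
            best_gain, best_path = reward + gain, [nxt] + path
    return best_gain, best_path


if __name__ == "__main__":
    gain, path = dfs_plan((0, 0), [1.0], [(2, 0)], depth=2)
    print(path, gain)
```

Because the radio resources are re-solved at every visited position, the search never needs an initial trajectory; a genetic algorithm or a DQN could replace the DFS loop while reusing the same per-slot stub.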

Abstract

We address a joint trajectory planning, user association, resource allocation, and power control problem to maximize proportional fairness in an aerial IoT network, considering practical end-to-end quality-of-service (QoS) and communication schedules. The problem is well studied, yet previous approaches have never considered user- and time-specific QoS. Beyond that gap, we point out a prevalent mistake in the coordinate optimization approaches adopted by the majority of the literature. Coordinate optimization approaches, which repeatedly optimize radio resources for a fixed trajectory and vice versa, generally converge to local optima when all variables are differentiable. However, in mixed-integer problems such as joint trajectory and radio resource optimization, these methods often stagnate at a non-stationary point, significantly degrading the network utility. We sidestep this issue by converting the formulated problem into a Markov decision process (MDP). Exploiting the beneficial characteristics of the MDP, we design a non-iterative framework that cooperatively optimizes trajectory and radio resources without requiring an initial trajectory. The proposed framework can incorporate various trajectory-planning algorithms such as the genetic algorithm, tree search, and reinforcement learning. Extensive comparisons with diverse baselines verify that the proposed framework significantly outperforms the state-of-the-art method, nearly achieving the global optimum. Our implementation code is available at https://github.com/hslyu/dbspf.
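The stagnation described above can be reproduced on a deliberately tiny example. The sketch below is a toy with made-up numbers, not the paper's system model: `UTILITY` is a hypothetical table indexed by a discrete position `q` and a discrete association `a`, and both function names are illustrative. Alternating over one variable at a time stalls at the starting position, while sweeping every position and solving the inner problem for each one reaches the best entry.

```python
# Hypothetical utility table: rows are UAV positions q, columns are associations a.
UTILITY = [
    [3.0, 1.0],
    [2.0, 2.0],
    [1.0, 5.0],
]


def coordinate_optimization(q_init, max_iters=10):
    """Alternate: best a for the fixed q, then best q for the fixed a."""
    q, a = q_init, 0
    for _ in range(max_iters):
        a_new = max(range(len(UTILITY[q])), key=lambda a_: UTILITY[q][a_])
        q_new = max(range(len(UTILITY)), key=lambda q_: UTILITY[q_][a_new])
        if (q_new, a_new) == (q, a):
            break  # no single-variable update improves the utility
        q, a = q_new, a_new
    return q, a, UTILITY[q][a]


def hierarchical_optimization():
    """Sweep every q and solve the inner problem over a for each one."""
    best = None
    for q in range(len(UTILITY)):
        a = max(range(len(UTILITY[q])), key=lambda a_: UTILITY[q][a_])
        if best is None or UTILITY[q][a] > best[2]:
            best = (q, a, UTILITY[q][a])
    return best


if __name__ == "__main__":
    print(coordinate_optimization(q_init=0))  # stalls at the initial position
    print(hierarchical_optimization())        # reaches the best table entry
```

Starting from `q_init=0`, the alternating scheme returns utility 3.0 even though the table contains 5.0, which mirrors the dependence on initialization that the abstract criticizes.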

Paper Structure (37 sections, 1 theorem, 50 equations, 18 figures, 5 tables, 3 algorithms)


Key Result

Proposition 1

The following inequality holds: where $R_i^{(0)}=1$ for all $i\in\mathcal{I}$, $\mathbf{A}^{(t)} = \{i \in \mathcal{I} | \alpha_i^{(t)}=1\}$, $\mathbf{B}^{(t)} = [\beta_i^{(t)}]_{i\in \mathcal{I}}$, and $\mathbf{P}^{(t)} = [\rho_i^{(t)}]_{i\in \mathcal{I}}$.

Figures (18)

  • Figure 1: An illustrative failure scenario in coordinate optimization. Stages 0-2 demonstrate how coordinate optimization approaches converge to a non-stationary point. Once the initial $q$ is chosen (Stage 0), the following $a$ is determined accordingly (Stage 1). Then, the next iteration fails to find the optimal $q$ (Stage 2). In contrast, our approach adopts a hierarchical optimization that sweeps the entire variable space.
  • Figure 2: Motivating illustration of our contribution. Graphs (a) and (b) represent iterative optimization (GA-ITER), and graph (c) represents the proposed non-iterative approach (GA-TP). The two methods share the same environment configurations and algorithms except for the optimization order. While GA-ITER optimizes radio resources for a given initial trajectory, GA-TP adaptively optimizes radio resources according to the changing trajectory. Note that the initialization in (a) limits how far the updated trajectory can change, leading to a significant difference between the resulting trajectories in (b) and (c).
  • Figure 3: Illustration of the system model at time $t=20$. The dark gray marks represent users whose request periods have already expired. The green marks with dashed links represent users who are being served at the current time. The orange marks represent users whose request periods have not expired but who are not currently being served. A user who has not expired may also go unserved because its request period has not yet started. The UAV-BS selects the users to serve by jointly optimizing TP, UA, RA, and PC.
  • Figure 4: Illustration of the objective in problem $\mathcal{P}2$ from the viewpoint of an MDP. For each time slot $t$, the state is given as all variables before $t$; the action is $\mathbf{q}^{(t)}$; and the reward is $f(\mathbf{q}^{(t)})$.
  • Figure 5: Visualization of the water-filling solution $\beta_i^{(t)}$ in \ref{eq:beta_given_alpha} for 7 users. (a) The dashed line indicates the water level. (b) The blue column indicates the minimum RA requirement $r_i/e_i^{(t)}$. (c) The grey column represents the given ground heights $1/w_i^{(t)}=\sum_{k=0}^{t-1}R_i^{(k)}/e_i^{(t)}$, which implies that the UAV-BS allocates a low bandwidth proportion to a user if the user has a high total received data and low spectral efficiency.
  • ...and 13 more figures
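
The water-filling picture in Figure 5 can be sketched numerically. The code below is a minimal illustration under stated assumptions rather than the paper's closed-form expression, which is not reproduced here: it assumes the allocation has the classic floored form $\beta_i = \max(\text{floor}_i, \nu - \text{ground}_i)$ with a normalized budget $\sum_i \beta_i = 1$, where `ground` plays the role of $1/w_i^{(t)}$ and `floor` the role of $r_i/e_i^{(t)}$. The function name, argument names, and the bisection search for the water level $\nu$ are all illustrative choices.

```python
def water_filling(ground, floor, budget=1.0, iters=100):
    """Floored water-filling (illustrative sketch).

    Finds a water level such that beta_i = max(floor_i, level - ground_i)
    sums to `budget`, and returns (level, betas).
    """
    if sum(floor) > budget:
        raise ValueError("minimum requirements exceed the budget")

    def total(level):
        return sum(max(f, level - g) for g, f in zip(ground, floor))

    # At `lo` every user sits on its floor; at `hi` the total is at least `budget`.
    lo = min(g + f for g, f in zip(ground, floor))
    hi = max(ground) + budget
    for _ in range(iters):  # bisection on the monotone function total(level)
        mid = 0.5 * (lo + hi)
        if total(mid) < budget:
            lo = mid
        else:
            hi = mid
    level = 0.5 * (lo + hi)
    return level, [max(f, level - g) for g, f in zip(ground, floor)]


if __name__ == "__main__":
    # Hypothetical numbers: the third user has a high ground, so it only gets its floor.
    level, betas = water_filling(ground=[0.1, 0.3, 0.9], floor=[0.05, 0.05, 0.2])
    print(level, betas)
```

In this example the water level settles near 0.6: the first two users receive roughly 0.5 and 0.3 of the bandwidth, while the third user, whose ground height exceeds the water level, receives only its minimum share of 0.2.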

Theorems & Definitions (1)

  • Proposition 1: Lower bound of Problem $\mathcal{P}1$