Energy-Efficient Power Control for Multiple-Task Split Inference in UAVs: A Tiny Learning-Based Approach
Chenxi Zhao, Min Sheng, Junyu Liu, Tianshu Chu, Jiandong Li
TL;DR
This work tackles energy-efficient power control for multi-task split inference on UAVs under tight energy and delay constraints. It introduces a two-timescale optimization in which discrete transmission-mode decisions are made by a tiny reinforcement learning (TRL) module, and continuous transmit power is optimized by an optimization-programming (OP) layer embedded between the TRL outputs and the reward. The authors prove that transmission energy decreases monotonically with increasing transmission time, enabling a closed-form subproblem solution via KKT conditions and an ADMM-based restoration across multiple channel samples, with sample-average approximation used to handle channel randomness. The proposed OPETRL framework achieves a higher probability of successful task completion with lower energy consumption than the baselines, demonstrating practical potential for energy-efficient aerial AI with limited onboard resources. The approach advances UAV edge inference by coupling mode selection, transmission timing, and power control in a scalable, learning-assisted optimization pipeline.
Abstract
The limited energy and computing resources of unmanned aerial vehicles (UAVs) hinder the application of aerial artificial intelligence. Split inference in UAVs has garnered significant attention because it effectively reduces computing and energy requirements. However, achieving energy-efficient split inference in UAVs remains challenging because several crucial factors, such as energy level and delay constraints, must be considered, especially when multiple tasks are involved. In this paper, we present a two-timescale approach for energy minimization in split inference, where discrete and continuous variables are segregated into two timescales to reduce the size of the action space and the computational complexity. This segregation enables the use of tiny reinforcement learning (TRL) for selecting discrete transmission modes for sequential tasks. Moreover, optimization programming (OP) is embedded between TRL's output and the reward function to optimize the continuous transmit power. Specifically, we replace the optimization of transmit power with that of transmission time to decrease the computational complexity of OP, since we reveal that energy consumption monotonically decreases with increasing transmission time. The replacement significantly reduces the feasible region and enables a fast solution according to the closed-form expression for the optimal transmit power. Simulation results show that the proposed algorithm achieves a higher probability of successful task completion with lower energy consumption than the baselines.
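The monotonicity claim above can be illustrated numerically. The sketch below is not the paper's system model. It assumes a single link with a Shannon-rate channel, so the power needed to deliver D bits in time t over bandwidth B is p(t) = (σ²/g)(2^(D/(Bt)) − 1), and the energy is E(t) = t · p(t). The function names and all parameter values are hypothetical and chosen only for illustration.

```python
def required_power(data_bits, tx_time_s, bandwidth_hz, channel_gain, noise_power_w):
    """Power (W) needed to send data_bits within tx_time_s.

    Assumption: Shannon-rate link, rate = B * log2(1 + p * g / sigma^2),
    inverted for p. This is an illustrative model, not the paper's.
    """
    rate_bps = data_bits / tx_time_s
    return (noise_power_w / channel_gain) * (2.0 ** (rate_bps / bandwidth_hz) - 1.0)


def transmission_energy(data_bits, tx_time_s, bandwidth_hz, channel_gain, noise_power_w):
    """Energy (J) = transmission time * required transmit power."""
    power_w = required_power(data_bits, tx_time_s, bandwidth_hz, channel_gain, noise_power_w)
    return tx_time_s * power_w


# Hypothetical parameters, for illustration only.
DATA_BITS = 1e6        # intermediate feature size to transmit
BANDWIDTH_HZ = 1e6
CHANNEL_GAIN = 1e-7
NOISE_POWER_W = 1e-10

tx_times_s = [0.1, 0.2, 0.5, 1.0, 2.0]
energies_j = [
    transmission_energy(DATA_BITS, t, BANDWIDTH_HZ, CHANNEL_GAIN, NOISE_POWER_W)
    for t in tx_times_s
]

for t, e in zip(tx_times_s, energies_j):
    print(f"t = {t:4.1f} s  ->  E = {e:.6f} J")
```

Under this assumed model, energy falls as the transmission time grows, so the energy-minimizing choice is the longest transmission time the delay constraint allows, and the transmit power then follows in closed form from `required_power`. This is the intuition behind optimizing transmission time instead of power.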
