Heterogeneous Multi-agent Collaboration in UAV-assisted Mobile Crowdsensing Networks
Xianyang Deng, Wenshuai Liu, Yaru Fu, Qi Zhu
TL;DR
The paper addresses the joint optimization of time-slot partitioning, user-UAV association, bandwidth and computation resource allocation, and UAV trajectory in a multi-UAV mobile crowdsensing (MCS) network. It formulates the problem as a non-convex stochastic optimization and recasts it as a partially observable Markov decision process (POMDP), which is solved with CKAN-HAPPO, a multi-agent deep reinforcement learning (MADRL) framework that combines a CNN with a Kolmogorov-Arnold Network (KAN). The key innovations are the closed-form reduction of the user computation frequency and a hybrid CNN+KAN actor whose spline activations model complex state-action dependencies efficiently. Numerical results show that CKAN-HAPPO outperforms the CNN-HAPPO and MLP-HAPPO baselines. The amount of processed data grows as more UAVs are deployed, but with diminishing returns because of bandwidth limits, which indicates practical gains for UAV-assisted MCS.
Abstract
Unmanned aerial vehicle (UAV)-assisted mobile crowdsensing (MCS) has emerged as a promising paradigm for data collection. However, challenges such as spectrum scarcity, device heterogeneity, and user mobility hinder efficient coordination of sensing, communication, and computation. To tackle these issues, we propose a joint optimization framework that integrates time-slot partitioning among the sensing, communication, and computation phases, resource allocation, and UAV 3D trajectory planning, with the aim of maximizing the amount of processed sensing data. The problem is formulated as a non-convex stochastic optimization and further modeled as a partially observable Markov decision process (POMDP) that can be solved by a multi-agent deep reinforcement learning (MADRL) algorithm. To overcome the limitations of conventional multi-layer perceptron (MLP) networks, we design a novel MADRL algorithm with a hybrid actor network. The proposed method is based on heterogeneous agent proximal policy optimization (HAPPO) and uses convolutional neural networks (CNNs) for feature extraction and Kolmogorov-Arnold networks (KANs) to capture structured state-action dependencies. Extensive numerical results demonstrate that the proposed method achieves significant improvements in the amount of processed sensing data compared with other benchmarks.
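To make the KAN component concrete, the following is a minimal sketch of a single KAN-style layer, not the authors' implementation. Where an MLP applies a fixed activation after a linear map, a KAN layer places a learnable univariate function on every input-output edge and sums the results. Here each edge function is assumed to be a piecewise-linear spline (order-1 B-spline, or "hat" basis) on a fixed uniform grid; the names `hat_basis` and `KANLayer`, the grid range, and the knot count are all illustrative assumptions. The CNN feature extractor and the HAPPO training loop are omitted.

```python
import numpy as np


def hat_basis(x, grid):
    """Evaluate order-1 B-spline (hat) basis functions on a uniform grid.

    x:    array of shape (batch, in_dim)
    grid: array of shape (K,), uniformly spaced knots
    Returns an array of shape (batch, in_dim, K).
    """
    h = grid[1] - grid[0]
    # Assumption: inputs outside the grid are clamped to its end points.
    x = np.clip(x, grid[0], grid[-1])
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - grid) / h)


class KANLayer:
    """Hypothetical KAN-style layer: y_o = sum_i phi_{i,o}(x_i).

    Each edge function phi_{i,o} is a linear spline whose values at the
    knots are the learnable coefficients coef[i, o, :].
    """

    def __init__(self, in_dim, out_dim, num_knots=8, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_min, x_max, num_knots)
        self.coef = rng.normal(0.0, 0.1, size=(in_dim, out_dim, num_knots))

    def __call__(self, x):
        basis = hat_basis(x, self.grid)  # (batch, in_dim, K)
        # Sum over input dimensions i and knots k for every output o.
        return np.einsum("bik,iok->bo", basis, self.coef)


if __name__ == "__main__":
    layer = KANLayer(in_dim=4, out_dim=2)
    features = np.random.default_rng(1).uniform(-1.0, 1.0, size=(3, 4))
    print(layer(features).shape)
```

In this sketch the spline coefficients play the role of the trainable weights. In a hybrid actor of the kind the abstract describes, the input `features` would be the output of the CNN feature extractor rather than random values.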
