Multimodal Mixture-of-Experts for ISAC in Low-Altitude Wireless Networks

Kai Zhang, Wentao Yu, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

TL;DR

This work tackles the challenge of robust, low-latency ISAC in low-altitude wireless networks (LAWNs) by introducing a multimodal mixture-of-experts (MoE) framework that adaptively weights modality-specific experts through a lightweight gating network. A sparse variant activates only the top-N experts, using straight-through gradient routing, which reduces energy and computation on UAVs while maintaining performance. Across three representative ISAC tasks (sensing-aided beam prediction, sensing-aided path loss prediction, and communication-aided UAV trajectory tracking), the MoE models consistently outperform static fusion and monolithic baselines and show improved training sample efficiency. The results indicate that adaptive, modality-aware fusion is crucial for reliable perception and connectivity in dynamic aerial environments, supporting practical deployment in LAWNs.

Abstract

Integrated sensing and communication (ISAC) is a key enabler for low-altitude wireless networks (LAWNs), providing simultaneous environmental perception and data transmission in complex aerial scenarios. By combining heterogeneous sensing modalities such as visual, radar, lidar, and positional information, multimodal ISAC can improve both situational awareness and robustness of LAWNs. However, most existing multimodal fusion approaches use static fusion strategies that treat all modalities equally and cannot adapt to channel heterogeneity or time-varying modality reliability in dynamic low-altitude environments. To address this fundamental limitation, we propose a mixture-of-experts (MoE) framework for multimodal ISAC in LAWNs. Each modality is processed by a dedicated expert network, and a lightweight gating module adaptively assigns fusion weights according to the instantaneous informativeness and reliability of each modality. To improve scalability under the stringent energy constraints of aerial platforms, we further develop a sparse MoE variant that selectively activates only a subset of experts, thereby reducing computation overhead while preserving the benefits of adaptive fusion. Comprehensive simulations on three typical ISAC tasks in LAWNs demonstrate that the proposed frameworks consistently outperform conventional multimodal fusion baselines in terms of learning performance and training sample efficiency.
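
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gating scores, the modality names, and the choice to renormalize with a softmax over the selected experts are all assumptions made for illustration. Only the forward pass is shown; the straight-through gradient routing mentioned in the TL;DR needs an autodiff framework and is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_gate(scores, k):
    """Keep the k largest gating scores and zero out the rest.

    Assumption: the kept scores are renormalized with a softmax so the
    active weights sum to 1; the paper's exact rule may differ.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k highest-scoring experts.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept = softmax([scores[i] for i in top])
    weights = [0.0] * len(scores)
    for i, w in zip(top, kept):
        weights[i] = w
    return weights

# Hypothetical gating scores, one per modality (illustrative values only).
modalities = ["vision", "radar", "lidar", "position"]
scores = [2.0, 0.5, 1.0, -1.0]

dense_weights = softmax(scores)            # dense MoE: every expert contributes
sparse_weights = sparse_gate(scores, k=2)  # sparse MoE: only the top 2 are active
```

In the dense case every expert must be evaluated, whereas in the sparse case experts with zero weight can be skipped entirely, which is where the computation saving would come from.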

Paper Structure

This paper contains 28 sections, 35 equations, 8 figures, 1 algorithm.

Figures (8)

  • Figure 1: An illustration of multimodal ISAC in LAWNs, where a ground BS and UAVs exploit heterogeneous sensing data to support sensing-aided beam prediction, sensing-aided channel estimation, and communication-aided UAV trajectory tracking.
  • Figure 2: Architecture of the proposed multimodal MoE framework. Modality-specific experts extract features from heterogeneous inputs, which are aggregated via weighted summation using coefficients from a gating network to form a fused representation for the prediction head.
  • Figure 3: Top-1 beam prediction accuracy across different combinations of sensing modalities.
  • Figure 4: Illustration of a representative scene from the multimodal ISAC dataset in LAWNs, showing the environment from the ground BS and UAV perspectives. UAVs are marked in red boxes.
  • Figure 5: Comparison of Top-1 beam prediction accuracy for the proposed multimodal MoE frameworks and benchmark methods.
  • ...and 3 more figures
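
The weighted summation described in the Figure 2 caption can be sketched as follows. This is a hedged illustration only: the function name `fuse`, the feature vectors, and the weights are hypothetical, and real expert outputs would come from trained modality-specific networks rather than hand-written lists.

```python
def fuse(expert_features, weights):
    """Weighted sum of per-modality feature vectors.

    expert_features: list of equal-length feature vectors, one per expert.
    weights: one gating coefficient per expert.
    Experts with zero weight are skipped, mirroring sparse activation.
    """
    if len(expert_features) != len(weights):
        raise ValueError("need exactly one weight per expert")
    dim = len(expert_features[0])
    fused = [0.0] * dim
    for feat, w in zip(expert_features, weights):
        if w == 0.0:
            continue  # inactive expert contributes nothing
        if len(feat) != dim:
            raise ValueError("all expert features must share one dimension")
        for j, x in enumerate(feat):
            fused[j] += w * x
    return fused

# Hypothetical 2-dimensional features from three experts.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.25, 0.25]

# 0.5*[1, 0] + 0.25*[0, 1] + 0.25*[1, 1] = [0.75, 0.5]
fused = fuse(features, weights)
```

The fused vector would then be passed to a task-specific prediction head, as the caption describes.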