MUSE-FM: Multi-task Environment-aware Foundation Model for Wireless Communications
Tianyue Zheng, Jiajia Guo, Linglong Dai, Shi Jin, Jun Zhang
TL;DR
This work addresses the challenge of unifying multiple wireless transceiver tasks under a single foundation model, with emphasis on environment-aware, cross-scenario generalization. It introduces MUSE-FM, a unified framework that combines a scene-graph-based environment encoder, a hypernetwork-driven task instruction module, and a prompt-guided unified data encoder-decoder pair with a Transformer backbone to handle channel estimation, MIMO detection, precoding, decoding, and localization. Key contributions include the integration of environmental priors via scene graphs, a scalable prompt-guided data processing pipeline, and a multi-task, multi-scenario dataset used to demonstrate improved performance, robustness at low SNR, and strong few-shot and zero-shot capabilities. The results show that MUSE-FM outperforms baselines across tasks, supports cross-scenario learning, and incurs lower memory and latency overhead than deploying separate task-specific models, which makes it a practical and scalable approach to wireless intelligence. The framework paves the way for end-to-end transceiver optimization that leverages cross-task and cross-environment knowledge in 6G and later deployments.
Abstract
Recent advancements in foundation models (FMs) have attracted increasing attention in the wireless communication domain. Leveraging their powerful multi-task learning capability, FMs hold the promise of unifying multiple wireless communication tasks within a single framework. Nevertheless, existing wireless FMs are limited in their ability to handle, within one uniform architecture, multiple tasks with diverse inputs and outputs across different communication scenarios. In this paper, we propose a MUlti-taSk Environment-aware FM (MUSE-FM) with a unified architecture that handles multiple tasks in wireless communications while effectively incorporating scenario information. Specifically, to achieve task uniformity, we propose a unified prompt-guided data encoder-decoder pair to handle data with heterogeneous formats and distributions across different tasks. In addition, we integrate the environmental context as a multi-modal input, which serves as prior knowledge of the environment and channel distributions and facilitates cross-scenario feature extraction. Simulation results show that the proposed MUSE-FM outperforms existing methods on various tasks, and that its prompt-guided encoder-decoder pair facilitates few-shot adaptation to new task configurations. Moreover, incorporating environment information improves the model's ability to adapt to different scenarios.
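To make the idea of a unified data format concrete, the following toy sketch shows one way that inputs of different sizes could be mapped to a common token layout with a task prompt in front, and then mapped back. It is not the encoder-decoder pair proposed in the paper, which is learned; the patch length `PATCH`, the one-hot prompt, and the function names `task_prompt`, `encode`, and `decode` are all assumptions introduced here for illustration.

```python
from typing import List, Sequence

# Hypothetical token width; the paper does not specify this value.
PATCH = 8
# The five tasks named in the summary above.
TASKS = ("channel_estimation", "mimo_detection", "precoding",
         "decoding", "localization")


def task_prompt(task: str) -> List[float]:
    """One-hot task prompt, padded to the token width (illustrative only)."""
    prompt = [0.0] * PATCH
    prompt[TASKS.index(task)] = 1.0
    return prompt


def encode(samples: Sequence[complex], task: str) -> List[List[float]]:
    """Split complex samples into real values, zero-pad, and cut into tokens.

    The first token is the task prompt; the rest carry the data.
    """
    flat = [z.real for z in samples] + [z.imag for z in samples]
    flat += [0.0] * ((-len(flat)) % PATCH)  # pad to a multiple of PATCH
    tokens = [flat[i:i + PATCH] for i in range(0, len(flat), PATCH)]
    return [task_prompt(task)] + tokens


def decode(tokens: Sequence[Sequence[float]], n: int) -> List[complex]:
    """Drop the prompt token and padding, then rebuild n complex samples."""
    flat = [v for tok in tokens[1:] for v in tok][:2 * n]
    return [complex(re, im) for re, im in zip(flat[:n], flat[n:])]


# Two tasks with differently sized inputs share the same token width.
channel = [complex(k, -k) for k in range(12)]    # e.g. a flattened 4x3 matrix
received = [complex(0.5 * k, 1.0) for k in range(10)]
tokens_ce = encode(channel, "channel_estimation")
tokens_det = encode(received, "mimo_detection")
```

In this sketch both inputs become sequences of length-8 tokens, so a single sequence model could in principle consume either one, with the prompt token indicating which task the data belongs to.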
