Multimodal-Wireless: A Large-Scale Dataset for Sensing and Communication

Tianhao Mao, Le Liang, Jie Yang, Hao Ye, Shi Jin, Geoffrey Ye Li

TL;DR

The paper addresses the lack of large-scale multimodal datasets that jointly capture sensing data and wireless channel information for context-aware communications and collaborative perception. It presents Multimodal-Wireless, a scalable, open-source dataset produced by an integrated pipeline that uses CARLA, Blender, and Sionna to synchronize LiDAR, RGB/depth camera, IMU, and radar data with high-fidelity CSI at 100 Hz under varied weather. Key contributions include a configurable data-generation framework, vehicle-to-everything (V2X) channel data covering line-of-sight (LOS) paths and first-order reflections, weather diversity, and a Python utility that synthesizes frequency-domain channels from path data using $H(f_k)=\sum_{m=1}^{M} A_m e^{-j2\pi f_k \tau_m}$. A case study on multimodal LLM-based beam prediction shows that combining environmental sensing with CSI improves performance, and the dataset supports further research in V2X communication and collaborative perception.
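
The channel formula in the TL;DR can be illustrated with a minimal NumPy sketch. It is not the dataset's own utility: the function name `synthesize_channel`, the argument layout, and the example values are hypothetical, and the sketch assumes that each path $m$ is described by a complex gain $A_m$ and a delay $\tau_m$ in seconds, with $f_k$ given in Hz (whether these are absolute or baseband-offset frequencies is not stated in the source).

```python
import numpy as np

def synthesize_channel(amplitudes, delays, freqs):
    """Evaluate H(f_k) = sum_m A_m * exp(-j*2*pi*f_k*tau_m) for every f_k.

    amplitudes: complex path gains A_m, shape (M,)   (assumed layout)
    delays:     path delays tau_m in seconds, shape (M,)
    freqs:      frequencies f_k in Hz, shape (K,)
    Returns a complex array of shape (K,).
    """
    amplitudes = np.asarray(amplitudes, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Phase rotation of every path at every frequency, shape (K, M).
    phase = np.exp(-2j * np.pi * np.outer(freqs, delays))
    # Sum over the M paths.
    return phase @ amplitudes

# Toy example: an LOS path plus one reflection delayed by 100 ns.
amps = [1.0, 0.5j]
delays = [0.0, 1e-7]
freqs = np.array([0.0, 2.5e6, 5e6])
H = synthesize_channel(amps, delays, freqs)
```

With these toy values the delayed path rotates by a quarter turn every 2.5 MHz, so the three samples are $1+0.5j$, $1.5$, and $1-0.5j$, which shows how path delays translate into frequency selectivity.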

Abstract

This paper presents Multimodal-Wireless, a large-scale open-source dataset for multimodal sensing and communication research. The dataset is generated through an integrated and customizable data pipeline built upon the CARLA simulator and the Sionna framework, and features high-resolution communication channel state information (CSI) fully synchronized with five other sensor modalities, namely LiDAR, RGB camera, depth camera, inertial measurement unit (IMU), and radar, all sampled at 100 Hz. It contains approximately 160,000 frames collected across four virtual towns, sixteen communication scenarios, and three weather conditions. This paper provides a comprehensive overview of the dataset, outlining its key features, overall framework, and technical implementation details. In addition, it explores potential research applications in communication and collaborative perception, exemplified by beam prediction using a multimodal large language model. The dataset is available at https://le-liang.github.io/mmw/.
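
For readers unfamiliar with beam prediction, the sketch below shows one common way a ground-truth beam label can be derived from a channel vector: pick the codebook beam with the largest beamforming gain. This is an assumption for illustration only. The abstract does not specify the codebook, array geometry, or labeling rule, and the names `dft_codebook` and `best_beam` are hypothetical.

```python
import numpy as np

def dft_codebook(n_ant, n_beams):
    """DFT codebook for a uniform linear array (assumed geometry).

    Returns an array of shape (n_ant, n_beams) with unit-norm columns.
    """
    n = np.arange(n_ant)[:, None]
    b = np.arange(n_beams)[None, :]
    return np.exp(2j * np.pi * n * b / n_beams) / np.sqrt(n_ant)

def best_beam(h, codebook):
    """Return the index of the beam maximizing |w^H h|^2, and all gains."""
    gains = np.abs(codebook.conj().T @ h) ** 2
    return int(np.argmax(gains)), gains

# Toy example: a channel aligned with beam 3 of an 8-beam codebook.
codebook = dft_codebook(n_ant=8, n_beams=8)
h = codebook[:, 3]
label, gains = best_beam(h, codebook)
```

A beam predictor would then be trained to output `label` from sensing inputs, CSI, or both, rather than sweeping every beam in the codebook.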

Paper Structure

This paper contains 9 sections, 4 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Cross-platform data generation workflow for the Multimodal-Wireless dataset.
  • Figure 2: Illustration of sensor perception ranges with CAV trajectory.
  • Figure 3: A comparative overview of four standard CARLA simulation environments and their corresponding source models in Blender.
  • Figure 4: Prediction performance of the proposed multimodal LLM-based method compared with the method in bpllm.
  • Figure 5: Ablation study on different input modalities.