Spatio-Temporal 3D Point Clouds from WiFi-CSI Data via Transformer Networks

Tuomas Määttä, Sasan Sharifipour, Miguel Bordallo López, Constantino Álvarez Casado

TL;DR

This work presents a transformer-based architecture that processes temporal Channel State Information (CSI) data, specifically amplitude and phase, to generate 3D point clouds of indoor environments. The system shows strong potential for accurate 3D reconstruction and effectively distinguishes between close and distant objects, advancing joint communication and sensing (JC&S) applications for spatial sensing in future wireless networks.

Abstract

Joint communication and sensing (JC&S) is emerging as a key component in 5G and 6G networks, enabling dynamic adaptation to environmental changes and enhancing contextual awareness for optimized communication. By leveraging real-time environmental data, JC&S improves resource allocation, reduces latency, and enhances power efficiency, while also supporting simulations and predictive modeling. This makes it a key technology for reactive systems and digital twins. These systems can respond to environmental events in real time, offering transformative potential in sectors such as smart cities, healthcare, and Industry 5.0, where adaptive and multimodal interaction is critical to real-time decision-making. In this work, we present a transformer-based architecture that processes temporal Channel State Information (CSI) data, specifically amplitude and phase, to generate 3D point clouds of indoor environments. The model uses multi-head attention to capture complex spatio-temporal relationships in CSI data and is adaptable to different CSI configurations. We evaluate the architecture on the MM-Fi dataset, using two different protocols to capture human presence in indoor environments. The system demonstrates strong potential for accurate 3D reconstructions and effectively distinguishes between close and distant objects, advancing JC&S applications for spatial sensing in future wireless networks.
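The abstract says the model uses multi-head attention over temporal CSI data. As a rough illustration of that mechanism only, the sketch below implements generic scaled dot-product self-attention in plain Python, with the feature dimension split evenly across heads. It is not the authors' implementation: the function names, the toy input, and the omission of learned query/key/value projections are all assumptions made to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention; returns (outputs, weights)."""
    dim = len(q[0])
    scores = [
        [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(dim) for kj in k]
        for qi in q
    ]
    weights = [softmax(r) for r in scores]
    outputs = [
        [sum(w * vj[c] for w, vj in zip(w_row, v)) for c in range(len(v[0]))]
        for w_row in weights
    ]
    return outputs, weights


def multi_head_self_attention(x, num_heads):
    """Split features into heads, attend per head, concatenate the results.

    Assumption: no learned projections, so queries, keys and values are
    simply the per-head slices of the input sequence.
    """
    dim = len(x[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    merged = [[] for _ in x]
    for h in range(num_heads):
        part = [row[h * head_dim:(h + 1) * head_dim] for row in x]
        out, _ = attention(part, part, part)
        for t, vec in enumerate(out):
            merged[t].extend(vec)
    return merged


# Toy input (hypothetical): 3 time slices, 4 features per slice.
x = [
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [1.0, 1.0, 0.0, 1.0],
]
y = multi_head_self_attention(x, num_heads=2)
_, w = attention(x, x, x)
```

Each output row is a weighted average of the input rows, so every time slice can draw on information from every other slice. A trained model would add learned projections, positional information, and feed-forward layers on top of this core operation.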

Paper Structure

This paper contains 17 sections, 5 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Magnitude (left) and unwrapped phase (right) of CSI data from a receiver antenna using the 802.11n WiFi standard at 5 GHz with 40 MHz bandwidth, across 114 subcarriers and 10 time slices. The plots illustrate the signal strength variations and phase shifts caused by environmental factors, which are critical for spatial and temporal channel analysis.
  • Figure 2: Architecture of the CSI2PointCloud Model, from input CSI data to 3D point cloud output, highlighting key stages of data transformation and encoding through transformer layers.
  • Figure 3: Comparison between the Ground Truth Point Cloud (left) and the Predicted Point Cloud (right) for Frame 7 of Subject S25 and Action A02. The ground truth shows the actual 3D structure, while the prediction captures general patterns with some differences in object positioning and density.
  • Figure 4: Predictions using the Room-Split Protocol. The upper part shows Subject S30 in Room 2 (E03), predicted by a model trained on Room 1 data. The lower part shows Subject S04 in Room 1 (E01), predicted by a model trained on Room 2 data.
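
Figure 1 plots the magnitude and unwrapped phase of CSI over 114 subcarriers and 10 time slices. The sketch below shows one way to derive those two features from complex CSI samples using only the Python standard library. The synthetic channel, the helper names `unwrap` and `csi_features`, and the choice to unwrap along the subcarrier axis are assumptions for illustration, not the paper's preprocessing pipeline.

```python
import cmath
import math


def unwrap(phases):
    """Remove 2*pi jumps between consecutive samples (same idea as numpy.unwrap)."""
    if not phases:
        return []
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        # Fold the step into [-pi, pi) so only the wrap-around jumps are removed.
        delta = (delta + math.pi) % (2 * math.pi) - math.pi
        out.append(out[-1] + delta)
    return out


def csi_features(csi):
    """Return (amplitude, unwrapped_phase) for CSI given as [time][subcarrier] complex values."""
    amplitude = [[abs(h) for h in time_slice] for time_slice in csi]
    phase = [unwrap([cmath.phase(h) for h in time_slice]) for time_slice in csi]
    return amplitude, phase


# Synthetic CSI (hypothetical values): 10 time slices x 114 subcarriers,
# with a gain that grows over time and a linear phase ramp across subcarriers.
NUM_SLICES, NUM_SUBCARRIERS = 10, 114
csi = [
    [(1.0 + 0.1 * t) * cmath.exp(1j * 0.3 * k) for k in range(NUM_SUBCARRIERS)]
    for t in range(NUM_SLICES)
]
amplitude, phase = csi_features(csi)
```

On this synthetic channel the raw phase wraps several times across the 114 subcarriers, while the unwrapped phase recovers the underlying linear ramp. Real CSI usually needs further phase cleaning, which is outside the scope of this sketch.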