Spatio-Temporal 3D Point Clouds from WiFi-CSI Data via Transformer Networks
Tuomas Määttä, Sasan Sharifipour, Miguel Bordallo López, Constantino Álvarez Casado
TL;DR
This work presents a transformer-based architecture that processes temporal Channel State Information data, specifically amplitude and phase, to generate 3D point clouds of indoor environments and demonstrates strong potential for accurate 3D reconstructions and effectively distinguishes between close and distant objects, advancing JC\&S applications for spatial sensing in future wireless networks.
Abstract
Joint communication and sensing (JC\&S) is emerging as a key component in 5G and 6G networks, enabling dynamic adaptation to environmental changes and enhancing contextual awareness for optimized communication. By leveraging real-time environmental data, JC\&S improves resource allocation, reduces latency, and enhances power efficiency, while also supporting simulations and predictive modeling. This makes it a key technology for reactive systems and digital twins. These systems can respond to environmental events in real-time, offering transformative potential in sectors like smart cities, healthcare, and Industry 5.0, where adaptive and multimodal interaction is critical to enhance real-time decision-making. In this work, we present a transformer-based architecture that processes temporal Channel State Information (CSI) data, specifically amplitude and phase, to generate 3D point clouds of indoor environments. The model utilizes a multi-head attention to capture complex spatio-temporal relationships in CSI data and is adaptable to different CSI configurations. We evaluate the architecture on the MM-Fi dataset, using two different protocols to capture human presence in indoor environments. The system demonstrates strong potential for accurate 3D reconstructions and effectively distinguishes between close and distant objects, advancing JC\&S applications for spatial sensing in future wireless networks.
