FAWN: A MultiEncoder Fusion-Attention Wave Network for Integrated Sensing and Communication Indoor Scene Inference

Carlos Barroso-Fernández, Alejandro Calvillo-Fernandez, Antonio de la Oliva, Carlos J. Bernardos

TL;DR

FAWN introduces a multi-encoder fusion-attention network to perform ISAC-based indoor scene inference by merging Wi-Fi and 5G passive sensing signals without disrupting ongoing communications. The architecture uses separate encoders for each technology, a self-attention fusion module, and a lightweight decoder to predict object presence and discretized locations, achieving high accuracy and sub-meter localization in challenging indoor scenarios. Experimental results in a warehouse testbed demonstrate that FAWN outperforms single-technology baselines and ablations, with about 84% of cases achieving localization errors below 0.6 m. The work highlights the practicality of cross-technology fusion for robust, low-cost environmental perception and outlines future improvements such as task-specific decoders and temporal modeling for dynamic environments.

Abstract

The upcoming generations of wireless technologies promise an era where everything is interconnected and intelligent. As the need for intelligence grows, networks must learn to better understand the physical world. However, deploying dedicated hardware to perceive the environment is not always feasible, mainly because of cost and complexity. Integrated Sensing and Communication (ISAC) is a step forward in addressing this challenge. Within ISAC, passive sensing emerges as a cost-effective solution that reuses wireless communication signals to sense the environment without interfering with existing communications. Nevertheless, most current solutions are limited to a single technology (mostly Wi-Fi or 5G), which constrains the maximum achievable accuracy. Because different technologies operate in different parts of the spectrum, we see a need to integrate more than one technology to extend the coverage area. Hence, we take advantage of ISAC passive sensing to present FAWN, a MultiEncoder Fusion-Attention Wave Network for ISAC indoor scene inference. FAWN is based on the original transformer architecture and fuses information from Wi-Fi and 5G, making the network capable of understanding the physical world without interfering with ongoing communications. To test our solution, we built a prototype and integrated it in a real scenario. Results show localization errors below 0.6 m in around 84% of cases.
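To make the fusion idea concrete, the following is a minimal sketch of scaled dot-product self-attention applied to one embedding per technology. It is not the FAWN implementation: the function names (`self_attention`, `fuse`), the embedding values, the identity query/key/value projections, and the mean pooling are all assumptions chosen for illustration. A trained network would learn those projections, and the decoder is omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
        )
    return attended


def fuse(wifi_embedding, nr_embedding):
    """Attend across the two technology tokens, then mean-pool them.

    Hypothetical helper: the pooling choice is an assumption of this sketch.
    """
    attended = self_attention([wifi_embedding, nr_embedding])
    dim = len(wifi_embedding)
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(dim)]


# Made-up encoder outputs, one vector per technology (not measured data).
wifi = [1.0, 0.0, 0.5, -0.5]
nr5g = [0.0, 1.0, 0.5, 0.5]
fused = fuse(wifi, nr5g)
print(fused)
```

Each attended token is a weighted average of the Wi-Fi and 5G embeddings, so the fused vector stays within the range spanned by the two inputs in every dimension. A decoder would then map such a fused vector to object presence and a discretized location.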

Paper Structure

This paper contains 15 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Scheme of the indoor laboratory room. Wi-Fi and 5G signals are captured by passive receivers to sense the environment.
  • Figure 2: Flow of how the sensing data is extracted from the real environment (top left picture) using the USRPs acting as passive receivers of 5G and Wi-Fi beacons. FAWN (light purple background) then uses this data to infer where the person and the robot are (top right). The bottom part depicts the layers of the three building blocks of FAWN: 1) the encoders, 2) the sensor fusion, and 3) the decoder.
  • Figure 3: ECDFs of the position error in meters.
  • Figure 4: Heatmap of the mean error in the different locations of the laboratory room for FAWN.
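
As a reading aid for the ECDF in Figure 3, the sketch below shows how an empirical CDF of position errors is computed and how a statement such as "errors below 0.6 m in around 84% of cases" is read off it. The helper names (`ecdf_at`, `ecdf_points`) and the error values are hypothetical; the numbers are synthetic and are not the paper's measurements.

```python
def ecdf_at(errors, threshold):
    """Fraction of samples whose error is at most `threshold`."""
    if not errors:
        raise ValueError("errors must be non-empty")
    return sum(1 for e in errors if e <= threshold) / len(errors)


def ecdf_points(errors):
    """Sorted (error, cumulative fraction) pairs, one per sample."""
    ordered = sorted(errors)
    count = len(ordered)
    return [(value, (rank + 1) / count) for rank, value in enumerate(ordered)]


# Synthetic position errors in meters, for illustration only.
errors_m = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.55, 0.7, 0.9, 1.4]

share = ecdf_at(errors_m, 0.6)
print(f"{share:.0%} of samples have an error of at most 0.6 m")
```

With these synthetic values, 7 of the 10 samples fall at or below 0.6 m, so the ECDF evaluated at 0.6 m is 0.7. Plotting the pairs returned by `ecdf_points` as a step function gives a curve of the kind shown in Figure 3.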