WaLi: Can Pressure Sensors in HVAC Systems Capture Human Speech?

Tarikul Islam Tamiti, Biraj Joshi, Rida Hasan, Anomadarshi Barua

TL;DR

WaLi reconstructs intelligible speech from low-resolution, noisy pressure sensor data sampled at as little as 0.5 kHz, whereas previous work can only detect hot words/phrases.

Abstract

Pressure sensors are an integral component of modern Heating, Ventilation, and Air Conditioning (HVAC) systems. Because these pressure sensors operate within the 0-10 Pa range, support high sampling frequencies of 0.5-2 kHz, and are often placed in close proximity to people, they can be used to eavesdrop on confidential speech: human speech has a similar audible range of 0-10 Pa and a bandwidth of 4 kHz for intelligible quality. This paper presents WaLi, which reconstructs intelligible speech from low-resolution, noisy pressure sensor data with the following technical contributions: (i) WaLi reconstructs intelligible speech from pressure sensors sampled at as little as 0.5 kHz, whereas previous work can only detect hot words/phrases. To do so, WaLi uses a complex-valued conformer and a Complex Global Attention Block (CGAB) to capture the inter-phoneme and intra-phoneme dependencies that exist in the low-resolution pressure sensor data. (ii) WaLi handles the transient noise injected by HVAC fans and duct vibrations by reconstructing both the clean magnitude and the phase of the missing frequencies of the low-frequency aliased components. We evaluate our attack on practical HVAC systems located in two anonymous industrial facilities. Extensive studies on real-world pressure sensors show a log-spectral distance (LSD) of 1.24 and an NISQA-MOS of 1.78 for 0.5 kHz to 8 kHz upsampling. We believe that this level of accuracy poses a significant privacy threat that has not previously been addressed for pressure sensors. We also provide defenses against the attack.
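To make the "low-frequency aliased components" mentioned above more concrete, the following minimal sketch shows where a pure tone would appear after sampling at 0.5 kHz. It assumes ideal sampling with no anti-aliasing filter in front of the sensor, which is an assumption for illustration and not something the abstract states. The function name `alias_frequency` and the chosen tone frequencies are hypothetical.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a pure tone at f_hz after ideal sampling at fs_hz.

    Assumes no anti-aliasing filter (illustrative assumption only).
    """
    folded = f_hz % fs_hz                # wrap into [0, fs)
    return min(folded, fs_hz - folded)   # fold into [0, fs/2]


fs = 500.0          # lowest sampling frequency named in the abstract (0.5 kHz)
nyquist = fs / 2    # tones above this cannot be represented directly

# Hypothetical tone frequencies inside the 4 kHz speech bandwidth.
for tone in (100.0, 300.0, 1000.0, 3900.0):
    print(f"{tone:7.1f} Hz tone -> appears at {alias_frequency(tone, fs):6.1f} Hz "
          f"(Nyquist = {nyquist:.0f} Hz)")
```

Under this assumption, every speech component above 250 Hz folds back into the 0-250 Hz band and overlaps with the genuine low-frequency content, which is why recovering 4 kHz of speech bandwidth from such data requires a learned reconstruction rather than simple interpolation.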

Paper Structure

This paper contains 41 sections, 9 equations, 15 figures, 13 tables.

Figures (15)

  • Figure 1: (Left) Internals of a DPS. (Right) The accessories connected with DPSs (part# SETRA264) [setra264_datasheet] in a real-world HVAC system.
  • Figure 2: (Left) Clean speech captured by SDP 1108 in the absence of noise. (Right) Noisy speech captured by SDP 1108 in the presence of transient noise.
  • Figure 3: (Left) A brief overview of the threat model - WaLi. The victim is unknowingly talking close to the pressure pickup device.
  • Figure 4: (Left & Middle) Pressure ports are located at hallway entrance and corridor of a cleanroom, and (Right) inside rooms of an industrial facility.
  • Figure 5: WaLi has complex-valued encoders, decoders, complex-valued skip blocks, CGAB, and complex multiresolution STFT loss.
  • ...and 10 more figures