Deep-TEMPEST: Using Deep Learning to Eavesdrop on HDMI from its Unintended Electromagnetic Emanations

Santiago Fernández, Emilio Martínez, Gabriel Varela, Pablo Musé, Federico Larroca

TL;DR

This work treats HDMI TEMPEST eavesdropping as an inverse problem and trains a deep CNN (DRUNet) to recover the displayed image directly from the complex baseband samples captured by a software-defined radio (SDR), rather than from a traditional AM demodulation. An analytical forward model of the HDMI emissions is derived and used both to choose the tuning frequency and to generate simulated training samples. On text, the system improves the average Character Error Rate (CER) by over 60 percentage points compared to previously available implementations, using an open-source dataset of ~3500 samples (simulated and real). The code and dataset are publicly released, and the paper also discusses countermeasures to reduce leakage.

Abstract

In this work, we address the problem of eavesdropping on digital video displays by analyzing the electromagnetic waves that unintentionally emanate from the cables and connectors, particularly HDMI. This problem is known as TEMPEST. Compared to the analog case (VGA), the digital case is harder due to a 10-bit encoding that results in a much larger bandwidth and a non-linear mapping between the observed signal and the pixel's intensity. As a result, eavesdropping systems designed for the analog case obtain unclear and difficult-to-read images when applied to digital video. The proposed solution is to recast the problem as an inverse problem and train a deep learning module to map the observed electromagnetic signal back to the displayed image. However, this approach still requires a detailed mathematical analysis of the signal, first to determine the frequency at which to tune, and second to produce training samples by simulation, without needing a real TEMPEST setup. This saves time and avoids having to capture such samples, especially if several configurations are being considered. Our focus is on improving the average Character Error Rate in text, and our system improves this rate by over 60 percentage points compared to previously available implementations. The proposed system is based on widely available Software Defined Radio and is fully open-source, seamlessly integrated into the popular GNU Radio framework. We also share the dataset we generated for training, which comprises both simulated and over 1000 real captures. Finally, we discuss some countermeasures to minimize the potential risk of being eavesdropped by systems designed based on similar principles.
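
Since the headline metric is the Character Error Rate, a minimal sketch of how such a rate is commonly computed may help. It assumes the usual definition (Levenshtein edit distance between the recognized text and the reference, divided by the reference length); the abstract does not spell out the exact definition or the OCR pipeline used, and the function names `levenshtein` and `character_error_rate` are hypothetical.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    prev = list(range(len(hyp) + 1))           # distances for the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]                             # deleting i characters of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost  # substitution or match
                            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Assumed definition: edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One wrong character out of four.
    print(character_error_rate("HDMI", "HDNI"))
```

Under this definition, an improvement "in percentage points" is a difference of rates: a drop from a CER of 0.9 to 0.3 would be a 60-percentage-point improvement (illustrative numbers, not results from the paper).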

Paper Structure

This paper contains 16 sections, 7 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Proposed system. The HDMI cable and connectors emit unintended electromagnetic signals, which are captured by the SDR and processed by gr-tempest, obtaining a degraded complex-valued image, which in turn is fed to a convolutional neural network to infer the source image. All three images correspond to actual results.
  • Figure 2: An illustration of the transmission of a frame on a single TMDS channel. The red arrow indicates the order in which the signal is transmitted. Video is actually sent only during the video data periods.
  • Figure 3: The power spectral density of a TMDS-encoded signal, computed by multiplying an estimate of $S_{X_b}(f)$ by $|Q(f)|^2/T_b$ (the dashed red curve, shown for reference); cf. Eq. \ref{eq:psd_posta}. Both curves are normalized to their maximum values for clarity. Significant spikes at every multiple of $0.1/T_b$ are clearly visible. In the zoom-in around $f=0.3/T_b$ shown below, smaller but nevertheless important spikes at every multiple of $1/(P_xT_p)$ (the inverse of the duration of each horizontal line) are also clearly visible.
  • Figure 4: Diagram of an SDR. The drivers provide complex samples $y[l]$ whose real and imaginary parts correspond to the in-phase and quadrature components.
  • Figure 5: Normalized Fourier Transform of $q(t)$ (i.e., Eq. \ref{eq:ejemplo_q} with $\epsilon=0.002$) and of $g(t)$, the complex baseband representation of the channel as seen by the SDR.
  • ...and 7 more figures
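
The spikes at multiples of $0.1/T_b$ mentioned in the Figure 3 caption can be illustrated with a toy simulation. The sketch below is an illustration under stated assumptions, not the paper's derivation: it uses two made-up 10-bit code words (`WORD_A` and `WORD_B`, which are not the real TMDS code) chosen at random for each pixel. Because the average word is non-zero, the bit stream contains a component that repeats every 10 bits, which would show up as spectral lines at multiples of one tenth of the bit rate.

```python
import numpy as np

# Hypothetical 10-bit code words (NOT the real TMDS code), written as 0/1 bits.
WORD_A = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
WORD_B = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]


def symbol_stream_periodogram(n_symbols=4096, seed=0):
    """Periodogram of a +/-1 bit stream built from random 10-bit words.

    Frequencies are returned in cycles per bit period, i.e. as f * T_b.
    """
    rng = np.random.default_rng(seed)
    codebook = 2.0 * np.array([WORD_A, WORD_B]) - 1.0   # map {0, 1} -> {-1, +1}
    choices = rng.integers(0, len(codebook), size=n_symbols)
    bits = codebook[choices].ravel()                     # one sample per bit
    power = np.abs(np.fft.rfft(bits)) ** 2 / bits.size
    freqs = np.fft.rfftfreq(bits.size, d=1.0)
    return freqs, power


freqs, power = symbol_stream_periodogram()
strongest = freqs[np.argmax(power)]
print(f"strongest spectral line at f*T_b = {strongest:.3f}")
```

The toy model uses one sample per bit and ignores the pulse shape $q(t)$, so it only reproduces the line structure, not the $|Q(f)|^2/T_b$ envelope shown in the figure.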