CardioLive: Empowering Video Streaming with Online Cardiac Monitoring

Sheng Lyu, Ruiming Huang, Sijie Ji, Yasar Abbas Ur Rehman, Lan Ma, Chenshu Wu

TL;DR

CardioLive is an online cardiac monitoring system that fuses the coexisting video and audio streams to infer heart rate during real-time video streaming. Its core is CardioNet, an audio-visual network whose video branch uses temporal-differencing and frequency-aware modules and whose audio branch processes raw waveforms with learnable temporal-frequency filters; the two branches are fused through a multi-head attention mechanism. The system is designed as plug-and-play middleware that can run in edge or cloud environments, with buffering and synchronization to handle changing FPS and unsynchronized streams. Empirical results show CardioLive achieving an average MAE of 1.79 BPM, outperforming the video-only and audio-only baselines by 69.2% and 81.2%, respectively. It also delivers real-time throughput (115.97 FPS on Zoom and 98.16 FPS on YouTube) with modest latency, demonstrating practical viability for health, affective computing, and security applications on streaming platforms.
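As a rough illustration of what handling a changing FPS can involve, the sketch below resamples irregularly timestamped per-frame values onto a fixed-rate grid by linear interpolation. This is an assumption made for illustration, not the buffering or synchronization scheme CardioLive actually uses; the function name `resample_uniform` and its parameters are hypothetical.

```python
from bisect import bisect_right


def resample_uniform(timestamps, values, target_fps):
    """Linearly interpolate irregular samples onto a fixed-rate time grid.

    Hypothetical helper: `timestamps` are seconds, strictly increasing,
    and `values` holds one scalar per frame (e.g. a mean skin-pixel value).
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two (timestamp, value) pairs")
    step = 1.0 / target_fps
    start, end = timestamps[0], timestamps[-1]
    # Number of grid points that fit in [start, end]; the small epsilon
    # guards against floating-point error in the division.
    n = int((end - start) / step + 1e-9) + 1
    out = []
    for k in range(n):
        t = start + k * step
        # Index of the first timestamp strictly greater than t, clamped so
        # that (j - 1, j) is always a valid bracketing pair of samples.
        j = min(max(bisect_right(timestamps, t), 1), len(timestamps) - 1)
        t0, t1 = timestamps[j - 1], timestamps[j]
        w = (t - t0) / (t1 - t0)
        out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out


# Frames that arrived at 2 FPS, resampled to a steady 4 FPS grid.
uniform = resample_uniform([0.0, 0.5, 1.0], [0.0, 1.0, 2.0], target_fps=4)
```

A fixed-rate series like `uniform` is what frequency-domain processing typically expects, since a frequency in Hz only maps to BPM when the sampling rate is known and constant.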

Abstract

Online Cardiac Monitoring (OCM) is emerging as a compelling enhancement for next-generation video streaming platforms. It enables applications including remote health, online affective computing, and deepfake detection. Yet the physiological information encapsulated in video streams has long been neglected. In this paper, we present the design and implementation of CardioLive, the first online cardiac monitoring system for video streaming platforms. We leverage the naturally coexisting video and audio streams and devise CardioNet, the first audio-visual network to learn the cardiac series. It incorporates multiple unique designs to extract temporal and spectral features, ensuring robust performance under realistic video streaming conditions. To enable Service-On-Demand online cardiac monitoring, we implement CardioLive as a plug-and-play middleware service and develop systematic solutions to practical issues including changing FPS and unsynchronized streams. Extensive experiments demonstrate the effectiveness of our system. We achieve a Mean Absolute Error (MAE) of 1.79 BPM, outperforming the video-only and audio-only solutions by 69.2% and 81.2%, respectively. Our CardioLive service achieves average throughputs of 115.97 and 98.16 FPS when implemented in Zoom and YouTube, respectively. We believe our work opens up new applications for video streaming systems. We will release the code soon.
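For readers unfamiliar with the metric, the following minimal sketch shows how an MAE in BPM (read here as mean absolute error, as the abbreviation indicates) and a relative improvement over a baseline can be computed. The heart-rate values are made up for illustration and are not data from the paper. Reading the reported percentages as relative reductions in MAE is also an assumption, and both function names are hypothetical.

```python
def mean_absolute_error(predicted_bpm, reference_bpm):
    """Average absolute difference between predicted and reference heart rates."""
    if len(predicted_bpm) != len(reference_bpm) or not predicted_bpm:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted_bpm, reference_bpm))
    return total / len(predicted_bpm)


def relative_reduction(baseline_mae, fused_mae):
    """Percentage by which the fused model's MAE is lower than a baseline's."""
    return 100.0 * (baseline_mae - fused_mae) / baseline_mae


# Made-up example values, not results from the paper.
predicted = [72.0, 80.5, 65.0]
reference = [70.0, 82.0, 66.0]
mae = mean_absolute_error(predicted, reference)        # (2.0 + 1.5 + 1.0) / 3
gain = relative_reduction(baseline_mae=6.0, fused_mae=mae)
```

Under this reading, a lower MAE is better, and the percentage expresses how much of a single-modality baseline's error the fused model removes.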

Paper Structure

This paper contains 22 sections, 7 equations, 23 figures, 2 tables.

Figures (23)

  • Figure 1: Online Cardiac Monitoring (OCM).
  • Figure 2: Kinetics of Cardiac Learning.
  • Figure 3: The performances of video-based approaches vary under different body movements and light conditions.
  • Figure 4: Overall Illustration of CardioNet.
  • Figure 5: Video Encoder.
  • ...and 18 more figures