Real-Time Mobile Video Analytics for Pre-arrival Emergency Medical Services

Liuyi Jin, Amran Haroon, Radu Stoleru, Pasan Gunawardena, Michael Middleton, Jeeeun Kim

TL;DR

TeleEMS tackles the bottlenecks of traditional EMS data flows by introducing a mobile, edge-based system that fuses audio, video, and text into four actionable pre-arrival decisions. It combines EMS-Stream for open, multi-party streaming, EMSLlama for real-time symptom extraction, and PreNet for multimodal inference to predict EMS protocols, medication types, medication quantities, and procedures. The approach is validated through a data-driven pipeline built from NEMSIS and synthetic video datasets; EMSLlama outperforms GPT-4o on exact match (0.89 vs. 0.57), and the system offers determinism and edge resilience. The work promises disaster-resilient, scalable, next-generation pre-arrival EMS infrastructure that can bridge bystanders, dispatchers, and EMTs with timely, lifesaving guidance.

Abstract

Timely and accurate pre-arrival video streaming and analytics are critical for emergency medical services (EMS) to deliver life-saving interventions. Yet, current-generation EMS infrastructure remains constrained by one-to-one video streaming and limited analytics capabilities, leaving dispatchers and EMTs to manually interpret overwhelming, often noisy or redundant information in high-stress environments. We present TeleEMS, a mobile live video analytics system that enables pre-arrival multimodal inference by fusing audio and video into a unified decision-making pipeline before EMTs arrive on scene. TeleEMS comprises two key components: TeleEMS Client and TeleEMS Server. The TeleEMS Client runs across phones, smart glasses, and desktops to support bystanders, EMTs en route, and 911 dispatchers. The TeleEMS Server, deployed at the edge, integrates EMS-Stream, a communication backbone that enables smooth multi-party video streaming. On top of EMS-Stream, the server hosts three real-time analytics modules: (1) audio-to-symptom analytics via EMSLlama, a domain-specialized LLM for robust symptom extraction and normalization; (2) video-to-vital analytics using state-of-the-art rPPG methods for heart rate estimation; and (3) joint text-vital analytics via PreNet, a multimodal multitask model predicting EMS protocols, medication types, medication quantities, and procedures. Evaluation shows that EMSLlama outperforms GPT-4o (exact-match 0.89 vs. 0.57) and that text-vital fusion improves inference robustness, enabling reliable pre-arrival intervention recommendations. TeleEMS demonstrates the potential of mobile live video analytics to transform EMS operations, bridging the gap between bystanders, dispatchers, and EMTs, and paving the way for next-generation intelligent EMS infrastructure.
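
To make the exact-match figures above concrete, the following is a minimal sketch of how such a score could be computed over extracted symptom sets. The abstract does not define the metric, so the definition used here is an assumption: a sample counts as a match only when the normalized set of predicted symptoms equals the normalized reference set. The names `normalize` and `exact_match` are hypothetical and do not come from the paper.

```python
from typing import List


def normalize(symptom: str) -> str:
    """Lowercase a symptom string and collapse runs of whitespace (assumed normalization)."""
    return " ".join(symptom.lower().split())


def exact_match(predicted: List[List[str]], reference: List[List[str]]) -> float:
    """Fraction of samples whose predicted symptom set equals the reference set.

    Assumption: order and duplicates are ignored, so symptoms are compared as sets.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same number of samples")
    if not reference:
        return 0.0
    hits = sum(
        {normalize(s) for s in pred} == {normalize(s) for s in ref}
        for pred, ref in zip(predicted, reference)
    )
    return hits / len(reference)


if __name__ == "__main__":
    # Illustrative data only; not taken from the paper's evaluation set.
    predicted = [["Chest Pain", "nausea"], ["fever"]]
    reference = [["nausea", "chest  pain"], ["cough"]]
    print(exact_match(predicted, reference))
```

Under this assumed definition, a partially correct extraction scores zero for that sample, which is why exact match is a stricter measure than per-symptom precision or recall.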

Paper Structure

This paper contains 29 sections, 3 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: TeleEMS application scenario example. The TeleEMS Server runs in the data center or on host machines of the cellular core network, connecting three TeleEMS Clients: a bystander witnessing an unconscious patient, a 911 operator, and a dispatched EMT in the ambulance.
  • Figure 2: Overview of TeleEMS system architecture.
  • Figure 3: EMS-Stream includes a WebRTC gateway and RTP-to-video/audio converters. Inside the WebRTC gateway, a Janus server with a public local IP runs two processes, one for the video room and one for the text room.
  • Figure 4: Example of using NEMSIS to prepare pre-arrival symptom texts and vitals used in TeleEMS.
  • Figure 5: Speech-to-text and symptom extractor pipeline in TeleEMS (left) and corresponding symptoms in the EMT-bystander conversation (right).
  • ...and 5 more figures