AURA: Development and Validation of an Augmented Unplanned Removal Alert System using Synthetic ICU Videos

Junhyuk Seo, Hyeyoon Moon, Kyu-Hwan Jung, Namkee Oh, Taerim Kim

TL;DR

Unplanned extubation (UE) poses severe safety risks in ICUs, but real-time video-based monitoring is constrained by privacy concerns around ICU footage. The authors present AURA, a privacy-preserving UE risk detector developed entirely on synthetic ICU videos produced via text-to-video diffusion, using pose estimation to identify collision near airway tubes and agitation via keypoint velocity. The approach achieves near-perfect collision detection ($F1=0.98$) and solid agitation performance ($F1=0.78$), validated by expert clinicians with high reliability, and demonstrates robustness to scale and parameter perturbations. This work shows that synthetic data can enable scalable, reproducible, and privacy-preserving vision-based patient safety monitoring, offering a blueprint for extending to other devices and clinical contexts in ICUs.
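For reference, the $F1$ values quoted above follow the standard definition as the harmonic mean of precision $P$ and recall $R$, where $TP$, $FP$, and $FN$ count true-positive, false-positive, and false-negative alarms. This summary does not state the unit of evaluation (for example, per video or per frame), so that is left open here:

$$P = \frac{TP}{TP+FP}, \qquad R = \frac{TP}{TP+FN}, \qquad F1 = \frac{2PR}{P+R} = \frac{2\,TP}{2\,TP+FP+FN}$$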

Abstract

Unplanned extubation (UE) remains a critical patient safety concern in intensive care units (ICUs), often leading to severe complications or death. Real-time UE detection has been limited, largely due to the ethical and privacy challenges of obtaining annotated ICU video data. We propose Augmented Unplanned Removal Alert (AURA), a vision-based risk detection system developed and validated entirely on a fully synthetic video dataset. By leveraging text-to-video diffusion, we generated diverse and clinically realistic ICU scenarios capturing a range of patient behaviors and care contexts. The system applies pose estimation to identify two high-risk movement patterns: collision, defined as hand entry into spatial zones near airway tubes, and agitation, quantified by the velocity of tracked anatomical keypoints. Expert assessments confirmed the realism of the synthetic data, and performance evaluations showed high accuracy for collision detection and moderate performance for agitation recognition. This work demonstrates a novel pathway for developing privacy-preserving, reproducible patient safety monitoring systems with potential for deployment in intensive care settings.
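The two movement patterns described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keypoint names (`mouth`, `left_wrist`, `right_wrist`), the circular zone around the mouth, the pixel units, and both thresholds are assumptions made for illustration, and the original system may normalize distances or define its zones differently.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Frame = Dict[str, Point]  # hypothetical format: keypoint name -> (x, y) in pixels


def collision(frame: Frame, zone_radius: float) -> bool:
    """Return True if either wrist lies inside a circular zone around the mouth.

    The circular zone and the keypoint names are assumptions for this sketch.
    """
    mouth = frame["mouth"]
    return any(
        math.dist(frame[name], mouth) <= zone_radius
        for name in ("left_wrist", "right_wrist")
    )


def mean_keypoint_speed(prev: Frame, curr: Frame, fps: float) -> float:
    """Mean speed (pixels per second) of keypoints present in both frames."""
    shared = prev.keys() & curr.keys()
    if not shared:
        return 0.0
    total = sum(math.dist(prev[name], curr[name]) for name in shared)
    return total * fps / len(shared)


def detect(
    frames: List[Frame],
    fps: float,
    zone_radius: float,
    speed_threshold: float,
) -> List[Tuple[bool, bool]]:
    """Return one (collision, agitation) flag pair per frame.

    The first frame has no predecessor, so its agitation flag is False.
    """
    flags = []
    for i, frame in enumerate(frames):
        agitated = (
            i > 0
            and mean_keypoint_speed(frames[i - 1], frame, fps) > speed_threshold
        )
        flags.append((collision(frame, zone_radius), agitated))
    return flags
```

In practice the `Frame` dictionaries would come from a pose estimator run on each video frame. A fixed pixel radius is sensitive to camera distance, so scaling `zone_radius` by a body-size reference (for example, shoulder width) would be one way to handle scale changes; whether AURA does this is not stated in this summary.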

Paper Structure

This paper contains 23 sections, 9 equations, 5 figures, 10 tables.

Figures (5)

  • Figure 1: Overview of the AURA development pipeline. The framework consists of three stages: (1) Synthetic video generation: 75 ICU videos generated using a text-to-video model and refined through heuristic evaluation; (2) System development: pose-estimation-based detection of collision and agitation behaviors, with model tuning (n=12) and testing (n=63); (3) System assessment: nine experts conducted video- and system-level evaluation, establishing ground truth for subsequent performance evaluation.
  • Figure 2: Overview of the AURA synthetic video dataset generation. Prompts are refined via exploratory generation and heuristic expert evaluation, then used for batch video generation. Outputs pass two screening stages and only accepted videos form the final dataset (solid = pass; dashed = feedback/fail). This integrated human–AI workflow ensures realism and feasibility while preserving reproducibility.
  • Figure 3: Examples of overlays for collision (left) and agitation (right) in synthetic intensive care unit videos. Colored “auras” around hands and mouth indicate risk zones (green = normal; red = collision). Numeric overlays show hand proximity (LH, RH) and movement velocity (VEL).
  • Figure 4: Prompt template for synthetic ICU video generation. Video generation was performed using an expert-designed prompt template, which was iteratively refined through sample generation and heuristic evaluation by an experienced ICU nurse.
  • Figure 5: Web-based evaluation interface used by ICU nurses. The interface displays each synthetic ICU video with pose-estimated overlays (center), provides structured rating criteria for video quality and alarm appropriateness (right), and allows navigation across all 63 evaluation items (left). Before entering the evaluation page, experts received a brief orientation explaining the system and rating procedure. After the video-level assessment, a system-level evaluation page was presented. This interface was used to collect expert assessments for validating video quality and overall system performance.