An Efficient Interaction Human-AI Synergy System Bridging Visual Awareness and Large Language Model for Intensive Care Units

Yibowen Zhao, Yiming Cao, Zhiqi Shen, Juan Du, Yonghui Xu, Lizhen Cui, Cyril Leung

TL;DR

The paper addresses ICU data fragmentation and transcription errors by enabling non-invasive visual data capture from bedside monitors and a semantic, voice-driven interface for querying patient data. It introduces a cloud-edge-end architecture combining edge YOLOv5-based screen detection with OCR to produce structured FHIR records and a cloud-based LLM semantic engine for natural-language interactions. Key contributions include a fully integrated data pipeline, low-latency processing, and a multilingual GUI with a voice assistant that supports hands-free queries. The authors demonstrate the framework in a simulated ICU environment and outline plans for clinical trials to validate usability and impact, highlighting the system's potential to reduce cognitive load and improve patient safety.

Abstract

Intensive Care Units (ICUs) are critical environments characterized by high-stakes monitoring and complex data management. However, current practices often rely on manual data transcription and fragmented information systems, introducing potential risks to patient safety and operational efficiency. To address these issues, we propose a human-AI synergy system based on a cloud-edge-end architecture, which integrates visual-aware data extraction and semantic interaction mechanisms. Specifically, a visual-aware edge module non-invasively captures real-time physiological data from bedside monitors, reducing manual entry errors. To improve accessibility to fragmented data sources, a semantic interaction module, powered by a Large Language Model (LLM), enables physicians to perform efficient and intuitive voice-based queries over structured patient data. The hierarchical cloud-edge-end deployment ensures low-latency communication and scalable system performance. Our system reduces the cognitive burden on ICU nurses and physicians and demonstrates promising potential for broader applications in intelligent healthcare systems.
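
The step from on-screen readings to structured records can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `to_fhir_observation`, the label set, and the patient identifier are hypothetical, and the OCR output is assumed to arrive as a (label, text) pair. The dictionary layout follows the general shape of a FHIR Observation resource, with LOINC codes and UCUM units chosen here for illustration.

```python
import math
from datetime import datetime, timezone

# Hypothetical mapping from monitor labels to (LOINC code, display name, UCUM unit).
VITAL_CODES = {
    "HR": ("8867-4", "Heart rate", "/min"),
    "RR": ("9279-1", "Respiratory rate", "/min"),
    "SpO2": ("59408-5", "Oxygen saturation in Arterial blood by Pulse oximetry", "%"),
}


def to_fhir_observation(label, raw_text, patient_id, when=None):
    """Turn one OCR'd reading into a FHIR-style Observation dict.

    Returns None when the label is unknown or the text is not a finite number,
    so a misread value is dropped instead of being stored.
    """
    if label not in VITAL_CODES:
        return None
    try:
        value = float(raw_text.strip())
    except ValueError:
        return None
    if not math.isfinite(value):
        return None

    loinc, display, unit = VITAL_CODES[label]
    when = when or datetime.now(timezone.utc)
    return {
        "resourceType": "Observation",
        "status": "preliminary",  # assumption: not yet verified by a clinician
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": loinc, "display": display}
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": when.isoformat(),
        "valueQuantity": {
            "value": value,
            "unit": unit,
            "system": "http://unitsofmeasure.org",
            "code": unit,
        },
    }


if __name__ == "__main__":
    # Example: a heart-rate reading recognized on a bedside monitor screen.
    print(to_fhir_observation("HR", " 72 ", "demo-001"))
```

Rejecting unparseable text at this stage is a design choice of the sketch: a reading such as "7Z" is discarded here, whereas a production system would more likely flag it for review.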

Paper Structure

This paper contains 14 sections, 5 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview of the Proposed Human-AI Synergy System for ICU Monitoring. The system assists nurses by automatically capturing physiological data from bedside monitors via non-invasive screen recognition, and enhances physician interaction by enabling semantic-level query using a large language model interface.
  • Figure 2: Architecture of Visual-Aware ICU Device Detection Module.
  • Figure 3: OCR Pipeline for ICU Screen Text Digitization and FHIR Structuring.
  • Figure 4: System-Level Architecture of Cloud-Edge-End Human-AI Synergy Platform.
  • Figure 5: Integrated simulation and bilingual GUI environment for ICU monitoring, combining device interface capture with English and Chinese clinical dashboards.