When do they StOP?: A First Step Towards Automatically Identifying Team Communication in the Operating Room

Keqi Chen, Lilien Schewski, Vinkle Srivastav, Joël Lavanchy, Didier Mutter, Guido Beldi, Sandra Keller, Nicolas Padoy

TL;DR

This paper tackles automatic identification of team communication in the operating room by introducing Team-OR, a real-surgery, multi-view dataset annotated for Time-out and StOP?-protocols. It proposes a multimodal group-activity detector that fuses global scene features with skeleton-based cues through a lightweight network with dual pooling branches to localize protocol timings in untrimmed videos. The approach achieves state-of-the-art results on Team-OR, with Time-out detection performing exceptionally well and StOP? remaining more challenging, while delivering real-time inference (~33 FPS). This work provides the first dataset and baseline methods for automatic analysis of OR team interactions, enabling advances in computer-assisted surgical workflow and patient safety.

Abstract

Purpose: Surgical performance depends not only on surgeons' technical skills but also on team communication within and across the different professional groups present during the operation. Therefore, automatically identifying team communication in the OR is crucial for patient safety and for the development of computer-assisted surgical workflow analysis and intra-operative support systems. As a first step, we propose a new task of detecting communication briefings involving all OR team members, i.e., the team Time-out and the StOP?-protocol, by localizing their start and end times in video recordings of surgical operations.

Methods: We generate an OR dataset of real surgeries, called Team-OR, with more than one hundred hours of surgical videos captured by a multi-view camera system in the OR. The dataset contains temporal annotations of 33 Time-out and 22 StOP?-protocol activities in total. We then propose a novel group activity detection approach, where we encode both scene context and action features, and use an efficient neural network model to output the results.

Results: The experimental results on the Team-OR dataset show that our approach outperforms existing state-of-the-art temporal action detection approaches. They also highlight the lack of research on group activities in the OR, underscoring the significance of our dataset.

Conclusion: We investigate the team Time-out and the StOP?-protocol in the OR by presenting the first OR dataset with temporal annotations of group activity protocols and by introducing a novel group activity detection approach that outperforms existing approaches. Code is available at https://github.com/CAMMA-public/Team-OR.
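
The task described above is to localize the start and end time of each briefing. Temporal action detection results are commonly scored by the temporal intersection-over-union (tIoU) between a predicted and an annotated interval. The snippet below is a minimal sketch of that measure, assuming intervals given as (start, end) pairs in seconds. The function name and the example timestamps are hypothetical and are not taken from the paper or its repository.

```python
def temporal_iou(pred, truth):
    """Intersection-over-union of two (start, end) intervals, in seconds."""
    (pred_start, pred_end), (true_start, true_end) = pred, truth
    # Overlap is zero when the intervals do not intersect.
    inter = max(0.0, min(pred_end, true_end) - max(pred_start, true_start))
    union = (pred_end - pred_start) + (true_end - true_start) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: a Time-out annotated from 120 s to 180 s,
# and a prediction from 130 s to 190 s (50 s overlap over a 70 s union).
print(temporal_iou((130.0, 190.0), (120.0, 180.0)))
```

A prediction is usually counted as correct when its tIoU with an annotation exceeds a chosen threshold. The thresholds used for Team-OR are not stated in this summary.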

Paper Structure

This paper contains 14 sections, 5 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of the Team-OR dataset, consisting of three synchronized ceiling camera views and one laparoscopic view. We blurred the half-bodies of the team for privacy concerns.
  • Figure 2: Duration distributions of the videos and of the "Time-out" and "StOP?" activities.
  • Figure 3: Examples of the Time-out and StOP?-protocol activities in the dataset. We blurred the half-bodies of the team for privacy concerns.
  • Figure 4: Framework of our approach. We extract temporal scene context and skeleton features with pretrained VideoMAEv2 [wang2023videomae] and STGCN++ [duan2022pyskl] models, and then use a lightweight neural network model to detect the group activities in the OR.
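
The framework in Figure 4 combines two per-clip feature streams (scene context and skeleton) before a lightweight detector, which the TL;DR describes as having dual pooling branches. The snippet below is only a rough sketch of that idea, not the authors' network. It assumes the two streams are already extracted and time-aligned, that fusion is channel-wise concatenation, and that the two branches are a sliding max pool and a sliding average pool. All function names and the toy feature values are hypothetical.

```python
def fuse(scene, skeleton):
    """Concatenate per-clip scene and skeleton feature vectors (assumed fusion)."""
    if len(scene) != len(skeleton):
        raise ValueError("both streams must cover the same number of clips")
    return [s + k for s, k in zip(scene, skeleton)]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pool(features, window, reduce):
    """Reduce each channel over a temporal window centred on every clip."""
    half = window // 2
    pooled = []
    for t in range(len(features)):
        # The window is truncated at the sequence boundaries.
        chunk = features[max(0, t - half): t + half + 1]
        pooled.append([reduce(channel) for channel in zip(*chunk)])
    return pooled


def dual_pool(features, window=3):
    """Concatenate a max-pooled and an average-pooled branch (assumed design)."""
    max_branch = pool(features, window, max)
    avg_branch = pool(features, window, mean)
    return [m + a for m, a in zip(max_branch, avg_branch)]


# Toy input: 3 clips, 2 scene channels and 1 skeleton channel per clip.
scene = [[1, 2], [4, 1], [9, 3]]
skeleton = [[1], [0], [2]]
for row in dual_pool(fuse(scene, skeleton)):
    print(row)
```

In a real pipeline the pooled features would feed classification and boundary-regression heads. The actual architecture is in the repository linked in the abstract.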