Early Operative Difficulty Assessment in Laparoscopic Cholecystectomy via Snapshot-Centric Video Analysis

Saurav Sharma, Maria Vannucci, Leonardo Pestana Legori, Mario Scaglia, Giovanni Guglielmo Laracca, Didier Mutter, Sergio Alfieri, Pietro Mascagni, Nicolas Padoy

TL;DR

This work tackles the problem of early operative difficulty assessment in laparoscopic cholecystectomy by introducing SurgPrOD, a video-based model that analyzes partial intraoperative footage using both global and local snapshots processed through transformers, enhanced by a Snapshot-Centric Attention module. A new dataset, CholeScore, provides video-level LCOD labels across three intraoperative assessment scales, enabling evaluation of early predictions with a novel Earliness Stability metric that emphasizes both promptness and consistency. Empirical results show SurgPrOD, particularly with global+local snapshots and SCA, achieves superior performance over baselines on all scales and metrics, validating the feasibility and value of early LCOD estimation from limited video. The work contributes a practical benchmark, a robust architecture, and a time-aware evaluation framework that can inform intraoperative decision-making and resource planning in real-world surgical settings.

Abstract

Purpose: Laparoscopic cholecystectomy (LC) operative difficulty (LCOD) is highly variable and influences outcomes. Despite extensive study of LC in surgical workflow analysis, few efforts have explored LCOD using intraoperative video data. Early recognition of LCOD could allow prompt review by expert surgeons, enhance operating room (OR) planning, and improve surgical outcomes. Methods: We propose the clinical task of early LCOD assessment using limited video observations. We design SurgPrOD, a deep learning model that assesses LCOD by analyzing features from global and local temporal resolutions (snapshots) of the observed LC video. We also propose a novel snapshot-centric attention (SCA) module, acting across snapshots, to enhance LCOD prediction. To validate our method, we introduce the CholeScore dataset, featuring video-level LCOD labels. Results: We evaluate SurgPrOD on 3 LCOD assessment scales in the CholeScore dataset. On our new metric assessing early and stable correct predictions, SurgPrOD surpasses baselines by at least 0.22 points. SurgPrOD also improves over baselines by at least 9 and 5 percentage points in F1 score and top-1 accuracy, respectively, demonstrating its effectiveness in making correct predictions. Conclusion: We propose a new task for early LCOD assessment and a novel model, SurgPrOD, that analyzes surgical video from global and local perspectives. Our results on the CholeScore dataset establish a new benchmark for studying LCOD using intraoperative video data.
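To make the idea of global and local temporal resolutions concrete, the following is a minimal sketch of one way snapshots could be drawn from the observed frames. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the sampling scheme, so uniform sampling, the equal split into `k` contiguous segments, and the names `uniform_indices`, `make_snapshots`, and `frames_per_snapshot` are all hypothetical.

```python
from typing import List, Tuple


def uniform_indices(start: int, stop: int, num: int) -> List[int]:
    """Pick up to `num` roughly evenly spaced frame indices from [start, stop)."""
    length = stop - start
    if length <= 0 or num <= 0:
        return []
    num = min(num, length)
    # Use the midpoint of each of `num` equal-width bins.
    return [start + (2 * i + 1) * length // (2 * num) for i in range(num)]


def make_snapshots(
    num_observed: int, k: int, frames_per_snapshot: int
) -> Tuple[List[int], List[List[int]]]:
    """Return frame indices for one global snapshot and `k` local snapshots.

    Assumption (hypothetical): the global snapshot spans the whole observed
    window, while each local snapshot covers one of `k` contiguous segments.
    """
    global_snapshot = uniform_indices(0, num_observed, frames_per_snapshot)
    local_snapshots = []
    for j in range(k):
        seg_start = j * num_observed // k
        seg_stop = (j + 1) * num_observed // k
        local_snapshots.append(
            uniform_indices(seg_start, seg_stop, frames_per_snapshot)
        )
    return global_snapshot, local_snapshots


if __name__ == "__main__":
    gs, ls = make_snapshots(num_observed=100, k=2, frames_per_snapshot=4)
    print("global:", gs)
    print("local:", ls)
```

Under these assumptions, the global snapshot gives a coarse view of everything observed so far, while each local snapshot samples a shorter stretch of video more densely.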

Paper Structure

This paper contains 22 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: CholeScore: Sample frames with the associated LCOD findings.
  • Figure 2: Model Overview: (left) SurgPrOD takes $F_{w}$ observed frames as input and generates a global snapshot $\mathbf{gs}$ and $k$ local snapshots $\mathbf{ls}_{k}$. MoCoV2 [ramesh2023dissecting] features are extracted for each snapshot and processed through a transformer $\phi$. Snapshot-Centric Attention (SCA) enhances the $k$ local snapshot features to $\mathcal{F}^{''}_{\mathbf{ls}_{k}}$; these, together with the global snapshot features $\mathcal{F}^{'}_{\mathbf{gs}}$, are fed to an MLP layer to produce class logits, which are averaged to compute the LCOD class probabilities. (right) The Early Stability (ES) metric (Equation \ref{es_metric}) addresses the limitations of traditional metrics by rewarding early (observation window $w$) and stable correct predictions (green circles) within a window step size ($n=3$, gray boxes). Circles represent observation windows.
  • Figure 3: Class Distribution: Parkland grading scale (P), Nassar (N), and Sugrue (S).
  • Figure 4: Ablation studies on SurgPrOD.
  • Figure 5: Visualization of SCA attention map for Parkland grading scale (PGS), Nassar (N), and Sugrue (S). Models tend to focus on both tools and anatomical structures.
  • ...and 1 more figure
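The Figure 2 caption describes the ES metric as rewarding correct predictions that arrive early and remain stable over a window step size ($n=3$). The exact formula is not reproduced on this page, so the sketch below is only a toy score built on that description, not the paper's definition. The function names, the boolean per-window input, and the linear earliness reward are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def first_stable_window(correct: Sequence[bool], n: int = 3) -> Optional[int]:
    """Index of the first observation window that starts a run of `n`
    consecutive correct predictions, or None if no such run exists."""
    run = 0
    for i, ok in enumerate(correct):
        run = run + 1 if ok else 0
        if run == n:
            return i - n + 1
    return None


def toy_earliness_stability(correct: Sequence[bool], n: int = 3) -> float:
    """Toy score in [0, 1]: higher when a stable run of correct
    predictions begins earlier. NOT the paper's ES formula."""
    start = first_stable_window(correct, n)
    if start is None:
        return 0.0
    return 1.0 - start / len(correct)


if __name__ == "__main__":
    # One entry per observation window, ordered from least to most video seen.
    per_window_correct = [False, True, True, True, False, True]
    print(toy_earliness_stability(per_window_correct, n=3))
```

A plain accuracy over windows would score a model that flickers between right and wrong the same as one that settles early on the right answer; a score of this shape separates the two, which is the limitation of traditional metrics that the caption alludes to.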