CataractCompDetect: Intraoperative Complication Detection in Cataract Surgery

Bhuvan Sachdeva, Sneha Kumari, Rudransh Agarwal, Shalaka Kumaraswamy, Niharika Singri Prasad, Simon Mueller, Raphael Lechtenboehmer, Maximilian W. M. Wintergerst, Thomas Schultz, Kaushik Murali, Mohit Jain

TL;DR

This work addresses intraoperative complication detection in cataract surgery with CataractCompDetect, a pipeline that combines phase-aware localization, SAM 2-based tracking, complication-specific risk scoring, and vision-language reasoning. A new real-world dataset, CataComp, comprises 53 MSICS (manual small-incision cataract surgery) videos annotated for iris prolapse, posterior capsule rupture (PCR), and vitreous loss, 23 of which contain clinical complications. On CataComp, the method achieves an average F1 of 70.63% across complications, with per-type F1 scores of 81.80% (iris prolapse), 60.87% (PCR), and 69.23% (vitreous loss), which the authors take as evidence for combining structured surgical priors with open-ended visual reasoning. The work points to potential uses in early-warning systems and training feedback; the authors state that the dataset and code will be released publicly upon acceptance.

Abstract

Cataract surgery is one of the most commonly performed surgeries worldwide, yet intraoperative complications such as iris prolapse, posterior capsule rupture (PCR), and vitreous loss remain major causes of adverse outcomes. Automated detection of such events could enable early warning systems and objective training feedback. In this work, we propose CataractCompDetect, a complication detection framework that combines phase-aware localization, SAM 2-based tracking, complication-specific risk scoring, and vision-language reasoning for final classification. To validate CataractCompDetect, we curate CataComp, the first cataract surgery video dataset annotated for intraoperative complications, comprising 53 surgeries, including 23 with clinical complications. On CataComp, CataractCompDetect achieves an average F1 score of 70.63%, with per-complication performance of 81.80% (Iris Prolapse), 60.87% (PCR), and 69.23% (Vitreous Loss). These results highlight the value of combining structured surgical priors with vision-language reasoning for recognizing rare but high-impact intraoperative events. Our dataset and code will be publicly released upon acceptance.
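The reported average is consistent with an unweighted (macro) mean of the three per-complication F1 scores. The short Python check below reproduces that arithmetic. The dictionary keys and function names are illustrative rather than taken from the paper, and the `f1_score` helper shows only the standard F1 definition, since the underlying true-positive and false-positive counts are not given here.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall, from raw counts."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


def macro_average(scores: dict[str, float]) -> float:
    """Unweighted mean over classes, so each complication counts equally."""
    return mean(scores.values())


# Per-complication F1 scores (in percent) as reported in the abstract.
reported_f1 = {
    "iris_prolapse": 81.80,
    "pcr": 60.87,
    "vitreous_loss": 69.23,
}

average_f1 = macro_average(reported_f1)
print(f"Macro-averaged F1: {average_f1:.2f}%")
```

A macro average weights each complication equally regardless of how many videos contain it, which matters on a small dataset where the three complication types may not be equally frequent.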

Paper Structure

This paper contains 20 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overview of CataractCompDetect for complication detection in cataract surgery. It utilizes SAM 2 for tracking and segmentation, a Risk Scoring block to identify high-risk frames, and a Vision-Language Model (VLM) for final classification of complications like Iris Prolapse.
  • Figure 2: Anatomy of the eye.
  • Figure 3: Examples of intraoperative anomalies in cataract surgery (highlighted by bounding boxes): (A) Iris Prolapse: identified by brownish iris tissue protruding through the surgical incision, (B) Posterior Capsule Rupture (PCR): visible as a tear within the posterior capsule, and (C) Vitreous Loss: characterized by a forward bulge of the pupil and/or translucent web-like vitreous strands extending through the pupillary plane.
  • Figure 4: Overview of the CataractCompDetect pipeline. The framework integrates surgical phase-aware localization, SAM 2-based tracking of pupil and iris, complication-specific risk assessment, and vision-language reasoning to detect and classify intraoperative complications.
  • Figure 5: Representative frames that passed the Risk Scoring stage and were assigned a positive label by the VLM. True positives: Iris Prolapse (A), PCR (B), and Vitreous Loss (C). False positives: Iris Prolapse (D), PCR (E), and Vitreous Loss (F).
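The staged flow described in the captions for Figures 4 and 5 (restrict to relevant surgical phases, score risk on tracked frames, then pass only high-risk frames to the VLM) can be sketched as a simple filter chain. This is a control-flow illustration only, not the authors' implementation: the `Frame` type, the phase names, the `risk_threshold` value, and the `score_risk` and `classify` callables are all hypothetical stand-ins for the phase model, the SAM 2 tracker with its complication-specific scoring, and the vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Frame:
    index: int
    phase: str  # surgical phase label; the values used below are made up


def detect_complication(
    frames: Sequence[Frame],
    relevant_phases: set[str],
    score_risk: Callable[[Frame], float],
    classify: Callable[[Frame], bool],
    risk_threshold: float = 0.5,  # hypothetical cutoff, not from the paper
) -> list[int]:
    """Return indices of frames flagged positive after all stages."""
    # Stage 1: phase-aware localization keeps only frames from phases
    # in which this complication can plausibly occur.
    candidates = [f for f in frames if f.phase in relevant_phases]

    # Stages 2-3: score_risk stands in for SAM 2-based tracking plus
    # complication-specific risk scoring; low-risk frames are dropped.
    high_risk = [f for f in candidates if score_risk(f) >= risk_threshold]

    # Stage 4: classify stands in for the VLM's final yes/no decision.
    return [f.index for f in high_risk if classify(f)]


# Toy usage with fabricated inputs, purely to exercise the control flow.
toy_frames = [
    Frame(0, "incision"),
    Frame(1, "nucleus_delivery"),
    Frame(2, "nucleus_delivery"),
    Frame(3, "closure"),
]
toy_risk = {0: 0.9, 1: 0.2, 2: 0.8, 3: 0.7}

flagged = detect_complication(
    toy_frames,
    relevant_phases={"nucleus_delivery"},
    score_risk=lambda f: toy_risk[f.index],
    classify=lambda f: True,
)
print(flagged)
```

Ordering the stages this way means the most expensive step, the VLM call, runs only on the small set of frames that survive the cheaper phase and risk filters. How frame-level outputs are aggregated into a video-level decision is not described in this summary and is left out of the sketch.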