Surgment: Segmentation-enabled Semantic Search and Creation of Visual Question and Feedback to Support Video-Based Surgery Learning
Jingying Wang, Haoran Tang, Taylor Kantor, Tandis Soltani, Vitaliy Popov, Xu Wang
TL;DR
Surgment addresses the need for active, visually focused surgical learning by integrating a SegGPT+SAM segmentation pipeline with two key interfaces: search-by-mask for frame retrieval and a quiz-maker for image-based questions and feedback. The pipeline achieves a Dice score of 0.92 on laparoscopic cholecystectomy (lap chole) datasets, and 11 expert surgeons judged the resulting content to be of high educational value, an improvement over traditional, text-centric question generation. The study demonstrates the importance of human expert input in AI-assisted content creation, identifies UI and generalizability challenges, and points to future directions including voice interfaces, finer segmentation, and AR-enabled in-OR teaching. Collectively, Surgment offers a practical pathway to enhance preoperative preparation and surgical training through interactive, visual learning materials grounded in authentic operative scenes.
Abstract
Videos are prominent learning materials for preparing surgical trainees before they enter the operating room (OR). In this work, we explore techniques to enrich the video-based surgery learning experience. We propose Surgment, a system that helps expert surgeons create exercises with feedback based on surgery recordings. Surgment is powered by a few-shot-learning-based pipeline (SegGPT+SAM) that segments surgery scenes with an accuracy of 92%. The segmentation pipeline enables the visual questions and feedback that surgeons asked for in a formative study. Surgment enables surgeons to 1) retrieve frames of interest through sketches, and 2) design exercises that target specific anatomical components and offer visual feedback. In an evaluation study with 11 surgeons, participants applauded the search-by-sketch approach for identifying frames of interest and found the resulting image-based questions and feedback to be of high educational value.
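
To make the overlap idea behind the segmentation metric and the search-by-sketch interface concrete, the following is a minimal sketch, not the authors' implementation. It computes the Dice coefficient between two binary masks and ranks frames by how well their masks overlap a sketched mask. Using Dice as the retrieval score is an assumption made here for illustration, and the names `dice`, `rank_frames`, and the toy 3x3 masks are hypothetical.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # binary mask: 1 = pixel belongs to the structure


def dice(a: Mask, b: Mask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two same-sized binary masks."""
    inter = sum(x & y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    total = sum(map(sum, a)) + sum(map(sum, b))
    # Two empty masks are treated as a perfect match (a common convention).
    return 1.0 if total == 0 else 2.0 * inter / total


def rank_frames(sketch: Mask, frames: Dict[str, Mask]) -> List[Tuple[str, float]]:
    """Rank frames by overlap between the sketch and each frame's mask.

    Assumption: each frame already has one precomputed segmentation mask.
    """
    scored = [(name, dice(sketch, mask)) for name, mask in frames.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: a sketched region and three candidate frame masks.
sketch = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]
frames = {
    "frame_a": [[0, 1, 1], [0, 1, 1], [0, 0, 0]],  # identical to the sketch
    "frame_b": [[1, 1, 0], [1, 1, 0], [0, 0, 0]],  # shifted one column left
    "frame_c": [[0, 0, 0], [0, 0, 0], [1, 1, 1]],  # no overlap
}

ranking = rank_frames(sketch, frames)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy case the identical mask scores 1.0, the shifted mask 0.5, and the disjoint mask 0.0, so a frame whose segmented structure best matches the sketch in shape and position is returned first.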
