LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training

Qing Xu; Kun Yuan; Yuxiang Luo; Yuhao Zhai; Wenting Duan; Nassir Navab; Zhen Chen

TL;DR

LapFM introduces a Hierarchical Concept Evolving Pre-training framework for laparoscopic segmentation. It unifies anatomy, tissue, and instruments under a Laparoscopic Concept Hierarchy (LCH) and uses a confidence-driven pseudo-labeling loop to exploit unlabeled data. The authors build LapBench-114K by progressively annotating unlabeled images with high-confidence pseudo-labels, and train a transformer-based encoder with a hierarchical mask decoder and hierarchical losses to enable granularity-adaptive segmentation. Extensive experiments across 20 categories show state-of-the-art performance and strong generalization to unseen data, including GynSurg, with expert validation supporting label quality. This work advances universal surgical scene understanding by enabling flexible-granularity, prompt-free segmentation in diverse clinical scenarios.

Abstract

Surgical segmentation is pivotal for scene understanding, yet it remains hindered by annotation scarcity and semantic inconsistency across diverse procedures. Existing approaches typically fine-tune natural-image foundation models (e.g., SAM) with limited supervision, so they function merely as domain adapters rather than surgical foundation models. Consequently, they struggle to generalize across the vast variability of surgical targets. To bridge this gap, we present LapFM, a foundation model designed to evolve robust segmentation capabilities from massive unlabeled surgical images. Unlike medical foundation models that rely on inefficient self-supervised proxy tasks, LapFM leverages a Hierarchical Concept Evolving Pre-training paradigm. First, we establish a Laparoscopic Concept Hierarchy (LCH) via a hierarchical mask decoder with parent-child query embeddings, unifying diverse entities (i.e., Anatomy, Tissue, and Instrument) into a scalable knowledge structure with cross-granularity semantic consistency. Second, we propose a Confidence-driven Evolving Labeling strategy that iteratively generates pseudo-labels and filters them based on hierarchical consistency, progressively incorporating reliable samples from unlabeled images into training. This process yields LapBench-114K, a large-scale benchmark comprising 114K image-mask pairs. Extensive experiments demonstrate that LapFM significantly outperforms state-of-the-art methods, establishing new standards for granularity-adaptive generalization in universal laparoscopic segmentation. The source code is available at https://github.com/xq141839/LapFM.
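To make the filtering idea concrete, the following is a minimal sketch of one selection round, not the authors' implementation. It assumes a toy hierarchy, per-pixel probability dictionaries, and two thresholds (`min_conf`, `min_consistency`). All of these names, child classes, and values are hypothetical and chosen only for illustration. A pseudo-labeled image is accepted when its fine-grained predictions are confident on average and agree with the coarse (parent) predictions.

```python
# Hypothetical hierarchy: class names and thresholds are illustrative
# assumptions, not the categories or settings used by LapFM.
PARENT_OF = {
    "liver": "Anatomy",
    "gallbladder": "Anatomy",
    "fat": "Tissue",
    "grasper": "Instrument",
    "hook": "Instrument",
}


def argmax_label(probs):
    """Return (label, probability) of the most likely entry in a dict."""
    label = max(probs, key=probs.get)
    return label, probs[label]


def score_sample(pixels):
    """Score one pseudo-labeled image.

    `pixels` is a list of (parent_probs, child_probs) dict pairs, one per pixel.
    Returns (mean child confidence, fraction of hierarchy-consistent pixels).
    """
    confidences, consistent = [], 0
    for parent_probs, child_probs in pixels:
        parent, _ = argmax_label(parent_probs)
        child, conf = argmax_label(child_probs)
        confidences.append(conf)
        # A pixel is consistent when the predicted child belongs to the
        # predicted parent concept.
        if PARENT_OF[child] == parent:
            consistent += 1
    n = len(pixels)
    return sum(confidences) / n, consistent / n


def select_reliable(unlabeled, min_conf=0.8, min_consistency=0.9):
    """Split samples into accepted pseudo-labels and ones deferred to a later round."""
    accepted, deferred = [], []
    for name, pixels in unlabeled.items():
        conf, consistency = score_sample(pixels)
        if conf >= min_conf and consistency >= min_consistency:
            accepted.append(name)
        else:
            deferred.append(name)
    return accepted, deferred


# Toy data: four identical pixels per frame.
anatomy_parent = {"Anatomy": 0.9, "Tissue": 0.05, "Instrument": 0.05}
good = [(anatomy_parent,
         {"liver": 0.85, "gallbladder": 0.05, "fat": 0.04,
          "grasper": 0.03, "hook": 0.03})] * 4
bad = [(anatomy_parent,
        {"liver": 0.10, "gallbladder": 0.05, "fat": 0.05,
         "grasper": 0.75, "hook": 0.05})] * 4

accepted, deferred = select_reliable({"frame_a": good, "frame_b": bad})
print(accepted, deferred)
```

In this toy run, `frame_a` is accepted because its child predictions are confident and fall under the predicted parent, while `frame_b` is deferred because its child prediction (an instrument) contradicts the parent prediction (anatomy). In an evolving loop of the kind the abstract describes, the accepted samples would be added to the training set and the deferred ones rescored after the model is updated. How LapFM actually scores confidence and consistency is not specified here.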
