SinkSAM: A Monocular Depth-Guided SAM Framework for Automatic Sinkhole Segmentation

Osher Rafaeli, Tal Svoray, Ariel Nahlieli

TL;DR

SinkSAM is a framework for sinkhole segmentation that combines traditional topographic computations of closed depressions with the prompt-based Segment Anything Model (SAM). It is presented as the first SAM implementation for sinkhole segmentation and is shown to extract sinkhole maps robustly from a single RGB image.

Abstract

Soil sinkholes significantly influence soil degradation, but their irregular shapes, along with interference from shadow and vegetation, make it challenging to accurately quantify their properties using remotely sensed data. We present a novel framework for sinkhole segmentation that combines traditional topographic computations of closed depressions with the newly developed prompt-based Segment Anything Model (SAM). Within this framework, termed SinkSAM, we highlight four key improvements: (1) The integration of topographic computations with SAM enables pixel-level refinement of sinkhole boundary segmentation; (2) A coherent mathematical prompting strategy, based on closed depressions, addresses the limitations of purely learning-based models (CNNs) in detecting and segmenting undefined sinkhole features, while improving generalization to new, unseen regions; (3) Using Depth Anything V2 monocular depth for automatic prompts eliminates photogrammetric biases, enabling sinkhole mapping without dependence on LiDAR data; and (4) An established sinkhole database facilitates fine-tuning of SAM, improving its zero-shot performance in sinkhole segmentation. These advancements allow the deployment of SinkSAM in an unseen test area in a highly variable semiarid region, achieving an intersection-over-union (IoU) of 40.27% and surpassing previous results. This paper also presents the first SAM implementation for sinkhole segmentation and demonstrates the robustness of SinkSAM in extracting sinkhole maps using a single RGB image.
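
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of intersection-over-union for two binary masks. The function name `iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the paper's own evaluation code is not shown in this summary.

```python
def iou(pred, truth):
    """Intersection-over-union of two same-shaped binary masks (nested lists)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += bool(p) and bool(t)  # pixel is in both masks
            union += bool(p) or bool(t)          # pixel is in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1], [0, 0]]   # predicted sinkhole pixels (toy example)
truth = [[1, 0], [0, 0]]  # annotated sinkhole pixels (toy example)
print(iou(pred, truth))   # 0.5
```

An IoU of 40.27% therefore means that the overlap between predicted and annotated sinkhole pixels is about two fifths of their combined area.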

Paper Structure

This paper contains 24 sections, 2 equations, 14 figures, 1 table.

Figures (14)

  • Figure 1: Venn diagram illustrating the SinkSAM approach of merging Computational Topographic Models ("fill sinks") and purely learning-based computer vision models, using prompt technology.
  • Figure 2: The study area is located in the northwestern part of the Negev region, Israel. As shown below, the drone orthomosaic covers three sites prone to soil piping: Tel Gama and Asaf (Training), and Yaen (Test).
  • Figure 3: Input data: patches of RGB imagery, ground-truth annotated sinkholes, the photogrammetric DEM, and monocular depth estimated by Depth Anything V2 (DAV2). In this patch, DAV2 correctly delineates sinkhole boundaries, whereas the photogrammetric DEM misses small sinkholes.
  • Figure 4: SinkSAM framework. Stage 1: depth estimation from an RGB image using DAV2. Stage 2: a "fill sinks" technique followed by subtraction of the estimated depth from the sink-free raster, which delineates closed depressions. Stage 3: prompt generation, in which a threshold value is used to remove small sinks and bounding boxes are created for SAM. Stage 4: the tuned SAM model uses an image encoder and mask decoder to create the final sinkhole map.
  • Figure 5: Experimental setup: each comparison was designed to test the performance of the SinkSAM framework. (A) Closed depressions identified by the depth-based closed-depression method vs. sinkholes predicted by SAM using these depressions as prompts; this comparison tests the entire SinkSAM framework against the DEM-based method. (B) Prompt source: SAM prompted by closed-depression bounding boxes vs. SAM prompted by YOLOv9 bounding boxes. (C) DAV2 vs. photogrammetric DEMs as elevation/depth input layers for the "fill sinks" algorithm. (D) Performance of SAM compared with that of the tuned SinkSAM, using bounding-box prompts derived from DAV2 closed depressions (CDs).
  • ...and 9 more figures
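
Stages 2 and 3 of the framework described in the Figure 4 caption (fill sinks, subtract, threshold, bounding boxes) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses a priority-flood fill as the "fill sinks" step, treats the input as an elevation-like raster in which sinkholes have lower values than their surroundings, and the names `fill_sinks`, `depression_boxes`, `min_depth`, and `min_pixels` are hypothetical. The summary does not say which fill algorithm, threshold, or box format the paper uses.

```python
import heapq
from collections import deque

# 8-connected neighbourhood, used for both flooding and region grouping.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def fill_sinks(surface):
    """Priority-flood fill: raise every closed depression to its spill level."""
    rows, cols = len(surface), len(surface[0])
    filled = [row[:] for row in surface]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed with border cells, which can always drain off the raster.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True
    # Grow inward from the lowest known cell; a lower neighbour is a sink cell.
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                visited[nr][nc] = True
                filled[nr][nc] = max(filled[nr][nc], level)
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


def depression_boxes(surface, min_depth=0.5, min_pixels=2):
    """Return (x_min, y_min, x_max, y_max) boxes around closed depressions.

    min_depth and min_pixels are hypothetical thresholds for dropping
    shallow or very small sinks.
    """
    rows, cols = len(surface), len(surface[0])
    filled = fill_sinks(surface)
    # Sink depth = sink-free raster minus the original surface.
    sink = [[filled[r][c] - surface[r][c] >= min_depth for c in range(cols)]
            for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not sink[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected depression.
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and sink[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(cells) >= min_pixels:
                ys = [y for y, _ in cells]
                xs = [x for _, x in cells]
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Toy 5x5 surface: flat at 10.0 with a two-pixel pit in the middle row.
surface = [[10.0] * 5 for _ in range(5)]
surface[2][2], surface[2][3] = 7.0, 8.0
print(depression_boxes(surface))  # [(2, 2, 3, 2)]
```

In the full framework, boxes of this kind would be passed to SAM as prompts (Stage 4). If the depth map assigns larger values to nearer pixels, it would need to be inverted or rescaled before being used as the `surface` argument here; that preprocessing is outside this sketch.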