OPTED: Open Preprocessed Trachoma Eye Dataset Using Zero-Shot SAM 3 Segmentation

Kibrom Gebremedhin, Hadush Hailu, Bruk Gebregziabher

Abstract

Trachoma remains the leading infectious cause of blindness worldwide, with Sub-Saharan Africa bearing over 85% of the global burden and Ethiopia alone accounting for more than half of all cases. Yet publicly available preprocessed datasets for automated trachoma classification are scarce, and none originate from the most affected region. Raw clinical photographs of eyelids contain significant background noise that hinders direct use in machine learning pipelines. We present OPTED, an open-source preprocessed trachoma eye dataset constructed using the Segment Anything Model 3 (SAM 3) for automated region-of-interest extraction. We describe a reproducible four-step pipeline: (1) text-prompt-based zero-shot segmentation of the tarsal conjunctiva using SAM 3, (2) background removal and bounding-box cropping with alignment, (3) quality filtering based on confidence scores, and (4) Lanczos resizing to 224x224 pixels. A separate prompt-selection stage identifies the optimal text prompt, and manual quality assurance verifies outputs. Through comparison of five candidate prompts on all 2,832 known-label images, we identify "inner surface of eyelid with red tissue" as optimal, achieving a mean confidence of 0.872 (std 0.070) and 99.5% detection rate (the remaining 13 images are recovered via fallback prompts). The pipeline produces outputs in two formats: cropped and aligned images preserving the original aspect ratio, and standardized 224x224 images ready for pre-trained architectures. The OPTED dataset, preprocessing code, and all experimental artifacts are released as open source to facilitate reproducible trachoma classification research.
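
The prompt-selection stage described above can be illustrated with a small sketch. It assumes each candidate prompt yields one confidence score per image, with `None` marking an image where nothing was detected. The function names, the toy scores, the use of the population standard deviation, and the ranking rule (highest mean confidence, ties broken by detection rate) are all assumptions made for illustration, not the authors' released code or selection criterion.

```python
from statistics import mean, pstdev


def summarize_prompt(scores):
    """Summarize one prompt's per-image confidences (None = no detection)."""
    detected = [s for s in scores if s is not None]
    return {
        "detection_rate": len(detected) / len(scores),
        "mean_conf": mean(detected) if detected else 0.0,
        # Population std is an assumption; the source does not say which one it reports.
        "std_conf": pstdev(detected) if detected else 0.0,
    }


def select_prompt(results):
    """Pick the prompt with the highest mean confidence (hypothetical rule)."""
    summaries = {prompt: summarize_prompt(s) for prompt, s in results.items()}
    best = max(
        summaries,
        key=lambda p: (summaries[p]["mean_conf"], summaries[p]["detection_rate"]),
    )
    return best, summaries


# Toy scores, invented purely to exercise the functions.
toy_results = {
    "prompt A": [0.90, 0.85, None, 0.80],
    "prompt B": [0.70, 0.75, 0.72, 0.71],
}
best_prompt, summaries = select_prompt(toy_results)
print(best_prompt, summaries[best_prompt])
```

The same detection-rate formula reproduces the reported figure: with 13 of 2,832 images missed by the selected prompt, 2,819 / 2,832 is approximately 99.5%.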

Paper Structure (20 sections, 9 figures, 3 tables)

Figures (9)

  • Figure 1: The OPTED preprocessing pipeline. Raw eyelid photographs are processed through four stages: (1) SAM 3 text-prompt segmentation, (2) bounding-box cropping, (3) alignment, and (4) Lanczos resizing to $224 \times 224$ px. The pipeline converts 2,832 source images into classification-ready samples across three WHO grades (Normal, TF, TI).
  • Figure 2: Overview of the OPTED preprocessing pipeline. Raw eyelid photographs are processed through SAM 3 text-prompt segmentation, background removal, bounding-box cropping with 5% padding, horizontal alignment, and Lanczos resizing to $224 \times 224$ pixels.
  • Figure 3: Visual comparison of SAM 3 masks from the five candidate prompts on three sample images (two Normal, one Trachoma). Blue overlay indicates the predicted mask; yellow contours delineate boundaries. The selected prompt P5 (green border) provides the most complete coverage of the tarsal conjunctiva.
  • Figure 4: Radar chart comparing the five candidate prompts across five metrics: detection rate, mean confidence, consistency (1 − std), mask area ratio, and fewest misses. The winning prompt P5 ("inner surface of eyelid with red tissue") dominates on confidence, consistency, and mask coverage.
  • Figure 5: Step-by-step visualization of the OPTED pipeline on three sample images (two Normal, one Trachoma). From left to right: raw photograph, SAM 3 mask overlay, background removed, bounding-box crop with horizontal alignment, and final $224 \times 224$ output.
  • ...and 4 more figures
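
The background-removal and padded-crop steps named in the captions above can be sketched in plain Python. The sketch assumes the segmentation mask is already available as a grid of booleans; it does not call SAM 3. The helper names, the black fill value, and the choice to measure the 5% padding relative to the bounding-box size (rather than the image size) are assumptions. The horizontal alignment step and the final Lanczos resize are omitted: the source does not detail the alignment procedure here, and resizing would normally be delegated to an imaging library (for example, Pillow's `Image.resize` with the `Image.Resampling.LANCZOS` filter).

```python
def padded_bbox(mask, pad_frac=0.05):
    """Bounding box (top, left, bottom, right), end-exclusive, around the True
    pixels of `mask`, grown by `pad_frac` of the box size and clamped to the image."""
    height, width = len(mask), len(mask[0])
    rows = [r for r in range(height) if any(mask[r])]
    cols = [c for c in range(width) if any(mask[r][c] for r in range(height))]
    if not rows:
        return None  # empty mask: nothing was segmented
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    pad_y = round((bottom - top) * pad_frac)
    pad_x = round((right - left) * pad_frac)
    return (
        max(0, top - pad_y),
        max(0, left - pad_x),
        min(height, bottom + pad_y),
        min(width, right + pad_x),
    )


def remove_background_and_crop(image, mask, fill=0, pad_frac=0.05):
    """Replace pixels outside the mask with `fill`, then crop to the padded box."""
    box = padded_bbox(mask, pad_frac)
    if box is None:
        return None
    top, left, bottom, right = box
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right)]
        for r in range(top, bottom)
    ]


# Toy 6x6 grayscale "image" and a rectangular mask, invented for illustration.
image = [[r * 10 + c for c in range(6)] for r in range(6)]
mask = [[2 <= r <= 3 and 1 <= c <= 4 for c in range(6)] for r in range(6)]
crop = remove_background_and_crop(image, mask, pad_frac=0.5)
for row in crop:
    print(row)
```

A large `pad_frac` is used in the toy call only so that the padding is visible on a 6x6 grid; on real photographs the 5% default from the caption would apply.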