TFM Dataset: A Novel Multi-task Dataset and Integrated Pipeline for Automated Tear Film Break-Up Segmentation

Guangrong Wan, Jun Liu, Qiyang Zhou, Tang Tang, Lianghao Shi, Wenjun Luo, TingTing Xu

TL;DR

Diagnosis of dry eye syndrome (DES) relies on objective tear film break-up (TFBU) analysis, but public multi-task datasets and integrated pipelines have been lacking. The authors introduce the Tear Film Multi-task (TFM) Dataset, with 6,247 frames from 15 videos annotated for classification, detection, and segmentation. They propose TF-Net as an efficient TFBU segmentation model and TF-Collab as a real-time pipeline that combines all three tasks. They report state-of-the-art segmentation performance, robust end-to-end processing, and real-time inference on mobile hardware, with ablations supporting design choices such as ROI cropping and class-weighted losses. The work provides a practical foundation for automated, objective ocular surface diagnostics and opens avenues for unified multi-task learning and broader clinical validation.

Abstract

Tear film break-up (TFBU) analysis is critical for diagnosing dry eye syndrome, but automated TFBU segmentation remains challenging due to the lack of annotated datasets and integrated solutions. This paper introduces the Tear Film Multi-task (TFM) Dataset, the first comprehensive dataset for multi-task tear film analysis, comprising 15 high-resolution videos (totaling 6,247 frames) annotated for three vision tasks: frame-level classification ('clear', 'closed', 'broken', 'blur'), Placido ring detection, and pixel-wise TFBU area segmentation. Leveraging this dataset, we first propose TF-Net, a novel and efficient baseline segmentation model. TF-Net incorporates a MobileOne-mini backbone with re-parameterization techniques and an enhanced feature pyramid network to achieve a favorable balance between accuracy and computational efficiency for real-time clinical applications. We further establish benchmark performance on the TFM segmentation subset by comparing TF-Net against several state-of-the-art medical image segmentation models. Furthermore, we design TF-Collab, a novel integrated real-time pipeline that leverages models trained on all three tasks of the TFM dataset. By sequentially orchestrating frame classification for break-up time (BUT) determination, pupil region localization for input standardization, and TFBU segmentation, TF-Collab fully automates the analysis. Experimental results demonstrate the effectiveness of the proposed TF-Net and TF-Collab, providing a foundation for future research in ocular surface diagnostics. Our code and the TFM datasets are available at https://github.com/glory-wan/TF-Net
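
The sequential orchestration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `run_pipeline` and the `classify`, `detect`, and `segment` callables are hypothetical stand-ins for the three trained models, frames are plain nested lists, and the rule used for BUT (time from the last 'closed' frame to the first 'broken' frame) is an assumption that the abstract does not spell out.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Frame = List[List[int]]          # hypothetical frame format: rows of pixel values
Box = Tuple[int, int, int, int]  # hypothetical box format: (x0, y0, x1, y1)


def run_pipeline(
    frames: Sequence[Frame],
    classify: Callable[[Frame], str],   # returns 'clear', 'closed', 'broken' or 'blur'
    detect: Callable[[Frame], Box],     # returns the region to crop
    segment: Callable[[Frame], Frame],  # returns a mask for the cropped frame
    fps: float,
) -> Tuple[Optional[float], Dict[int, Frame]]:
    """Classify every frame, then crop and segment only the 'broken' ones."""
    last_blink: Optional[int] = None
    but_seconds: Optional[float] = None
    masks: Dict[int, Frame] = {}
    for i, frame in enumerate(frames):
        label = classify(frame)
        if label == "closed":
            last_blink = i  # remember the most recent blink
            continue
        if label != "broken":
            continue  # 'clear' and 'blur' frames are not segmented
        # Assumed rule: BUT is the time from the last blink to the first break-up.
        if but_seconds is None and last_blink is not None:
            but_seconds = (i - last_blink) / fps
        x0, y0, x1, y1 = detect(frame)
        crop = [row[x0:x1] for row in frame[y0:y1]]
        masks[i] = segment(crop)
    return but_seconds, masks
```

Running the classifier first means the detector and the segmentation model are only invoked on frames labelled 'broken', which is one plausible way such a pipeline could keep per-frame cost low.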

Paper Structure

This paper contains 18 sections, 14 equations, 4 figures, 6 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of the Tear Film Multi-task (TFM) Dataset composition, illustrating the distribution and relationships between the three annotation tasks: classification (TF-Cls), object detection (TF-Det), and segmentation (TF-Seg).
  • Figure 2: Sample visualization of the TF-Crop dataset produced by the cropping strategy. The first row displays the original full-resolution images (left) and their corresponding cropped versions (right), which are generated based on the "Outside" bounding boxes from the TF-Det dataset. The second row presents the pixel-wise TFBU segmentation masks for the respective images above.
  • Figure 3: (a) The overall workflow of the proposed TF-Collab pipeline, which sequentially integrates frame classification, Placido ring detection, and TFBU segmentation. (b) The detailed architecture of the proposed TF-Net model, featuring a MobileOne-mini encoder with re-parameterization, a Pyramid Pooling Module (PPM) for multi-scale context, and a decoder with skip connections for boundary refinement. 'MO' denotes a MobileOne block.
  • Figure 4: Visual comparison of segmentation results on the TF-Crop test set across five scale variants (s0-s4) of MobileOne or MobileOne-mini (only for TF-Net). The proposed TF-Net demonstrates superior segmentation accuracy with more precise boundary delineation and fewer false positives and negatives compared to other baseline models.
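
The cropping strategy referred to in Figure 2 can be sketched as follows. This is only an illustration under stated assumptions: `crop_pair` is a hypothetical helper, the box is assumed to be given as pixel coordinates `(x0, y0, x1, y1)`, and images and masks are plain nested lists rather than whatever array type the released code uses.

```python
from typing import List, Tuple

Grid = List[List[int]]           # hypothetical format: rows of pixel values
Box = Tuple[int, int, int, int]  # hypothetical box format: (x0, y0, x1, y1)


def crop_pair(image: Grid, mask: Grid, box: Box) -> Tuple[Grid, Grid]:
    """Crop an image and its segmentation mask to the same bounding box."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    # Clamp the box so that a detection reaching past the border still works.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("box does not overlap the image")

    def crop(grid: Grid) -> Grid:
        return [row[x0:x1] for row in grid[y0:y1]]

    # Applying the identical window keeps the mask pixel-aligned with the image.
    return crop(image), crop(mask)
```

Cropping the image and its mask with the same clamped window is what keeps the annotations valid after the region of interest is extracted.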