Redefining Cystoscopy with AI: Bladder Cancer Diagnosis Using an Efficient Hybrid CNN-Transformer Model

Meryem Amaouche, Ouassim Karrakchou, Mounir Ghogho, Anouar El Ghazzaly, Mohamed Alami, Ahmed Ameur

TL;DR

Bladder cancer diagnosis via cystoscopy suffers from operator dependence and missed detections. The authors present a lightweight hybrid CNN-Transformer network with a transformer bottleneck and Dual Attention Gates (DAGs) that achieves accurate, real-time segmentation while keeping the model small. They also introduce a hospital-developed cystoscopy dataset and show through ablation that combining DAGs with a single transformer block yields substantial gains, reaching IoU ≈ $85.7\%$ and Dice ≈ $92\%$ with only about $0.36$M parameters. The approach outperforms several CNN-based baselines and remains competitive with larger transformer-based models, making it suitable for real-time clinical deployment and broader accessibility in resource-constrained settings.
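The two metrics quoted above, IoU and Dice, both measure overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of their standard definitions for binary masks, not the authors' evaluation code: the function names, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration.

```python
from typing import Sequence

# Assumed representation: a binary mask as rows of 0/1 values (1 = lesion pixel).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> tuple[int, int, int]:
    """Count true positives, false positives and false negatives over all pixels."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return tp / union if union else 1.0


def dice(pred: Mask, target: Mask) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy 2x3 masks (hypothetical data, not from the paper's dataset).
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]

print(iou(pred_mask, true_mask))   # 2 overlapping pixels out of 4 in the union
print(dice(pred_mask, true_mask))  # 4 / 6
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU), so an IoU of 0.857 corresponds to a Dice of about 0.923. This is roughly in line with the reported figures, although the identity need not hold exactly for scores averaged over a dataset.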

Abstract

Bladder cancer ranks among the 10 most frequently diagnosed cancers worldwide and is among the most expensive to treat because its high recurrence rate requires lifelong follow-up. The primary diagnostic tool is cystoscopy, which relies heavily on the physician's expertise and interpretation. As a result, numerous cases each year go undiagnosed or are misdiagnosed and treated as urinary infections. To address this, we propose a deep learning approach for bladder cancer detection and segmentation that combines CNNs with a lightweight, positional-encoding-free transformer and dual attention gates that fuse self-attention and spatial attention for feature enhancement. The proposed architecture is efficient, making it suitable for medical scenarios that require real-time inference. Experiments show that the model addresses the critical need to balance computational efficiency and diagnostic accuracy in cystoscopic imaging: despite its small size, it rivals large models in performance.

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Architecture of the Lightweight CNN-Transformer Network for bladder cancer segmentation.
  • Figure 2: Dual attention gates.
  • Figure 3: Cystoscopy data sample.
  • Figure 4: IoU and Parameters Against Weight Configurations.
  • Figure 5: Visual comparison with UNet, Dilated UNet, Attention UNet, TransUNet and Segformer-(B0,B1) on cystoscopy images.