Towards Interpretable Attention Networks for Cervical Cancer Analysis
Ruiqi Wang, Mohammad Ali Armin, Simon Denman, Lars Petersson, David Ahmedt-Aristizabal
TL;DR
This work tackles interpretable cervical cell classification in multi-cell images by contrasting traditional CNNs (ResNet, DenseNet) with attention-based architectures (Residual Attention Networks, Residual Channel Attention). It employs integrated gradients to produce attribution explanations, and evaluates on the SIPaKMeD dataset, revealing that DenseNet-121 with residual channel attention delivers the best performance while yielding focused, region-specific explanations. The study demonstrates that multi-cell context enhances classification and that channel-attention mechanisms can isolate informative cell groups, offering interpretable insights for clinical use. It also points to future directions involving graph-based representations to capture inter-cell relations within multi-cell images.
Abstract
Recent advances in deep learning have enabled the development of automated frameworks for analysing medical images and signals, including analysis of cervical cancer. Many previous works focus on the analysis of isolated cervical cells, or do not offer sufficient methods to explain and understand how the proposed models reach their classification decisions on multi-cell images. Here, we evaluate various state-of-the-art deep learning models and attention-based frameworks for the classification of images of multiple cervical cells. As we aim to provide interpretable deep learning models to address this task, we also compare their explainability through the visualization of their gradients. We demonstrate the advantage of using images that contain multiple cells over isolated single-cell images. We show the effectiveness of the residual channel attention model for extracting important features from a group of cells, and demonstrate this model's efficiency for this classification task. This work highlights the benefits of channel attention mechanisms in analysing multi-cell images for potential relations and distributions within a group of cells. It also provides interpretable models to address the classification of cervical cells.
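As a rough illustration of the gradient-based explanations referred to here, the sketch below approximates integrated gradients for a toy logistic model in plain Python. It is not the models or code used in this work: the functions `model`, `model_gradient` and `integrated_gradients`, the weights, the all-zero baseline and the step count are all hypothetical choices made for the example. A real pipeline would obtain the gradients of a trained network through automatic differentiation rather than an analytic formula.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy differentiable classifier (assumption): logistic regression."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def model_gradient(x, weights, bias):
    """Analytic gradient of the toy model output with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    Gradients are averaged along the straight line from the baseline to the
    input, then scaled by the difference (input - baseline) per feature.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Hypothetical values chosen only for illustration.
weights = [2.0, -1.0, 0.0]
bias = 0.1
x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
```

Under these assumptions the attributions sum, up to numerical error, to the difference between the model output at the input and at the baseline, and a feature with zero weight receives zero attribution. For images, the same quantity is computed per pixel and displayed as a heat map over the cells.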
