A novel framework employing deep multi-attention channels network for the autonomous detection of metastasizing cells through fluorescence microscopy

Michail Mamalakis, Sarah C. Macfarlane, Scott V. Notley, Annica K. B Gad, George Panoutsos

TL;DR

This work addresses the challenge of accurately distinguishing normal and metastasizing human cells using fluorescence microscopy images that reveal actin and vimentin organization. It introduces deep multi-attention channel networks (RGB and MHL architectures) paired with Grad-CAM–based local explanations and novel global explainability measures (Gmean-GradCam, Gmean-Shape) to yield high classification performance while preserving biological interpretability. The study demonstrates that attention-enabled models outperform several established DL backbones and reveal biologically meaningful focus on cytoskeletal components, particularly vimentin, with preliminary evidence that transformed cells are more homogeneous. The framework has potential to inform metastasis diagnostics and highlights micrometre-scale vimentin distribution as a prospective diagnostic biomarker, while acknowledging the need for larger cohorts and additional XAI validations.

Abstract

We developed a transparent, large-scale, imaging-based computational framework that can distinguish between normal and metastasizing human cells. The method relies on fluorescence microscopy images showing the spatial organization of actin and vimentin filaments in normal and metastasizing single cells, and combines a multi-attention channel network with global explainability techniques. We test classification between normal cells (Bj primary fibroblasts) and their isogenically matched, transformed and invasive counterparts (BjTertSV40TRasV12). Manual annotation is not trivial to automate due to the intricacy of the biologically relevant features. In this research, we used established deep learning networks and our new multi-attention channel architecture. To increase the interpretability of the network, which is crucial for this application area, we developed a global explainability approach that correlates the weighted geometric mean of all cell images with their local GradCam scores. The results of our analysis allowed an unprecedentedly detailed and biologically relevant understanding of the cytoskeletal changes that accompany the oncogenic transformation of normal cells into invasive and metastasizing cells. We also paved the way for a possible spatial, micrometre-level biomarker (the spatial distribution of vimentin) for the future development of diagnostic tools against metastasis.

Paper Structure (18 sections, 16 equations, 13 figures, 3 tables)

This paper contains 18 sections, 16 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Representative images of normal and metastasizing cells. (A) Normal (top) and metastasizing (bottom) cells. Scale bar: 1 mm. (B) Single cells, showing nuclei, F-actin, vimentin, and merged images of nuclei (blue), F-actin (green) and vimentin (red). Scale bar: 20 $\mu$m. Images show representative cells from at least 2 experimental repeats.
  • Figure 2: The DenRes-131 network, a modified version of the state-of-the-art DenResCov-19 network. The DL family (VGG-16, DenseNet-121, ResNet-50, DenRes-131).
  • Figure 3: The networks and the multi-head layer. The RGB family (RGB, RGB-Res, RGB-Den). The RGB architecture has three main levels: isolator, backbone, and multi-head attention. The isolator is the first level and isolates the three channels of the input image (Red, Green, Blue). Next, the backbone level applies an established classifier (convolutional networks, ResNet-50, or DenseNet-121) to each of the three isolated channels. Lastly, the multi-head attention level uses each output of the backbone network (V, K, Q) to initialise a scaled dot-product attention layer. Six multi-head attention layers are used: three self-attention layers, one per channel (Red, Green, Blue), and three channel-pair layers (Red/Green, Blue/Green, Red/Blue). Their outputs are concatenated and passed to a multi-layer perceptron that classifies the input image.
  • Figure 4: The networks and the multi-head layer. The multi-head layer (MHL) family (MHL, MHL-Res, MHL-Den). The MHL architecture has two main levels: backbone and multi-head attention. All three channels of the input are passed to the backbone level as a single input; there is no isolator level as in the RGB architecture. The backbone level uses an established classifier (convolutional networks, ResNet-50, or DenseNet-121) that takes the RGB image as input. The multi-head attention level uses the output of the backbone level to initialise a scaled dot-product self-attention layer, followed by a concatenation and a multi-layer perceptron that classifies the input image.
  • Figure 5: ROC curves for the established deep learning networks (DL family). The blue line shows the healthy cell class (class 1) and the red line the metastatic cell class (class 2).
  • ...and 8 more figures
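
The Figure 3 caption describes six scaled dot-product attention layers (three per-channel self-attention layers and three channel-pair layers) whose outputs are concatenated before classification. The following is a minimal NumPy sketch of that fusion step only, not the authors' implementation. Several things here are assumptions made for illustration: the function names, the random arrays standing in for backbone features, the use of a single head with no learned projections, and the choice that in a channel-pair layer the queries come from the first channel while the keys and values come from the second. The isolator, the backbone, and the final multi-layer perceptron are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v):
    """Compute softmax(q k^T / sqrt(d)) v for arrays of shape (tokens, d)."""
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d), axis=-1)
    return weights @ v


def fuse_channels(feats):
    """Concatenate three self-attention and three channel-pair outputs.

    feats maps 'R', 'G', 'B' to hypothetical backbone features of shape
    (tokens, d). Assumption: for a pair (a, b), queries come from channel a
    and keys/values from channel b; the caption does not specify this.
    """
    pairs = [("R", "R"), ("G", "G"), ("B", "B"),   # self-attention per channel
             ("R", "G"), ("B", "G"), ("R", "B")]   # channel-pair attention
    outputs = [
        scaled_dot_product_attention(feats[a], feats[b], feats[b])
        for a, b in pairs
    ]
    return np.concatenate(outputs, axis=-1)


# Random stand-ins for per-channel backbone outputs: 4 tokens, 8 features each.
rng = np.random.default_rng(0)
feats = {channel: rng.normal(size=(4, 8)) for channel in "RGB"}
fused = fuse_channels(feats)
print(fused.shape)  # (4, 48): six attention outputs of width 8, concatenated
```

In a full model the fused vector would feed the classifying multi-layer perceptron; in the MHL family described in Figure 4, only a single self-attention layer over the backbone output of the whole RGB image would be needed.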