Joint Explicit and Implicit Cross-Modal Interaction Network for Anterior Chamber Inflammation Diagnosis

Qian Shao, Ye Dai, Haochao Ying, Kan Xu, Jinhong Wang, Wei Chi, Jian Wu

TL;DR

EiCI-Net, a joint Explicit and implicit Cross-Modal Interaction Network, takes anterior segment optical coherence tomography (AS-OCT) images, slit-lamp images, and clinical indicators (tabular data) as input and demonstrates the effectiveness of explicit cross-modal interaction.

Abstract

Uveitis demands precise diagnosis of anterior chamber inflammation (ACI) for optimal treatment. However, current diagnostic methods rely on only a single modality, which offers a limited view of the disease and leads to poor performance. In this paper, we investigate a promising yet challenging way to fuse multimodal data for ACI diagnosis. Notably, existing fusion paradigms focus on strengthening implicit modality interactions (i.e., self-attention and its variants) but neglect explicit modality interactions, especially those derived from clinical knowledge and imaging properties. To this end, we propose a joint Explicit and implicit Cross-Modal Interaction Network (EiCI-Net) for ACI diagnosis that jointly uses anterior segment optical coherence tomography (AS-OCT) images, slit-lamp images, and clinical data. Specifically, we first develop CNN-based encoders and a Tabular Processing Module (TPM) to extract efficient feature representations from the different modalities. Then, we devise an Explicit Cross-Modal Interaction Module (ECIM) that generates attention maps from the tabular feature maps as a form of explicit clinical knowledge and integrates them into the slit-lamp feature maps, allowing the CNN-based encoder to focus on the more informative content of the slit-lamp images. After that, the Implicit Cross-Modal Interaction Module (ICIM), a transformer-based network, further enhances modality interactions implicitly. Finally, we construct a sizable real-world dataset from our collaborating hospital and conduct extensive experiments demonstrating that EiCI-Net outperforms state-of-the-art classification methods across various metrics.
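
The two kinds of interaction described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the feature shapes, the random weights, the helper names (`explicit_interaction`, `implicit_interaction`), the choice of a single spatial attention map, and the single-head attention are all assumptions made for illustration. It only shows the general idea of gating slit-lamp features with a tabular-derived attention map (explicit), then applying self-attention over tokens concatenated from all modalities (implicit).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def explicit_interaction(slit_feat, tab_feat, w_proj):
    """Gate slit-lamp features with an attention map derived from tabular features.

    slit_feat: (C, H, W) feature map from a CNN encoder (assumed shape).
    tab_feat:  (D,) tabular embedding (assumed shape).
    w_proj:    (D, H*W) hypothetical projection to one spatial attention map.
    """
    _, h, w = slit_feat.shape
    attn = sigmoid(tab_feat @ w_proj).reshape(1, h, w)  # values in [0, 1]
    # Element-wise multiplication, broadcast over the channel axis.
    return slit_feat * attn, attn

def implicit_interaction(tokens, w_q, w_k, w_v):
    """Single-head self-attention over tokens from all modalities."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return scores @ v, scores

# Toy inputs standing in for encoder outputs (random, for illustration only).
C, H, W, D = 8, 4, 4, 6
slit_feat = rng.normal(size=(C, H, W))   # slit-lamp feature map
oct_feat = rng.normal(size=(C, H, W))    # AS-OCT feature map
tab_feat = rng.normal(size=(D,))         # tabular (clinical indicator) embedding

# Explicit interaction: tabular features modulate slit-lamp features.
w_proj = rng.normal(size=(D, H * W))
gated, attn = explicit_interaction(slit_feat, tab_feat, w_proj)

# Build one token per spatial location, plus one tabular token, then concatenate.
slit_tokens = gated.reshape(C, -1).T                  # (H*W, C)
oct_tokens = oct_feat.reshape(C, -1).T                # (H*W, C)
w_tab = rng.normal(size=(D, C))                       # hypothetical tabular projection
tab_token = (tab_feat @ w_tab).reshape(1, C)          # (1, C)
tokens = np.concatenate([slit_tokens, oct_tokens, tab_token], axis=0)

# Implicit interaction: self-attention mixes information across modalities.
w_q, w_k, w_v = (rng.normal(size=(C, C)) for _ in range(3))
fused, scores = implicit_interaction(tokens, w_q, w_k, w_v)

print(gated.shape, tokens.shape, fused.shape)
```

In this sketch the explicit step is a fixed, interpretable operation (a gate computed from clinical indicators), whereas the implicit step lets every token attend to every other token without any prior on which modality should guide which.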

Paper Structure (21 sections, 6 equations, 3 figures, 4 tables)


Figures (3)

  • Figure 1: Comparison of slit-lamp images (a$\sim$d) and an AS-OCT image (e): a. ciliary haemorrhage at the limbus; b. iris pigmentation; c. AC cells; d. KPs; e. AC cells. Slit-lamp images contain rich information indicative of ACI, but their complex backgrounds and the small area occupied by AC cells make it hard for automated algorithms to detect AC cells. Conversely, AC cells are easier to detect in AS-OCT images, although these images do not portray the overall external state of the eyeball or iris pigmentation.
  • Figure 2: The overall architecture of our proposed EiCI-Net. Ⓒ denotes concatenation, while $\cdot$ denotes element-wise multiplication.
  • Figure 3: Visualization of EiCI-Net with and without ECIM. a. the pupil is not round; b$\sim$c. AC cells; d. the vitreous body and lens are turbid.