
TNF: Tri-branch Neural Fusion for Multimodal Medical Data Classification

Tong Zheng, Shusaku Sone, Yoshitaka Ushiku, Yuki Oba, Jiaxin Ma

Abstract

This paper presents Tri-branch Neural Fusion (TNF), an approach for classifying multimodal medical data consisting of images and tabular attributes. It also introduces two solutions to the problem of label inconsistency in multimodal classification. Traditional methods for multimodal medical data classification often rely on a single label and typically merge features from two distinct input modalities. This becomes problematic when features are mutually exclusive or labels differ across modalities, which reduces accuracy. To overcome this, TNF implements a tri-branch framework with three separate outputs: one for the image modality, one for the tabular modality, and a third hybrid output that fuses image and tabular data. The final decision is made by an ensemble method that integrates the likelihoods from all three branches. We validate TNF through extensive experiments, which show that it outperforms traditional fusion and ensemble methods with various convolutional neural network and transformer-based architectures across multiple datasets.

Paper Structure (43 sections, 14 equations, 11 figures, 8 tables, 2 algorithms)

Figures (11)

  • Figure 1: Overall structure of TNF. The inputs are a medical image $\boldsymbol{x}_{\rm{i}}$ and its corresponding tabular attributes $\boldsymbol{x}_{\rm{t}}$. $\boldsymbol{x}_{\rm{i}}$ and $\boldsymbol{x}_{\rm{t}}$ are fed into the image classification model and the tabular classification model, respectively, to obtain the likelihoods $\boldsymbol{z}_{\rm{i}}$ and $\boldsymbol{z}_{\rm{t}}$. Image features and tabular features are fed into the fusion model to obtain the likelihood $\boldsymbol{z}_{\rm{f}}$. $\boldsymbol{z}_{\rm{i}}$, $\boldsymbol{z}_{\rm{t}}$, and $\boldsymbol{z}_{\rm{f}}$ are then processed by the threshold function $\mathcal{G}$ to produce the classification result $\hat{y}$.
  • Figure 2: Classification models used in TNF for pulmonary embolism (PE) classification.
  • Figure 3: Models utilized in TNF for cognitive impairment level classification on the NACC dataset.
  • Figure 4: ROC curve of TNF and other methods.
  • Figure 5: Grad-CAM heatmaps (zoom in for better observation).
  • ...and 6 more figures
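
The ensemble step described in the Figure 1 caption can be illustrated with a minimal sketch. The exact form of the threshold function $\mathcal{G}$ is not given in this summary, so the code below substitutes a simple stand-in: it converts each branch's scores to probabilities with a softmax, takes a weighted average, and picks the most likely class. The function names `softmax` and `ensemble_decision`, the equal default weights, and the treatment of $\boldsymbol{z}_{\rm{i}}$, $\boldsymbol{z}_{\rm{t}}$, $\boldsymbol{z}_{\rm{f}}$ as raw scores are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_decision(
    z_i: Sequence[float],
    z_t: Sequence[float],
    z_f: Sequence[float],
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> int:
    """Hypothetical stand-in for the threshold function G.

    Averages the per-branch class probabilities (image, tabular, fusion)
    with the given weights and returns the index of the top class.
    """
    probs = [softmax(z) for z in (z_i, z_t, z_f)]
    n_classes = len(probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, probs))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__)


# Example: the image and fusion branches favor class 0, the tabular
# branch favors class 1; the averaged likelihood selects class 0.
y_hat = ensemble_decision([2.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

Averaging probabilities rather than raw scores keeps one branch with large-magnitude outputs from dominating the decision; other combination rules would fit the same interface.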