FOAA: Flattened Outer Arithmetic Attention For Multimodal Tumor Classification

Omnia Alwazzan, Ioannis Patras, Gregory Slabaugh

TL;DR

FOAA introduces a cross-attention fusion framework that replaces the standard scaled dot-product with four outer arithmetic operators (outer addition $\oplus$, outer subtraction $\ominus$, outer product $\otimes$, and outer division $\oslash$) applied to flattened, $64$-dimensional modality embeddings. Because it operates on flattened vectors, FOAA densely intermixes features across imaging and non-imaging modalities, which improves discriminative fusion for tumor classification. The approach is validated on CMMD breast tumor data and TCGA brain tumor data, where it achieves state-of-the-art results and remains robust on imbalanced datasets; ablations show consistent gains as more FOAA operators are included. FOAA is presented as a simple, reusable block that can be integrated into diverse neural architectures and extended to additional modalities with minimal changes.

Abstract

Fusion of multimodal healthcare data holds great promise to provide a holistic view of a patient's health, taking advantage of the complementarity of different modalities while leveraging their correlation. This paper proposes a simple and effective approach, inspired by attention, to fuse discriminative features from different modalities. We propose a novel attention mechanism, called Flattened Outer Arithmetic Attention (FOAA), which relies on outer arithmetic operators (addition, subtraction, product, and division) to compute attention scores from keys, queries and values derived from flattened embeddings of each modality. We demonstrate how FOAA can be implemented for self-attention and cross-attention, providing a reusable component in neural network architectures. We evaluate FOAA on two datasets for multimodal tumor classification and achieve state-of-the-art results, and we demonstrate that features enriched by FOAA are superior to those derived from other fusion approaches. The code is publicly available at https://github.com/omniaalwazzan/FOAA.

Paper Structure (13 sections, 6 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: The Flattened Outer Arithmetic Attention (FOAA) mechanism is implemented for cross-attention between the image and gene expression modalities occurring in the black box using outer product $\otimes$, outer subtraction $\ominus$, outer division $\oslash$ and outer addition $\oplus$. The resultant attention matrices are applied to the value vector and then integrated with an element-wise sum. This is followed by a fully connected (FC) layer prior to the final classifier.
  • Figure 2: t-Distributed Stochastic Neighbor Embedding (t-SNE) visualizations for the FOAA model. (a) FOAA on the CMMD data, (b) a replication of MOAB [alwazzan2023moab] on CMMD, (c) FOAA on TCGA, and (d) the AUC of FOAA compared with two ablation studies.
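
The cross-attention flow described in the Figure 1 caption can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation (see their repository for that). Several details are assumptions made only for this example: the random projection matrices `w_q`, `w_k`, `w_v`, taking the query from the image embedding and the key and value from the gene embedding, the row-wise softmax with $\sqrt{d}$ scaling, and the `eps` guard in the division. The final FC layer and classifier are omitted.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def foaa_cross_attention(q, k, v, eps=1e-6):
    """Fuse flattened vectors q, k, v (each of shape (d,)) with four outer operators.

    Assumption: each d x d outer matrix is scaled by sqrt(d) and
    normalised with a row-wise softmax before being applied to v.
    """
    d = q.shape[0]
    # Assumption: replace near-zero key entries so outer division stays finite.
    safe_k = np.where(np.abs(k) < eps, eps, k)
    # Entry (i, j) of each matrix combines q[i] with k[j].
    outer_scores = [
        np.add.outer(q, k),          # outer addition
        np.subtract.outer(q, k),     # outer subtraction
        np.multiply.outer(q, k),     # outer product
        np.divide.outer(q, safe_k),  # outer division
    ]
    fused = np.zeros(d)
    for scores in outer_scores:
        attention = softmax(scores / np.sqrt(d), axis=-1)
        fused += attention @ v  # apply to the value vector, then element-wise sum
    return fused


rng = np.random.default_rng(0)
d = 64  # flattened embedding size mentioned in the TL;DR
image_embedding = rng.standard_normal(d)  # stand-in for an image feature vector
gene_embedding = rng.standard_normal(d)   # stand-in for a gene expression feature vector

# Hypothetical linear projections; in a real model these would be learned.
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
q = w_q @ image_embedding
k = w_k @ gene_embedding
v = w_v @ gene_embedding

fused = foaa_cross_attention(q, k, v)
print(fused.shape)  # (64,)
```

Because every attention row sums to one under the assumed softmax, each operator contributes a convex combination of the value entries, and the element-wise sum keeps the fused vector at the same dimension as the inputs.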