Detecting Brain Tumors through Multimodal Neural Networks
Antonio Curci, Andrea Esposito
TL;DR
Problem: Brain tumor detection is challenging and high-stakes. Approach: A two-head multimodal DenseNet fuses 240×240×3 MRI image features with 13 tabular descriptors to classify scans as healthy or ill, using stratified 10-fold CV and binary cross-entropy loss. Findings: The method achieves an average accuracy of about $98.8\%$ and AUC near $0.99$, with robust performance across folds and a single anomalous fold noted. Significance: This work demonstrates the potential of multimodal data fusion in medical imaging and reinforces the importance of explainability for safer, human-centered deployment in clinical settings.
Abstract
Tumors can manifest in various forms and in different areas of the human body. Brain tumors are particularly hard to diagnose and treat because of the complexity of the organ in which they develop. Detecting them in time can lower the chances of death and facilitate the therapy process for patients. The use of Artificial Intelligence (AI) and, more specifically, deep learning has the potential to significantly reduce the time and resources needed to discover and identify tumors in images obtained through medical imaging techniques. This research work assesses the performance of a multimodal model for the classification of Magnetic Resonance Imaging (MRI) scans processed as grayscale images. The results are promising and in line with similar work, as the model reaches an accuracy of around 98\%. We also highlight the need for explainability and transparency to ensure human control and safety.
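
The evaluation protocol named in the TL;DR (stratified 10-fold cross-validation with a binary cross-entropy loss) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the helper names `stratified_kfold` and `binary_cross_entropy`, the round-robin fold assignment, the clipping constant `eps`, and the toy label list are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds preserve class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical toy labels: 0 = healthy, 1 = ill (not the paper's data).
labels = [0] * 60 + [1] * 40
splits = list(stratified_kfold(labels, k=10))
```

In this toy setting every test fold holds 10 scans with the same 60/40 class balance as the full set, which is the property that makes per-fold accuracy and AUC comparable. In practice a library routine such as scikit-learn's `StratifiedKFold` would typically replace the hand-written splitter.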
