A Unified Multimodal Framework for Dataset Construction and Model-Based Diagnosis of Ameloblastoma
Ajo Babu George, Anna Mariam John, Athul Anoop, Balu Bhasuran
TL;DR
The paper tackles the scarcity of high-quality, multimodal ameloblastoma datasets by introducing a unified framework for curating radiological, histopathological, and clinical images alongside structured textual data. It combines NLP-driven data extraction, image preprocessing, and a multimodal deep learning model to classify variants, predict recurrence risk, and inform surgical considerations; at deployment, the model accepts structured clinical inputs. A case-based retrieval system built on an SBERT+FAISS pipeline supports clinical decision-making, and benchmarking demonstrates improvements in retrieval quality and robustness. Overall, the work delivers a practical, adaptable platform for patient-specific decision support in ameloblastoma, while acknowledging limitations such as the sparse representation of rare variants and the need for external validation and workflow integration.
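The case-based retrieval idea can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: a normalised bag-of-words vector stands in for the SBERT sentence embedding, and a brute-force inner-product scan stands in for the FAISS index. The case texts, the query, and the function names `embed` and `search` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> dict[str, float]:
    """Toy stand-in for a sentence encoder: L2-normalised bag-of-words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}

def search(query: str, cases: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Brute-force inner-product search; on unit vectors this is cosine similarity."""
    q = embed(query)
    scored = []
    for case in cases:
        c = embed(case)
        score = sum(weight * c.get(token, 0.0) for token, weight in q.items())
        scored.append((score, case))
    # Highest similarity first, keep the top k cases.
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]

# Hypothetical case summaries (not taken from the dataset).
CASES = [
    "unicystic ameloblastoma posterior mandible young adult swelling",
    "conventional ameloblastoma multilocular radiolucency mandible recurrence",
    "peripheral ameloblastoma gingival lesion maxilla",
]

if __name__ == "__main__":
    for score, case in search("multilocular radiolucency with recurrence", CASES):
        print(f"{score:.3f}  {case}")
```

In a real system, one would presumably replace `embed` with a pretrained sentence encoder and the linear scan with an approximate nearest-neighbour index; the ranking logic stays the same.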
Abstract
Artificial intelligence (AI)-enabled diagnostics in maxillofacial pathology require structured, high-quality multimodal datasets. However, existing resources provide limited ameloblastoma coverage and lack the format consistency needed for direct model training. We present a newly curated multimodal dataset specifically focused on ameloblastoma, integrating annotated radiological, histopathological, and intraoral clinical images with structured data derived from case reports. Natural language processing techniques were employed to extract clinically relevant features from textual reports, while image data underwent domain-specific preprocessing and augmentation. Using this dataset, a multimodal deep learning model was developed to classify ameloblastoma variants, assess behavioral patterns such as recurrence risk, and support surgical planning. The model is designed to accept clinical inputs such as presenting complaint, age, and gender during deployment to enhance personalized inference. Quantitative evaluation demonstrated substantial improvements: variant classification accuracy increased from 46.2 percent to 65.9 percent, and the abnormal tissue detection F1-score improved from 43.0 percent to 90.3 percent. Benchmarked against resources like MultiCaRe, this work advances patient-specific decision support by providing both a robust dataset and an adaptable multimodal AI framework.
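To make the reported metrics concrete, the short sketch below computes an F1-score from confusion counts and expresses the two reported improvements as absolute (percentage-point) and relative gains. Only the before/after figures come from the abstract; the function names and the example confusion counts in the usage note are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = after - before
    relative = absolute / before * 100
    return absolute, relative

# Before/after values (in percent) as reported in the abstract.
REPORTED = {
    "variant classification accuracy": (46.2, 65.9),
    "abnormal tissue detection F1": (43.0, 90.3),
}

if __name__ == "__main__":
    for name, (before, after) in REPORTED.items():
        absolute, relative = gains(before, after)
        print(f"{name}: +{absolute:.1f} points ({relative:.1f}% relative)")
```

As a usage example with made-up counts, `f1_score(tp=8, fp=2, fn=2)` gives precision and recall of 0.8 each, and therefore an F1 of 0.8.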
