Multimodal RGB-HSI Feature Fusion with Patient-Aware Incremental Heuristic Meta-Learning for Oral Lesion Classification

Rupam Mukherjee, Rajkumar Daniel, Soujanya Hazra, Shirin Dasgupta, Subhamoy Mandal

TL;DR

This work addresses early oral lesion screening with limited labeled data by proposing a four-class classifier that fuses deep RGB representations, 31-band hyperspectral information reconstructed from RGB, handcrafted spectral-textural descriptors, and demographic metadata. A pathologist-filtered dataset supports reliable processing, and an Incremental Heuristic Meta-Learner (IHML) combines calibrated base models with uncertainty-aware meta-features and patient-wise posterior smoothing to improve robustness on unseen patients. The results indicate that spectral cues and clinical priors substantially boost performance: the method achieves a macro F1 of 66.23% and an AUROC of 84.45% on an unseen test set, outperforming traditional baselines. The multimodal framework is aimed at real-world oral lesion screening, with potential for end-to-end fusion and deployment using actual hyperspectral acquisition in clinical settings.

Abstract

Early detection of oral cancer and potentially malignant disorders is challenging in low-resource settings due to limited annotated data. We present a unified four-class oral lesion classifier that integrates deep RGB embeddings, hyperspectral reconstruction, handcrafted spectral-textural descriptors, and demographic metadata. A pathologist-verified subset of oral cavity images was curated and processed using a fine-tuned ConvNeXt-v2 encoder, followed by RGB-to-HSI reconstruction into 31-band hyperspectral cubes. Haemoglobin-sensitive indices, texture features, and spectral-shape measures were extracted and fused with deep and clinical features. Multiple machine-learning models were assessed with patient-wise validation. We further introduce an incremental heuristic meta-learner (IHML) that combines calibrated base classifiers through probabilistic stacking and patient-level posterior smoothing. On an unseen patient split, the proposed framework achieved a macro F1 of 66.23% and an accuracy of 64.56%. Results demonstrate that hyperspectral reconstruction and uncertainty-aware meta-learning substantially improve robustness for real-world oral lesion screening.
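The abstract names patient-level posterior smoothing but does not give its formula. As a rough illustration of the general idea only, the sketch below blends each image's calibrated class probabilities with the mean probabilities over all images from the same patient. The function name `smooth_posteriors_by_patient`, the blend weight `alpha`, and the toy inputs are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def smooth_posteriors_by_patient(probs, patient_ids, alpha=0.5):
    """Blend per-image posteriors with the per-patient mean posterior.

    probs:       list of per-image class-probability lists (rows sum to 1)
    patient_ids: list of patient identifiers, one per image
    alpha:       hypothetical blend weight (0 = no smoothing,
                 1 = every image takes its patient's mean posterior)
    """
    # Group image indices by patient.
    groups = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        groups[pid].append(idx)

    n_classes = len(probs[0])
    smoothed = [None] * len(probs)
    for indices in groups.values():
        # Mean posterior over all images of this patient.
        mean = [
            sum(probs[i][c] for i in indices) / len(indices)
            for c in range(n_classes)
        ]
        for i in indices:
            blended = [
                (1.0 - alpha) * probs[i][c] + alpha * mean[c]
                for c in range(n_classes)
            ]
            # Renormalise to guard against floating-point drift.
            total = sum(blended)
            smoothed[i] = [v / total for v in blended]
    return smoothed


# Toy example: two images from patient "p1", one from patient "p2".
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.3, 0.5, 0.1, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]
patient_ids = ["p1", "p1", "p2"]
smoothed = smooth_posteriors_by_patient(probs, patient_ids, alpha=0.5)
```

In this toy case the two images from `p1` are pulled toward their shared mean, while the single image from `p2` is unchanged. The actual IHML smoothing may weight images differently, for example by uncertainty.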

Paper Structure

This paper contains 12 sections, 4 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Illustration of the proposed multimodal pipeline, which combines RGB deep embeddings, hyperspectral features, and demographic metadata to predict oral lesions at the patient level using IHML.