Enhancing Brain Tumor Classification Using Vision Transformers with Colormap-Based Feature Representation on BRISC2025 Dataset

Faisal Ahmed

Abstract

Accurate classification of brain tumors from magnetic resonance imaging (MRI) plays a critical role in early diagnosis and effective treatment planning. In this study, we propose a deep learning framework based on Vision Transformers (ViT) enhanced with colormap-based feature representation to improve multi-class brain tumor classification performance. The proposed approach leverages the ability of transformer architectures to capture long-range dependencies while incorporating color mapping techniques to emphasize important structural and intensity variations within MRI scans. Experiments are conducted on the BRISC2025 dataset, which includes four classes: glioma, meningioma, pituitary tumor, and non-tumor cases. The model is trained and evaluated using standard performance metrics such as accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The proposed method achieves a classification accuracy of 98.90%, outperforming baseline convolutional neural network models including ResNet50, ResNet101, and EfficientNetB2. In addition, the model demonstrates strong generalization capability with an AUC of 99.97%, indicating high discriminative performance across all classes. These results highlight the effectiveness of combining Vision Transformers with colormap-based feature enhancement for accurate and robust brain tumor classification and suggest strong potential for clinical decision support applications.

Paper Structure (15 sections, 17 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: Representative MRI samples from the BRISC2025 dataset showing the four classification categories: non-tumor, pituitary tumor, meningioma tumor, and glioma tumor. All images are resized to a uniform spatial resolution for consistent model input.
  • Figure 2: Proposed colormap-enhanced Vision Transformer (ViT) framework for brain tumor classification. The pipeline begins with a grayscale MRI image, which is transformed into a pseudo-color representation to enhance structural patterns and intensity variations. The color-enhanced image is resized to a fixed resolution and divided into non-overlapping patches. Each patch is flattened and linearly projected into a latent embedding space, followed by the addition of positional embeddings and a learnable classification token. The resulting sequence of embedded patches is processed through multiple Transformer encoder layers consisting of multi-head self-attention and feed-forward networks. The final representation of the classification token is passed to a multilayer perceptron (MLP) head to generate the probability distribution over the four tumor classes.
  • Figure 3: Evaluation of TDA+DenseNet121 on the OASIS-1 dataset. (a) One-vs-rest ROC curves illustrating discrimination performance across the four Alzheimer’s disease classes. (b) Confusion matrix showing correct and misclassified instances for each class.
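The input stages described in the Figure 2 caption (pseudo-color mapping, resizing, patch splitting, linear projection, classification token and positional embeddings) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the blue-green-red ramp stands in for whatever colormap the authors use, the 224-pixel resolution, 16-pixel patch size, and 768-dimensional embedding are assumed ViT-Base conventions, and the random matrices stand in for parameters that would be learned during training. All function names are hypothetical.

```python
import numpy as np


def apply_colormap(gray: np.ndarray) -> np.ndarray:
    """Map a grayscale image (H, W) to pseudo-color (H, W, 3) with values in [0, 1].

    Assumption: a simple blue -> green -> red ramp, used only for illustration.
    """
    g = gray.astype(np.float64)
    span = g.max() - g.min()
    t = (g - g.min()) / span if span > 0 else np.zeros_like(g)
    red = np.clip(2.0 * t - 1.0, 0.0, 1.0)
    blue = np.clip(1.0 - 2.0 * t, 0.0, 1.0)
    green = 1.0 - red - blue
    return np.stack([red, green, blue], axis=-1)


def resize_nearest(img: np.ndarray, size: int) -> np.ndarray:
    """Resize (H, W, C) to (size, size, C) by nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def patchify(img: np.ndarray, patch: int) -> np.ndarray:
    """Split (H, W, C) into non-overlapping patches, each flattened to one row."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = img.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, patch * patch * c)


def embed(patches: np.ndarray, dim: int, rng: np.random.Generator) -> np.ndarray:
    """Project patches, prepend a classification token, add positional embeddings.

    Assumption: random values stand in for the learned projection, token,
    and positional parameters.
    """
    n, d = patches.shape
    w_proj = rng.normal(0.0, 0.02, (d, dim))
    cls_token = rng.normal(0.0, 0.02, (1, dim))
    pos_embed = rng.normal(0.0, 0.02, (n + 1, dim))
    tokens = np.concatenate([cls_token, patches @ w_proj], axis=0)
    return tokens + pos_embed


rng = np.random.default_rng(0)
# Synthetic stand-in for a grayscale MRI slice (not BRISC2025 data).
mri = rng.integers(0, 256, size=(240, 240)).astype(np.uint8)
rgb = resize_nearest(apply_colormap(mri), 224)
patches = patchify(rgb, 16)
tokens = embed(patches, 768, rng)
print(rgb.shape, patches.shape, tokens.shape)
# prints: (224, 224, 3) (196, 768) (197, 768)
```

Under these assumed sizes, a 224 x 224 image yields 14 x 14 = 196 patches, and the classification token brings the sequence to 197 tokens. The Transformer encoder layers and the MLP head would then operate on this sequence; they are omitted here.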