QMViT: A Mushroom is worth 16x16 Words

Siddhant Dutta, Hemant Singh, Kalpita Shankhdhar, Sridhar Iyer

TL;DR

The paper addresses the problem of distinguishing edible from poisonous mushrooms with a hybrid quantum-classical approach. It introduces QMViT, a Hybrid Quantum-Classical Vision Transformer that uses variational quantum circuits and angle-encoded data for image-based mushroom classification. The model reports 92.33% accuracy on category classification with 4 qubits and 1 layer, and up to 99.24% edibility accuracy on broader datasets, compared against a classical ViT. The method combines classical ViT principles with quantum self-attention and quanvolutional concepts, and is evaluated against baselines including NuSVC, XGBoost, CNNs, and transfer-learning ViTs. The results suggest a potential quantum advantage on high-dimensional, imbalanced mushroom datasets, and the paper stresses the need for further work on scaling quantum hardware and integrating such models into real-world safety workflows.

Abstract

Consuming poisonous mushrooms can have severe health consequences, including death, and accurately distinguishing edible from toxic mushroom varieties remains a significant challenge for food safety. Reliable identification matters all the more because of the significant demand for mushrooms in people's daily meals and their potential contributions to medical science. This work presents a novel Quantum Vision Transformer architecture that leverages quantum computing to enhance mushroom classification performance. By implementing specialized quantum self-attention mechanisms using Variational Quantum Circuits, the proposed architecture achieved 92.33% accuracy when classifying mushrooms by category and 99.24% accuracy when classifying them by edibility. This demonstrates the success of the proposed architecture in reducing false negatives for toxic mushrooms, thereby supporting food safety. Our research highlights the potential of QMViT for improving mushroom classification as a whole.
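
To make the "angle-encoded data" and "variational quantum circuit" terms concrete, here is a minimal sketch of a 4-qubit, 1-layer circuit simulated in plain Python. It is not the authors' circuit. The gate ordering (RX encoding, one trainable RX per qubit, a ring of CNOTs, Pauli-Z readout) is an assumption, and the names rx, apply_single, apply_cnot, vqc_state and vqc_expvals are hypothetical. Figure 5 also lists an H gate, but this summary does not say where it is placed, so the sketch leaves it out.

```python
import math

N_QUBITS = 4  # qubit count quoted in the TL;DR


def rx(theta):
    """2x2 matrix of the RX rotation gate for angle theta (radians)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def apply_single(state, gate, q, n=N_QUBITS):
    """Apply a 2x2 gate to qubit q; qubit 0 is the most significant bit."""
    bit = 1 << (n - 1 - q)
    new = list(state)
    for i in range(len(state)):
        if not i & bit:
            j = i | bit
            a, b = state[i], state[j]
            new[i] = gate[0][0] * a + gate[0][1] * b
            new[j] = gate[1][0] * a + gate[1][1] * b
    return new


def apply_cnot(state, control, target, n=N_QUBITS):
    """Flip the target qubit on every basis state where the control is 1."""
    cbit = 1 << (n - 1 - control)
    tbit = 1 << (n - 1 - target)
    new = list(state)
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            j = i | tbit
            new[i], new[j] = state[j], state[i]
    return new


def vqc_state(features, weights, n=N_QUBITS):
    """Return the statevector after encoding plus one variational layer.

    The layer layout is an assumption, not taken from the paper.
    """
    state = [0j] * (2 ** n)
    state[0] = 1 + 0j  # start in |0000>
    for q in range(n):
        # Angle encoding: each classical feature becomes a rotation angle.
        state = apply_single(state, rx(features[q]), q, n)
    for q in range(n):
        # Trainable rotation, one parameter per qubit.
        state = apply_single(state, rx(weights[q]), q, n)
    for q in range(n):
        # Ring of CNOTs to entangle neighbouring qubits.
        state = apply_cnot(state, q, (q + 1) % n, n)
    return state


def vqc_expvals(features, weights, n=N_QUBITS):
    """Pauli-Z expectation value of each qubit, used as the circuit output."""
    state = vqc_state(features, weights, n)
    out = []
    for q in range(n):
        bit = 1 << (n - 1 - q)
        out.append(sum((-1 if i & bit else 1) * abs(a) ** 2
                       for i, a in enumerate(state)))
    return out


if __name__ == "__main__":
    # Four hypothetical feature values, e.g. taken from one image patch.
    print(vqc_expvals([0.1, 0.5, 1.0, 1.5], [0.0, 0.0, 0.0, 0.0]))
```

With every angle set to zero the register stays in |0000>, so each returned expectation value is 1. In a hybrid model the weights would be updated by a classical optimizer, and the four outputs would feed the classical layers that follow.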

Paper Structure

This paper contains 30 sections, 11 equations, 8 figures, 3 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Distribution of labels within the multi-class dataset
  • Figure 2: Decision Boundary of SVC
  • Figure 3: Convolution Operation
  • Figure 4: Quanvolutional Neural Network
  • Figure 5: The Quantum circuit utilized in the QMViT architecture constructed with H, RX, and CNOT gates
  • ...and 3 more figures