
QuPAINT: Physics-Aware Instruction Tuning Approach to Quantum Material Discovery

Xuan-Bac Nguyen, Hoang-Quan Nguyen, Sankalp Pandey, Tim Faltermeier, Nicholas Borys, Hugh Churchill, Khoa Luu

TL;DR

This work presents a physics-aware multimodal framework for characterizing two-dimensional quantum materials from optical microscopy images. It proposes Physics-Aware Instruction Tuning (QuPAINT), a multimodal architecture that incorporates a Physics-Informed Attention module to fuse visual embeddings with optical priors, enabling more robust and discriminative flake representations.

Abstract

Characterizing two-dimensional quantum materials from optical microscopy images is challenging due to the subtle layer-dependent contrast, limited labeled data, and significant variation across laboratories and imaging setups. Existing vision models struggle in this domain since they lack physical priors and cannot generalize to new materials or hardware conditions. This work presents a new physics-aware multimodal framework that addresses these limitations from both the data and model perspectives. We first present Synthia, a physics-based synthetic data generator that simulates realistic optical responses of quantum material flakes under thin-film interference. Synthia produces diverse and high-quality samples, helping reduce the dependence on expert manual annotation. We introduce QMat-Instruct, the first large-scale instruction dataset for quantum materials, comprising multimodal, physics-informed question-answer pairs designed to teach Multimodal Large Language Models (MLLMs) to understand the appearance and thickness of flakes. Then, we propose Physics-Aware Instruction Tuning (QuPAINT), a multimodal architecture that incorporates a Physics-Informed Attention module to fuse visual embeddings with optical priors, enabling more robust and discriminative flake representations. Finally, we establish QF-Bench, a comprehensive benchmark spanning multiple materials, substrates, and imaging settings, offering standardized protocols for fair and reproducible evaluation.
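The thin-film interference that makes flakes visible can be illustrated with a short calculation. The sketch below is not Synthia's implementation; it is a minimal normal-incidence Fresnel model of an air / flake / SiO$_2$ / Si stack at a single wavelength. The function names (`stack_reflectance`, `flake_contrast`), the refractive indices, and the per-layer thickness are illustrative assumptions (rough graphene-like values near 550 nm), not values taken from the paper.

```python
import cmath
import math


def fresnel_r(n_a, n_b):
    """Normal-incidence Fresnel amplitude reflection going from medium a into b."""
    return (n_a - n_b) / (n_a + n_b)


def stack_reflectance(indices, thicknesses_nm, wavelength_nm):
    """Intensity reflectance of a planar stack at normal incidence.

    indices: complex refractive indices, ordered from the incident medium
             (first) down to the semi-infinite substrate (last).
    thicknesses_nm: thickness of each finite layer in between,
                    so len(thicknesses_nm) == len(indices) - 2.
    """
    if len(thicknesses_nm) != len(indices) - 2:
        raise ValueError("need one thickness per finite layer")
    # Start at the deepest interface, then fold each finite layer in, moving up.
    r = fresnel_r(indices[-2], indices[-1])
    for j in range(len(indices) - 2, 0, -1):
        beta = 2 * math.pi * indices[j] * thicknesses_nm[j - 1] / wavelength_nm
        phase = cmath.exp(2j * beta)  # round trip through layer j
        r_top = fresnel_r(indices[j - 1], indices[j])
        r = (r_top + r * phase) / (1 + r_top * r * phase)
    return abs(r) ** 2


# Illustrative optical constants (assumed, roughly valid near 550 nm).
N_AIR = 1.0
N_FLAKE = 2.6 + 1.3j       # graphene-like; absorbing, so positive imaginary part
N_SIO2 = 1.46
N_SI = 4.15 + 0.04j
LAYER_THICKNESS_NM = 0.335  # assumed thickness of one atomic layer


def flake_contrast(n_layers, oxide_nm=300.0, wavelength_nm=550.0):
    """Contrast (R_substrate - R_flake) / R_substrate for an n-layer flake."""
    r_bare = stack_reflectance([N_AIR, N_SIO2, N_SI], [oxide_nm], wavelength_nm)
    r_flake = stack_reflectance(
        [N_AIR, N_FLAKE, N_SIO2, N_SI],
        [n_layers * LAYER_THICKNESS_NM, oxide_nm],
        wavelength_nm,
    )
    return (r_bare - r_flake) / r_bare


for layers in (1, 2, 3):
    print(layers, "layer(s): contrast =", round(flake_contrast(layers), 4))
```

Printing the contrast for one to three layers shows how small the layer-to-layer differences can be at a single wavelength, which is the difficulty the abstract describes. A realistic generator would also need to integrate over the illumination spectrum and the camera response; this sketch leaves both out.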

Paper Structure (27 sections, 23 equations, 16 figures, 5 tables, 2 algorithms)

Figures (16)

  • Figure 1: Overview of our proposed system. Synthia generates physics-based synthetic flakes, and QMat-Instruct provides multimodal, physics-aware supervision. Together, they enable the new Physics-Aware Instruction Tuning for Quantum Material Discovery (QuPAINT) framework to learn robust and interpretable representations for quantum material characterization. (Best view in colors)
  • Figure 2: Overview of the real data collection pipeline and challenges in quantum flake characterization. (a) Microscopy enables fast collection of thousands of flakes but provides no thickness information, while AFM offers accurate layer measurements but is slow and not scalable. (b) The required manual workflow is labor-intensive, involving repeated chip transfers and flake verification. (c) Real samples from multiple materials (MoS$_2$, h-BN, Graphene, WTe$_2$) show minimal visual differences between 1, 2, and 3 layers, illustrating the difficulty of determining thickness from raw optical images. (Best view in colors)
  • Figure 3: (a) Optical thin-film model used in Synthia, capturing layer-dependent interference between the material (e.g., MoS$_2$), SiO$_2$, and Si substrate under microscope illumination. (b) Comparison of synthetic flake quality, showing that Synthia produces more realistic color contrast, edge appearance, and flake visibility across materials and thicknesses than previous approaches. (Best view in colors)
  • Figure 4: Overview of the proposed QuPAINT framework. A Physics-Informed Attention (PIA) module extracts optical cues from the microscopy image, which are fused with ViT visual embeddings and text tokens in a multimodal large-language model for quantum material understanding.
  • Figure 5: Physics-Informed Attention Map. Best view in colors.
  • ...and 11 more figures
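
The Figure 4 caption describes optical cues being fused with ViT visual embeddings. The captions do not give the exact form of the Physics-Informed Attention module, so the sketch below shows only one generic way such a fusion could work: scaled dot-product cross-attention in which each visual token attends over a set of optical-prior tokens and adds the result back as a residual. The function name `cross_attention_fuse`, the token shapes, and the absence of learned projection matrices are all simplifying assumptions, not the paper's design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(visual_tokens, prior_tokens):
    """Fuse visual tokens with optical-prior tokens (hypothetical helper).

    Each visual token is a query; the prior tokens serve as both keys and
    values. Output token = visual token + attention-weighted mix of priors.
    No learned projections are used, to keep the sketch minimal.
    """
    dim = len(visual_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in visual_tokens:
        weights = softmax([dot(query, key) * scale for key in prior_tokens])
        context = [
            sum(w * value[i] for w, value in zip(weights, prior_tokens))
            for i in range(dim)
        ]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy example: two visual tokens and two optical-prior tokens of dimension 3.
visual = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
priors = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
for token in cross_attention_fuse(visual, priors):
    print([round(x, 3) for x in token])
```

In a trained model the queries, keys, and values would pass through learned linear maps, and the fused tokens would be fed to the language model together with the text tokens.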