FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
Yuqi Jiang, Xudong Lu, Qian Jin, Qi Sun, Hanming Wu, Cheng Zhuo
TL;DR
FabGPT tackles wafer defect knowledge querying by integrating defect detection in SEM imagery with domain-specific Q&A in a domain-adaptive large multimodal framework. Its three-stage pipeline (modal enhancement, pixel-level detection, and Q&A with a modulation-driven prompt system) mitigates modality bias while embedding wafer defect knowledge through corpus training. Empirical results on the SEM-WaD dataset show state-of-the-art defect detection metrics and high Q&A accuracy, underscoring practical benefits for IC manufacturing. This approach provides a blueprint for domain-specific LMMs that balance vision-language reasoning with specialized knowledge, enabling robust defect analysis and actionable process insights.
Abstract
Intelligence is key to advancing integrated circuit (IC) fabrication. Recent breakthroughs in Large Multimodal Models (LMMs) have unlocked extraordinary abilities in understanding images and text, fostering intelligent fabrication. Leveraging the power of LMMs, we introduce FabGPT, a customized IC fabrication large multimodal model for wafer defect knowledge queries. FabGPT demonstrates expertise in detecting defects in Scanning Electron Microscope (SEM) images, performing root cause analysis, and providing expert Q&A on fabrication processes. It matches enhanced multimodal features to automatically detect minute defects against complex wafer backgrounds and to reduce the subjectivity of manual threshold settings. In addition, the proposed modulation module and interactive corpus training strategy embed wafer defect knowledge into the pre-trained model, effectively balancing Q&A queries related to defect knowledge and original knowledge and mitigating modality bias. Experiments on in-house fab data show that FabGPT achieves significant performance improvements in wafer defect detection and knowledge querying.
