In-Context-Learning-Assisted Quality Assessment Vision-Language Models for Metal Additive Manufacturing
Qiaojie Zheng, Jiucai Zhang, Xiaoli Zhang
TL;DR
This work applies in-context learning (ICL) with vision-language models (VLMs) to quality assessment (QA) of metal additive manufacturing prints without large application-specific datasets. The study evaluates six context-sampling strategies across two VLMs (Gemini 2.5 Flash and Gemma 3:27b) and introduces knowledge relevance, rationale validity, and conclusion correctness as evaluation metrics. It demonstrates that ICL can achieve accuracy comparable to traditional machine learning models while producing human-interpretable rationales. The results reveal model-dependent preferences for context: larger models benefit from many-shot, diverse prompts, whereas smaller models perform best with balanced, diverse few-shot prompts. This approach reduces the data collection burden, improves decision transparency, and provides a framework for evaluating rationale quality in manufacturing QA.
Abstract
Vision-based quality assessment in additive manufacturing often requires dedicated machine learning models and application-specific datasets. However, data collection and model training can be expensive and time-consuming. In this paper, we leverage the reasoning capabilities of vision-language models (VLMs) to assess the quality of printed parts, and we use in-context learning (ICL) to provide VLMs with the necessary application-specific knowledge and demonstration samples. This approach removes the need for large application-specific training datasets. We explored different ICL sampling strategies to find the configuration that makes the best use of limited samples. We evaluated these strategies on two VLMs, Gemini 2.5 Flash and Gemma 3:27b, on quality assessment tasks in wire-laser directed energy deposition processes. The results show that ICL-assisted VLMs can reach quality classification accuracies similar to those of traditional machine learning models while requiring only a minimal number of samples. In addition, unlike traditional classification models that lack transparency, VLMs can generate human-interpretable rationales to enhance trust. Because no established metrics exist for evaluating this interpretability in manufacturing applications, we propose two metrics, knowledge relevance and rationale validity, to evaluate the quality of VLMs' supporting rationales. Our results show that ICL-assisted VLMs can address application-specific tasks with limited data, achieving relatively high accuracy while also providing valid supporting rationales for improved decision transparency.
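To make the idea of a context-sampling strategy concrete, the following is a minimal sketch of one possible strategy, class-balanced few-shot sampling, followed by prompt assembly. It is an illustration under stated assumptions and not the authors' implementation: the function names `sample_balanced` and `build_prompt`, the labels `good` and `defective`, the file names, and the prompt wording are all hypothetical, and images are represented by file-name placeholders rather than actual image inputs to a VLM.

```python
import random
from collections import defaultdict


def sample_balanced(pool, k_per_class, seed=0):
    """Draw k_per_class demonstrations from each label in the pool.

    `pool` is a list of dicts with hypothetical keys "image" and "label".
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_label = defaultdict(list)
    for item in pool:
        by_label[item["label"]].append(item)

    chosen = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < k_per_class:
            raise ValueError(f"not enough samples for label {label!r}")
        chosen.extend(rng.sample(group, k_per_class))

    rng.shuffle(chosen)  # avoid presenting all samples of one label first
    return chosen


def build_prompt(knowledge, demos, query_image):
    """Assemble a text prompt: process knowledge, demonstrations, then the query."""
    lines = ["Process knowledge:", knowledge, "", "Demonstrations:"]
    for i, demo in enumerate(demos, start=1):
        lines.append(f"Example {i}: image={demo['image']} -> quality={demo['label']}")
    lines += [
        "",
        f"Query: image={query_image}",
        "Answer with a quality label and a short supporting rationale.",
    ]
    return "\n".join(lines)


# Hypothetical demonstration pool; the labels and file names are placeholders.
pool = [
    {"image": "print_01.png", "label": "good"},
    {"image": "print_02.png", "label": "good"},
    {"image": "print_03.png", "label": "good"},
    {"image": "print_04.png", "label": "defective"},
    {"image": "print_05.png", "label": "defective"},
    {"image": "print_06.png", "label": "defective"},
]

demos = sample_balanced(pool, k_per_class=2)
prompt = build_prompt("Placeholder process notes.", demos, "print_query.png")
```

In this sketch, changing `k_per_class` moves between few-shot and many-shot contexts, and replacing `sample_balanced` with a different selection function would correspond to trying a different sampling strategy.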
