GLLM: Self-Corrective G-Code Generation using Large Language Models with User Feedback
Mohamed Abdelaal, Samuel Lokadjaja, Gilbert Engert
TL;DR
This work tackles the challenge of converting natural language descriptions into CNC G-code by introducing GLLM, an LLM-based pipeline that combines domain-specific fine-tuning, retrieval-augmented guidance, and a self-corrective generation loop. Core contributions include a prompting workflow built around parameter extraction, a multi-stage validation framework that applies syntactic checks and semantic checks (the latter based on Hausdorff distance), and postprocessing with parameter alignment and visualization. The system fine-tunes a StarCoder-3B backbone using parameter-efficient fine-tuning (PEFT) and mixed-precision training, and uses retrieval-augmented generation (RAG) to pull in CNC-domain knowledge that refines the outputs. Experimental results across six geometry tasks show that structured prompts and self-correction enable open-source models to approach the performance of proprietary systems, underscoring the potential to democratize CNC programming and accelerate design-to-manufacture cycles.
Abstract
This paper introduces GLLM, a tool that leverages Large Language Models (LLMs) to automatically generate G-code from natural language instructions for Computer Numerical Control (CNC) machining. GLLM addresses the challenges of manual G-code writing by bridging the gap between human-readable task descriptions and machine-executable code. The system incorporates a fine-tuned StarCoder-3B model, enhanced with domain-specific training data and a Retrieval-Augmented Generation (RAG) mechanism. GLLM employs advanced prompting strategies and a novel self-corrective code generation approach to ensure both syntactic and semantic correctness of the generated G-code. The architecture comprises several validation mechanisms: syntax checks, G-code-specific verifications, and functional correctness evaluations using Hausdorff distance. By combining these techniques, GLLM aims to democratize CNC programming, making it more accessible to users without extensive programming experience while maintaining high accuracy and reliability in G-code generation.
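To make the Hausdorff-based functional check concrete, the following is a minimal sketch of how a generated toolpath might be compared against a reference one. The abstract does not specify how GLLM computes the distance, so everything here is an assumption for illustration: the helper names `parse_xy_path` and `hausdorff` are hypothetical, only linear G0/G1 moves in absolute XY coordinates are handled, and only move endpoints are compared.

```python
import math
import re

# Hypothetical helpers for illustration; not taken from the GLLM code base.
MOVE = re.compile(r"\bG0*[01](?![0-9])")          # G0, G00, G1, G01 only
COORD = re.compile(r"([XY])\s*([-+]?\d*\.?\d+)")  # X/Y words with a number


def parse_xy_path(gcode):
    """Return the XY endpoints of G0/G1 moves (absolute mode assumed)."""
    x, y = 0.0, 0.0
    points = []
    for raw in gcode.splitlines():
        line = raw.split(";", 1)[0].strip().upper()  # drop ';' comments
        if not MOVE.search(line):
            continue
        for axis, value in COORD.findall(line):
            if axis == "X":
                x = float(value)
            else:
                y = float(value)
        points.append((x, y))
    return points


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point lists."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


reference = """
G0 X0 Y0
G1 X10 Y0
G1 X10 Y10
G1 X0 Y10
G1 X0 Y0
"""

candidate = """
G0 X0 Y0
G1 X10 Y0
G1 X10 Y12
G1 X0 Y12
G1 X0 Y0
"""

distance = hausdorff(parse_xy_path(reference), parse_xy_path(candidate))
print(distance)  # 2.0
```

A small distance suggests the generated path traces roughly the same shape as the reference, and a threshold on it could decide whether another self-correction round is triggered. Comparing only endpoints is coarse: a more faithful check would sample points along each segment and also handle arcs (G2/G3) and relative positioning.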
