Customized Retrieval Augmented Generation and Benchmarking for EDA Tool Documentation QA
Yuan Pu, Zhuolun He, Tairu Qiu, Haoyuan Wu, Bei Yu
TL;DR
This work addresses the challenge of applying generic retrieval augmented generation to knowledge-intensive EDA tool documentation by proposing RAG-EDA, a domain-tailored pipeline with three specialized components: domain-customized embedding via contrastive learning, a contrastively fine-tuned reranker, and a two-stage domain-specific LLM generator. It introduces ORD-QA, a 90-question, OpenROAD-based benchmark, to rigorously evaluate retrieval, reranking, and generation in EDA contexts and demonstrates superior performance over state-of-the-art baselines on ORD-QA and a commercial tool. The approach combines hybrid lexical-semantic retrieval, GPT-4 guided reranker supervision, and careful domain pre-training and instruction tuning to produce accurate, domain-consistent QA outputs. The work provides concrete open-source resources (ORD-QA and training data) that enable reproducibility and future research in EDA tool documentation QA, with practical implications for reducing manual support costs in EDA workflows.
Abstract
Retrieval augmented generation (RAG) enhances the accuracy and reliability of generative AI models by sourcing factual information from external databases, and it is extensively employed in document-grounded question-answering (QA) tasks. Off-the-shelf RAG flows are well pretrained on general-purpose documents, yet they encounter significant challenges when applied to knowledge-intensive vertical domains such as electronic design automation (EDA). This paper addresses this issue by proposing a customized RAG framework along with three domain-specific techniques for EDA tool documentation QA: a contrastive learning scheme for fine-tuning the text embedding model, a reranker distilled from a proprietary LLM, and a generative LLM fine-tuned on a high-quality domain corpus. Furthermore, we have developed and released ORD-QA, a documentation QA evaluation benchmark for OpenROAD, an advanced RTL-to-GDSII design platform. Experimental results demonstrate that our proposed RAG flow and techniques achieve superior performance on ORD-QA as well as on a commercial tool, compared with state-of-the-art methods. The ORD-QA benchmark and the training dataset for our customized RAG flow are open-sourced at https://github.com/lesliepy99/RAG-EDA.
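
As a rough illustration of the hybrid lexical-semantic retrieval step mentioned in the TL;DR, the sketch below merges two ranked candidate lists before they would be passed to a reranker. The summary above does not state how the two retrievers are combined, so reciprocal rank fusion is used here purely as an assumed stand-in; the function name, the constant `k=60`, and the document ids are all hypothetical and are not taken from the RAG-EDA code base.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Assumption: this is a generic fusion rule, not the paper's method.
    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a lexical retriever and an embedding-based retriever.
lexical_ranking = ["doc_a", "doc_b", "doc_c"]
semantic_ranking = ["doc_b", "doc_d", "doc_a"]

# Documents found by both retrievers rise to the top of the fused list.
fused = reciprocal_rank_fusion([lexical_ranking, semantic_ranking])
print(fused)
```

In a flow like the one described, the fused candidates would then be rescored by the reranker, and only the top few passages would be given to the generator as context.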
