ORAssistant: A Custom RAG-based Conversational Assistant for OpenROAD
Aviral Kaintura, Palaniappan R, Shui Song Luar, Indira Iyer Almeida
TL;DR
The paper tackles the challenge of making complex open-source EDA workflows, such as OpenROAD, more approachable by combining a Retrieval-Augmented Generation (RAG) framework with domain-specific retriever tools. It constructs a knowledge base from OpenROAD documentation, GitHub discussions, and other open sources, then ingests and indexes this data using SBERT embeddings stored in a FAISS vector index for semantic search, alongside BM25 for exact term matching. A modular, domain-aware retriever architecture retrieves relevant documents, and a two-stage LLM prompting flow generates citation-rich responses, enabling context-aware conversations across installation, setup, and flow execution. Evaluation using GPTScore on the EDA Corpus and HumanEval shows that ORAssistant outperforms base pre-trained LLMs (e.g., GPT-4o and Gemini 1.5 Flash) in accuracy, precision, recall, and overall LLMScore, with faster response times, underscoring the practicality of tool-based RAG for open-source EDA ecosystems. The work demonstrates a scalable path to integrating additional tools and data sources, potentially broadening access to ASIC design workflows and accelerating learning for users at all levels of expertise.
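The hybrid retrieval idea summarized above, lexical BM25 matching combined with embedding similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the corpus, the function names, the fusion weight `alpha`, and the max-normalization step are all assumptions made for illustration, and plain term-count vectors with cosine similarity stand in for SBERT embeddings and a FAISS index so that the example runs with the standard library only.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; the real knowledge base is built from
# OpenROAD documentation and GitHub resources.
DOCS = [
    "install openroad from source using the build script",
    "run global placement with the global_placement command",
    "openroad flow scripts run the flow from rtl to gdsii",
]


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (lexical match)."""
    toks = [tokenize(d) for d in docs]
    n = len(toks)
    avgdl = sum(len(t) for t in toks) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in toks for term in set(t))
    scores = []
    for t in toks:
        tf = Counter(t)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def vector_scores(query, docs):
    """Stand-in for SBERT + FAISS: cosine similarity of term-count vectors."""
    q = Counter(tokenize(query))
    return [cosine(q, Counter(tokenize(d))) for d in docs]


def normalize(scores):
    """Scale scores so the best one is 1.0 (all zeros stay zero)."""
    hi = max(scores)
    return [s / hi if hi > 0 else 0.0 for s in scores]


def hybrid_retrieve(query, docs, alpha=0.5, top_k=2):
    """Fuse lexical and vector scores with an assumed weight alpha."""
    lex = normalize(bm25_scores(query, docs))
    vec = normalize(vector_scores(query, docs))
    fused = [alpha * l + (1 - alpha) * v for l, v in zip(lex, vec)]
    ranked = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i], fused[i]) for i in ranked[:top_k]]


results = hybrid_retrieve("install openroad", DOCS)
```

In a fuller pipeline, the retrieved passages would be passed to the LLM as context together with their sources. The linear score fusion used here is only one of several plausible ways to combine the two retrievers.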
Abstract
Open-source Electronic Design Automation (EDA) tools are rapidly transforming chip design by addressing key barriers of commercial EDA tools such as complexity, cost, and access. Recent advancements in Large Language Models (LLMs) have further enhanced efficiency in chip design by providing user assistance across a range of tasks like setup, decision-making, and flow automation. This paper introduces ORAssistant, a conversational assistant for OpenROAD, based on Retrieval-Augmented Generation (RAG). ORAssistant aims to improve the user experience for the OpenROAD flow, from RTL to GDSII, by providing context-specific responses in prose format to common user queries, including installation, command usage, flow setup, and execution. Currently, ORAssistant integrates OpenROAD, OpenROAD-flow-scripts, Yosys, OpenSTA, and KLayout. The data model is built from publicly available documentation and GitHub resources. The proposed architecture is scalable, supporting extensions to other open-source tools, operating modes, and LLMs. We use Google Gemini as the base LLM to build and test ORAssistant. Early evaluation results of the RAG-based model show notable improvements in performance and accuracy compared to non-fine-tuned LLMs.
