Masala-CHAI: A Large-Scale SPICE Netlist Dataset for Analog Circuits by Harnessing AI

Jitendra Bhandari, Vineet Bhat, Yuheng He, Hamed Rahmani, Siddharth Garg, Ramesh Karri

TL;DR

Masala-CHAI tackles automated SPICE netlist generation for analog circuits by building the largest open dataset of labeled schematics and SPICE netlists from textbooks. It introduces a multi-stage pipeline combining YOLOv8-based component detection, Deep Hough Transform-based net detection, targeted prompt tuning, and an LLM-based verification loop, enabling end-to-end netlist generation from schematic images. Fine-tuning LLMs on this dataset within the AnalogCoder framework yields substantial improvements in Pass@1 and broad task coverage, including notable gains on challenging benchmarks. The work provides an open-source resource and a scalable approach to accelerate analog circuit design automation.

Abstract

Masala-CHAI is a fully automated framework leveraging large language models (LLMs) to generate Simulation Program with Integrated Circuit Emphasis (SPICE) netlists. It addresses a long-standing challenge in circuit design automation: automating netlist generation for analog circuits. Automating this workflow could accelerate the creation of fine-tuned LLMs for analog circuit design and verification. In this work, we identify key challenges in automated netlist generation and evaluate the multimodal capabilities of state-of-the-art LLMs, particularly GPT-4, in addressing them. We propose a three-step workflow to overcome existing limitations: labeling analog circuits, prompt tuning, and netlist verification. This approach enables end-to-end SPICE netlist generation from circuit schematic images. We use Masala-CHAI to collect a corpus of 7,500 schematics of varying complexity from 10 textbooks and benchmark various open-source and proprietary LLMs. Models fine-tuned on Masala-CHAI, when used in LLM-agentic frameworks such as AnalogCoder, achieve a notable 46% improvement in Pass@1 scores. We open-source our dataset and code for community-driven development.
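
The abstract reports results as Pass@1 but does not spell out how the score is computed. As a minimal sketch, assuming the commonly used unbiased pass@k estimator from the code-generation literature (the function name `pass_at_k` and the parameters `n`, `c`, and `k` are illustrative and not taken from the paper), the metric for a single task could be computed as follows:

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task.

    n: number of generated samples for the task (assumed name)
    c: number of those samples that pass verification (assumed name)
    k: number of attempts the metric allows (assumed name)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples fail, every draw of k samples contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the paper.
    print(pass_at_k(n=10, c=3, k=1))
```

With k = 1 the estimator reduces to the fraction of passing samples, c / n, so a benchmark-level Pass@1 under this assumption is the mean of that fraction over all tasks.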

Paper Structure

This paper contains 20 sections, 1 equation, 12 figures, 4 tables.

Figures (12)

  • Figure 1: Pass@1 performance of GPT-4o fine-tuned with Masala-CHAI datasets extracted from 1 to 10 textbooks, compared to the no-fine-tuning baseline (task IDs from Table II). Our largest dataset of 7,500 captioned SPICE netlists extracted from 10 textbooks provides more than 40% Pass@1 accuracy even on challenging tasks like telescoping cascode amplifier and bandgap reference circuit generation.
  • Figure 2: GPT-4o responses when asked to list all components in three sample circuit schematics.
  • Figure 3: Prompt tuning guides the GPT model to differentiate between NMOS and PMOS.
  • Figure 4: SPICE netlist generated by GPT-4o. For brevity, we only show the part of the netlist that describes transistors.
  • Figure 5: Dataset preparation flow starting from a PDF document consisting of Schematic and related text. Masala-CHAI extracts the relevant information and generates the SPICE netlist for the schematic.
  • ...and 7 more figures