PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design

Yuchao Wu, Xiaofei Yu, Hao Chen, Yang Luo, Yeyu Tong, Yuzhe Ma

TL;DR

PICBench addresses the lack of a reliable evaluation framework for LLM-assisted photonic integrated circuit design by introducing a 24-problem, open-source benchmark evaluated with a SAX-based simulator. It presents an end-to-end workflow that combines natural-language prompts, automatic syntax and functionality checks, and an error-classification feedback loop to iteratively refine netlist generation. Across multiple commercial LLMs, the study shows that error feedback and restriction-driven prompting significantly improve design correctness, illustrating both the promise and remaining challenges of automating PIC design with LLMs. The resource enables standardized, repeatable assessment and fosters progress in photonic design automation using language models.

Abstract

While large language models (LLMs) have shown remarkable potential in automating various tasks in digital chip design, the field of Photonic Integrated Circuits (PICs), a promising solution to advanced chip designs, remains relatively unexplored in this context. The design of PICs is time-consuming and prone to errors due to the extensive and repetitive nature of the code involved in photonic chip design. In this paper, we introduce PICBench, the first benchmarking and evaluation framework specifically designed to automate PIC design generation using LLMs, where the generated output takes the form of a netlist. Our benchmark consists of dozens of meticulously crafted PIC design problems, spanning from fundamental device designs to more complex circuit-level designs. It automatically evaluates both the syntax and functionality of generated PIC designs by comparing simulation outputs with expert-written solutions, leveraging an open-source simulator. We evaluate a range of existing LLMs and also conduct comparative tests on various prompt engineering techniques to enhance LLM performance in automated PIC design. The results reveal the challenges and potential of LLMs in the PIC design domain, offering insights into the key areas that require further research and development to optimize automation in this field. Our benchmark and evaluation code are available at https://github.com/PICDA/PICBench.
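
The evaluation flow summarized above (a syntax check, a functional comparison against an expert-written reference, and error feedback to the model) can be sketched roughly as follows. This is a minimal illustration, not PICBench's actual code: `generate` and `simulate` are hypothetical callables standing in for an LLM call and a circuit simulator, the `Response` type and the tolerance are assumptions, and the two-way error split is a simplification of whatever classification the benchmark uses.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a response maps a (input port, output port) pair
# to complex transmission samples over a wavelength sweep.
Response = Dict[Tuple[str, str], List[complex]]


def functionally_equal(candidate: Response, reference: Response, tol: float = 1e-6) -> bool:
    """True if both responses cover the same port pairs and agree within tol."""
    if candidate.keys() != reference.keys():
        return False
    for key, ref_samples in reference.items():
        cand_samples = candidate[key]
        if len(cand_samples) != len(ref_samples):
            return False
        if any(abs(c - r) > tol for c, r in zip(cand_samples, ref_samples)):
            return False
    return True


def evaluate_with_feedback(
    generate: Callable[[str, Optional[str]], str],  # assumed: (problem, feedback) -> netlist
    simulate: Callable[[str], Response],            # assumed: raises on an invalid netlist
    problem: str,
    reference: Response,
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Return (status, rounds_used); status is 'pass', 'syntax_error', or 'functional_error'."""
    feedback: Optional[str] = None
    status = "syntax_error"
    for round_idx in range(1, max_rounds + 1):
        netlist = generate(problem, feedback)
        try:
            response = simulate(netlist)
        except Exception as exc:
            # The simulator rejected the netlist: treat it as a syntax failure.
            status, feedback = "syntax_error", f"Syntax error: {exc}"
            continue
        if functionally_equal(response, reference):
            return "pass", round_idx
        status, feedback = "functional_error", "Simulated response differs from the reference."
    return status, max_rounds
```

Passing the previous round's error message back into `generate` is what makes this a feedback loop; setting `max_rounds=1` reduces it to a single-shot evaluation.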

Paper Structure

This paper contains 16 sections, 1 equation, 4 figures, 4 tables.

Figures (4)

  • Figure 1: PICBench framework for automated design generation and evaluation.
  • Figure 2: Example of problem description.
  • Figure 3: System prompt template for code generation.
  • Figure 4: An example of solving MZI_ps by GPT o1-mini with feedback.