Chat to Chip: Large Language Model Based Design of Arbitrarily Shaped Metasurfaces
Huanshu Zhang, Lei Kang, Sawyer D. Campbell, Douglas H. Werner
TL;DR
The paper addresses the computational bottleneck in designing arbitrarily shaped metasurfaces by introducing a chat-to-chip workflow that fine-tunes a one-dimensional token-wise LLM (Meta-Llama-3.1-8B-Instruct) with LoRA on a dataset of geometry-spectrum pairs. It demonstrates that forward predictions of transmission spectra can be obtained with high fidelity and substantial speedups over full-wave solvers, while benchmarking across open-weight LLMs identifies practical, cost-efficient models for rapid prototyping. For inverse design, the approach exploits the stochasticity of LLMs to generate diverse unit-cell geometries that realize target spectra with very low error, outperforming traditional tandem networks. Collectively, the work offers a turnkey, code-free path to data-driven nanophotonics and links natural language prompts to electromagnetic modeling to accelerate metasurface exploration.
Abstract
Traditional metasurface design is limited by the computational cost of full-wave simulations, which prevents thorough exploration of complex configurations. Data-driven approaches have emerged as a solution to this bottleneck, replacing costly simulations with rapid neural network evaluations and enabling near-instant design of meta-atoms. Despite these advances, implementing a new optical function still requires building and training a task-specific network, along with exhaustive searches for suitable architectures and hyperparameters. Pre-trained large language models (LLMs), by contrast, sidestep this laborious process with a simple fine-tuning technique. However, applying LLMs to the design of nanophotonic devices, particularly arbitrarily shaped metasurfaces, is still in its early stages, as such tasks often require graphical networks. Here, we show that an LLM, fed with descriptive inputs of arbitrarily shaped metasurface geometries, can learn the physical relationships needed for spectral prediction and inverse design. We further benchmark a range of open-weight LLMs and identify relationships between accuracy and model size at the billion-parameter level. We demonstrate that 1-D token-wise LLMs provide a practical tool for designing 2-D arbitrarily shaped metasurfaces. By linking natural-language interaction to electromagnetic modelling, this "chat-to-chip" workflow represents a step toward more user-friendly data-driven nanophotonics.
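To make the idea of feeding a 2-D geometry to a 1-D token-wise model concrete, the following is a minimal sketch of how one geometry-spectrum pair might be serialized into text for fine-tuning. It is an illustration under stated assumptions, not the encoding used in the paper: the binary pixel-grid representation, the prompt wording, the three-decimal spectrum format, and the helper names grid_to_text, spectrum_to_text, and make_example are all hypothetical.

```python
import json

# Hypothetical helpers: the paper's actual text encoding is not given in this section.

def grid_to_text(grid):
    """Flatten a 2-D binary pixel grid into one string, one row per line.

    Assumes each pixel is 0 (empty) or 1 (material).
    """
    return "\n".join("".join("1" if px else "0" for px in row) for row in grid)


def spectrum_to_text(spectrum, decimals=3):
    """Write transmission values as fixed-precision numbers separated by spaces."""
    return " ".join(f"{value:.{decimals}f}" for value in spectrum)


def make_example(grid, spectrum, task="forward"):
    """Build one prompt/completion record.

    task="forward": geometry -> spectrum (spectral prediction).
    task="inverse": spectrum -> geometry (inverse design).
    """
    geometry_text = grid_to_text(grid)
    spectrum_text = spectrum_to_text(spectrum)
    if task == "forward":
        prompt = f"Geometry:\n{geometry_text}\nTransmission spectrum:"
        completion = spectrum_text
    elif task == "inverse":
        prompt = f"Transmission spectrum: {spectrum_text}\nGeometry:"
        completion = geometry_text
    else:
        raise ValueError(f"unknown task: {task!r}")
    return {"prompt": prompt, "completion": completion}


if __name__ == "__main__":
    # Toy 3x3 unit cell and a 4-point spectrum (made-up numbers for illustration).
    toy_grid = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    toy_spectrum = [0.91, 0.5, 0.12345, 0.8]
    for task in ("forward", "inverse"):
        # One JSON object per line is a common layout for fine-tuning datasets.
        print(json.dumps(make_example(toy_grid, toy_spectrum, task=task)))
```

Records in this form could then be passed to a standard LoRA fine-tuning pipeline. Because the same pair yields both a forward and an inverse record, one dataset can serve both tasks described above.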
