Table of Contents
Fetching ...

Learning Electromagnetic Metamaterial Physics With ChatGPT

Darui Lu, Yang Deng, Jordan M. Malof, Willie J. Padilla

TL;DR

This work investigates whether a fine-tuned ChatGPT-3.5 can learn the physics of all-dielectric metamaterials by predicting the absorptivity spectrum from a 14-parameter unit-cell geometry. By encoding geometry and spectra as text, the authors train FT-LLM to perform forward predictions and compare it against traditional ML models across varying dataset sizes, finding that FT-LLM achieves competitive mean absolute relative error ($\text{MARE}$) with deep neural networks and, with larger datasets, comparable mean squared error ($\text{MSE}$). The study further probes interpretability and inverse design, finding that FT-LLM can provide textual explanations but does not outperform baselines in interpretability or inverse-design performance, especially when handling larger outputs. These results suggest LLMs can serve as data-efficient surrogates for forward metamaterial physics and data summarization, though practical deployment for design tasks requires further advances in reliability and inverse-design capabilities. Overall, the paper highlights the potential and current limitations of applying large language models to physics-based metamaterial research.

Abstract

Large language models (LLMs) such as ChatGPT, Gemini, LlaMa, and Claude are trained on massive quantities of text parsed from the internet and have shown a remarkable ability to respond to complex prompts in a manner often indistinguishable from humans. For all-dielectric metamaterials consisting of unit cells with four elliptical resonators, we present a LLM fine-tuned on up to 40,000 data that can predict the absorptivity spectrum given a text prompt that only specifies the metasurface geometry. Results are compared to conventional machine learning approaches including feed-forward neural networks, random forest, linear regression, and K-nearest neighbor (KNN). Remarkably, the fine-tuned LLM (FT-LLM) achieves a comparable performance across large dataset sizes with a deep neural network. We also explore inverse problems by asking the LLM to predict the geometry necessary to achieve a desired spectrum. LLMs possess several advantages over humans that may give them benefits for research, including the ability to process enormous amounts of data, find hidden patterns in data, and operate in higher-dimensional spaces. This suggests they may be able to leverage their general knowledge of the world to learn faster from training data than traditional models, making them valuable tools for research and analysis.

Learning Electromagnetic Metamaterial Physics With ChatGPT

TL;DR

This work investigates whether a fine-tuned ChatGPT-3.5 can learn the physics of all-dielectric metamaterials by predicting the absorptivity spectrum from a 14-parameter unit-cell geometry. By encoding geometry and spectra as text, the authors train FT-LLM to perform forward predictions and compare it against traditional ML models across varying dataset sizes, finding that FT-LLM achieves competitive mean absolute relative error () with deep neural networks and, with larger datasets, comparable mean squared error (). The study further probes interpretability and inverse design, finding that FT-LLM can provide textual explanations but does not outperform baselines in interpretability or inverse-design performance, especially when handling larger outputs. These results suggest LLMs can serve as data-efficient surrogates for forward metamaterial physics and data summarization, though practical deployment for design tasks requires further advances in reliability and inverse-design capabilities. Overall, the paper highlights the potential and current limitations of applying large language models to physics-based metamaterial research.

Abstract

Large language models (LLMs) such as ChatGPT, Gemini, LlaMa, and Claude are trained on massive quantities of text parsed from the internet and have shown a remarkable ability to respond to complex prompts in a manner often indistinguishable from humans. For all-dielectric metamaterials consisting of unit cells with four elliptical resonators, we present a LLM fine-tuned on up to 40,000 data that can predict the absorptivity spectrum given a text prompt that only specifies the metasurface geometry. Results are compared to conventional machine learning approaches including feed-forward neural networks, random forest, linear regression, and K-nearest neighbor (KNN). Remarkably, the fine-tuned LLM (FT-LLM) achieves a comparable performance across large dataset sizes with a deep neural network. We also explore inverse problems by asking the LLM to predict the geometry necessary to achieve a desired spectrum. LLMs possess several advantages over humans that may give them benefits for research, including the ability to process enormous amounts of data, find hidden patterns in data, and operate in higher-dimensional spaces. This suggests they may be able to leverage their general knowledge of the world to learn faster from training data than traditional models, making them valuable tools for research and analysis.
Paper Structure (25 sections, 3 equations, 4 figures, 3 tables)

This paper contains 25 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Schematic depicting the workflow. The process begins with simulations (orange rectangle) to acquire the metamaterial geometry-spectrum dataset. Subsequently, these numerical geometric parameters are transformed into textual descriptions -- shown by the numerical encoding path. We then fine-tune ChatGPT 3.5 using OpenAI's API with the built-in loss shown as the blue box.
  • Figure 2: Evaluations of the models for predicting metamaterial spectra, given a geometry, as a function of dataset size. (a) MARE and (b) MSE trends for baseline models and the FT-LLM as dataset size increases. All results of baseline models presented are the average of three models. Error bars indicate the standard deviation of the three trails. However, due to computational resource limitations, we only conducted single trial for the GPT model at large dataset sizes (10,000, 20,000, and 40,000 samples). While this may introduce some variances in the results, the observed trends are consistent with expectations. A temperature of $0.5$ was used for the GPT model.
  • Figure 3: Evaluations of model performance using two different prompt templates. These templates are provided in Table \ref{['template']}. All the results presented are averages from three models. However, the results at the 10,000 data points are exceptions, as computational constraints limited these to single trials.
  • Figure 4: MSE trends for the model fine-tuned on different numbers of training samples as the temperature increases. The MSE is in $\text{log}_{10}$ scale.