Transfer learning-assisted inverse modeling in nanophotonics based on mixture density networks
Liang Cheng, Prashant Singh, Francesco Ferranti
TL;DR
The paper tackles the high computational cost of electromagnetic simulations for nanophotonic design by introducing a transfer-learning-augmented Mixture Density Network (MDN) for inverse modeling. By modeling the conditional distribution $p(\mathbf{y}|\mathbf{x})$ of the design parameters $\mathbf{y}$ given the optical response $\mathbf{x}$ as a Gaussian mixture with $K$ components, the method captures multi-valued mappings from optical responses to design parameters, and uses transfer learning to scale efficiently across different values of $K$ while preserving accuracy. A dimensionality-reduction step via an autoencoder is explored to speed up training on high-dimensional spectra. Numerical results on a grating-based multiband absorber demonstrate that the proposed transfer-learning strategies substantially reduce training time and maintain predictive quality, validating a practical, multi-solution inverse-design workflow for nanophotonics.
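For reference, the Gaussian mixture mentioned in the summary can be written in the standard MDN form below, with $\mathbf{x}$ the optical response and $\mathbf{y}$ the design parameters. This is a sketch under assumptions: the isotropic covariance $\sigma_k^2 \mathbf{I}$, the symbols $\pi_k$, $\boldsymbol{\mu}_k$, $\sigma_k$, and the training-set size $N$ are introduced here for illustration, and the paper's exact parameterization may differ.

$$
p(\mathbf{y}|\mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{x})\, \mathcal{N}\!\left(\mathbf{y};\, \boldsymbol{\mu}_k(\mathbf{x}),\, \sigma_k^2(\mathbf{x})\,\mathbf{I}\right), \qquad \pi_k(\mathbf{x}) \ge 0, \quad \sum_{k=1}^{K} \pi_k(\mathbf{x}) = 1
$$

$$
\mathcal{L} = -\frac{1}{N} \sum_{n=1}^{N} \log p(\mathbf{y}_n|\mathbf{x}_n)
$$

Each mean $\boldsymbol{\mu}_k(\mathbf{x})$ is one candidate design for the response $\mathbf{x}$, and the weight $\pi_k(\mathbf{x})$ plays the role of its relative importance. A typical choice is to train the network by minimizing the negative log-likelihood $\mathcal{L}$ over all mixture parameters jointly.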
Abstract
The simulation of nanophotonic structures relies on electromagnetic solvers, which play a crucial role in understanding their behavior. However, these solvers often come with a significant computational cost, making their application in design tasks, such as optimization, impractical. To address this challenge, machine learning techniques have been explored for accurate and efficient modeling and design of photonic devices. Deep neural networks, in particular, have gained considerable attention in this field. They can be used to create both forward and inverse models. An inverse modeling approach avoids the need to couple a forward model with an optimizer and directly predicts the optimal design parameter values. In this paper, we propose an inverse modeling method for nanophotonic structures, based on a mixture density network model enhanced by transfer learning. Mixture density networks can predict multiple possible solutions at once, represented as Gaussian distributions together with their respective importance. However, mixture density network models pose several challenges. One is that an upper bound on the number of possible simultaneous solutions needs to be specified in advance. Another is that the model parameters must be jointly optimized, which can be computationally expensive. Moreover, optimizing all parameters simultaneously can be numerically unstable and can lead to degenerate predictions. The proposed approach overcomes these limitations using transfer learning-based techniques, while preserving high accuracy in predicting the design solutions from an input optical response. A dimensionality reduction step is also explored. Numerical results validate the proposed method.
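To make the numerical-stability remark concrete, the following minimal sketch evaluates the log-density of a Gaussian mixture for a single sample using the log-sum-exp trick, a common generic safeguard in mixture density network losses. It is not the transfer-learning remedy proposed in the paper. The function names, the isotropic per-component standard deviation, and the example numbers are all hypothetical choices made for illustration.

```python
import math

def log_sum_exp(values):
    # Subtract the maximum before exponentiating so exp() cannot overflow.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mdn_log_density(y, logits, means, sigmas):
    """Log-density of target y under a K-component isotropic Gaussian mixture.

    y:      list of D target values (e.g. design parameters)
    logits: list of K unnormalized mixing coefficients
    means:  list of K lists, each with D component means
    sigmas: list of K positive standard deviations (one per component)
    """
    d = len(y)
    log_norm_pi = log_sum_exp(logits)  # normalizer of the softmax over logits
    terms = []
    for logit, mu, sigma in zip(logits, means, sigmas):
        sq_dist = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
        log_gauss = (-0.5 * sq_dist / sigma ** 2
                     - d * math.log(sigma)
                     - 0.5 * d * math.log(2.0 * math.pi))
        # log(pi_k) + log N(y; mu_k, sigma_k^2 I)
        terms.append(logit - log_norm_pi + log_gauss)
    return log_sum_exp(terms)

def mdn_nll(samples):
    # Mean negative log-likelihood over (y, logits, means, sigmas) tuples.
    return -sum(mdn_log_density(*s) for s in samples) / len(samples)

# Hypothetical example: one response with two candidate designs (K = 2, D = 1).
example = ([0.8], [0.0, 0.0], [[0.2], [0.8]], [0.05, 0.05])
print(mdn_nll([example]))
```

In practice the mixture parameters would be the outputs of a neural network evaluated on the optical response, and a deep learning framework's built-in log-sum-exp would replace the helper above. The sketch only illustrates why working in the log domain keeps the loss finite when a target lies far from every component mean.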
