Inverse Design in Nanophotonics via Representation Learning

Reza Marzban, Ali Adibi, Raphael Pestourie

TL;DR

The paper addresses the challenge of efficiently designing nanophotonic devices to match target electromagnetic responses, given high-dimensional non-convex design spaces and costly full-wave solvers. It reframes ML-based inverse design through representation learning, dividing methods into output-side approaches that learn differentiable solvers and input-side approaches that learn latent geometry priors, and it discusses hybrid pipelines that combine both. The survey covers concrete techniques such as PINNs, WaveY-Net, GLOnet, VAEs, GANs, and Bayesian optimization, analyzing trade-offs in data efficiency, generalization, and design discovery. It highlights open challenges and opportunities, including fabrication constraints, geometry-independent representations, multiphysics co-design, transfer learning, and the need for shared benchmarks, with an outlook toward autonomous, multi-agent design systems that can translate objectives into optimized nanoscale structures.

Abstract

Inverse design in nanophotonics, the computational discovery of structures achieving targeted electromagnetic (EM) responses, has become a key tool for recent optical advances. Traditional intuition-driven or iterative optimization methods struggle with the inherently high-dimensional, non-convex design spaces and the substantial computational demands of EM simulations. Recently, machine learning (ML) has emerged as an effective way to address these bottlenecks. This review frames ML-enhanced inverse design methodologies through the lens of representation learning, classifying them into two categories: output-side and input-side approaches. Output-side methods use ML to learn a representation in the solution space to create a differentiable solver that accelerates optimization. Conversely, input-side techniques employ ML to learn compact, latent-space representations of feasible device geometries, enabling efficient global exploration through generative models. Each strategy presents unique trade-offs in data requirements, generalization capacity, and potential for discovering novel designs. Hybrid frameworks that combine physics-based optimization with data-driven representations help escape poor local optima, improve scalability, and facilitate knowledge transfer. We conclude by highlighting open challenges and opportunities, emphasizing complexity management, geometry-independent representations, integration of fabrication constraints, and advances in multiphysics co-design.

Paper Structure

This paper contains 5 sections and 4 figures.

Figures (4)

  • Figure 1: Output-side versus input-side representation learning in nanophotonic inverse design. Top panels: two complementary learned representations. Output-side representation (left) models the partial-differential-equation (PDE) solution or a derived optical property: a differentiable surrogate or physics-informed neural network (PINN) emulates a Maxwell’s equations solver and provides analytic gradients that refine candidates directly in the full design space. Input-side representation (right) models the device geometry itself: a generative model compresses layouts into a low-dimensional latent manifold, and optimization proceeds in that manifold while a PDE solver (the surrogate model) supplies the objective. Bottom panel: schematic of the non-convex nanophotonic design landscape. Two representation-learning frameworks guide the search toward a better optimum (blue star): the output-side surrogate delivers rapid, physics-consistent gradients, and the input-side latent prior confines exploration to geometry regions containing high-performance candidates. Taken separately, each paradigm yields faster convergence, fewer full-wave EM simulations, and lower data requirements; their distinct mechanisms and trade-offs are analyzed in detail throughout this review.
  • Figure 2: Gradient-based co-optimization frameworks combining differentiable EM simulators with PINNs and neural surrogates. (a–c) Hard-constrained PINNs: Two NNs, $\hat{u}(\mathbf{x};\theta_{u})$ and $\hat{\gamma}(\mathbf{x};\theta_{\gamma})$, parameterize EM fields and design variables. Training involves a PDE-informed loss function $\mathcal{L}_{F}$ imposed through automatic differentiation. Dirichlet boundary conditions are enforced in the network outputs, and periodic boundary conditions are embedded via sinusoidal input features. (b) Computational domain showing permittivity design region $\Omega_{2}$ (blue) and perfectly matched layers (PML, hatched). (c) Predicted electric field intensity distribution $|E|^{2}$ resulting from the optimized permittivity $\varepsilon$. (d) Neural-adjoint patch solver: Pillar half-width vectors are transformed into dielectric patches, processed by a convolutional NN (CNN) predicting local EM fields. These fields are stitched together and propagated via the angular-spectrum method. The objective intensity $f=|E(z=F)|^{2}$ is back-propagated using automatic differentiation (PyTorch autograd) to iteratively update the pillar geometries. (e, f) GLOnet + WaveY-Net framework for global topology optimization (TO): (e) GLOnet generates metagrating designs from latent noise vectors; these designs are evaluated by the differentiable WaveY-Net surrogate EM solver. Loss gradients computed by WaveY-Net are back-propagated through GLOnet, enabling differentiable, end-to-end global optimization. (f) WaveY-Net architecture details: A U-Net-based CNN predicts magnetic near-fields, subsequently converted to electric fields via the discrete Ampère’s law. The training loss includes a data-fidelity term ($L_{\text{data}}$) and a Maxwell-residual regularizer ($L_{\text{Maxwell}}$) to keep gradient computations consistent with Maxwell’s equations. (g) Physics-enhanced deep surrogate (PEDS) framework: Fine-resolution geometries are downsampled and combined with coarse-resolution geometries generated by a neural surrogate. The resulting composite structures are evaluated by a fast, low-fidelity solver for rapid performance estimation. High-fidelity solver evaluations of fine-resolution geometries provide offline training data, accelerating the design optimization loop while maintaining physical accuracy. Permissions: Panels (a–c) Reproduced with permission [lu2021physics]. Copyright 2021, Society for Industrial and Applied Mathematics (SIAM). Panel (d) Reproduced under the terms of the CC BY 4.0 license [zhelyeznyakov2023large]. Copyright 2023, The Author(s). Panels (e, f) Reproduced with permission [chen2022high]. Copyright 2022, American Chemical Society. Panel (g) Reproduced with permission [pestourie2023physics]. Copyright 2023, Springer Nature.
  • Figure 3: Offline, data-driven hybrid inverse-design workflows. (a, b) Conditional adversarial auto-encoder (c-AAE) pipeline. (a) Antenna topology, along with geometric parameters (unit-cell size, spacer thickness), is encoded into a 17-dimensional latent space. A generator–discriminator pair enforces adherence to a predefined prior, producing a compact, physics-informed design manifold. (b) The trained generator $G$ couples to a conditional VGG-based surrogate model for fast offline prediction of optical efficiency, allowing rapid synthesis and screening of candidates. (c–e) CNN-assisted WGAN optimization for reconfigurable photonic waveguides. (c) Schematic of a three-channel silicon rib-waveguide array coated with Sb$_2$Se$_3$. A focused laser locally writes a $500\,\text{nm}$-pixel pattern along a $50\,\mu\text{m}$ section, enabling a dynamically reconfigurable optical coupling matrix. (d) varFDTD simulations show intensity maps for three input ports that confirm an anti-diagonal coupling matrix with uniform phases. (e) End-to-end inverse-design workflow: latent vectors $\mathbf{z}_i$ are transformed into pixel patterns via a WGAN generator. A NN-based transmission predictor computes differentiable performance estimates, and gradients $\partial\text{fitness}/\partial\mathbf{z}_i$ guide iterative updates of $\mathbf{z}_i$ until convergence, minimizing a combined mean-squared error (MSE) and phase loss. (f, g) HiLAB: topology optimization (TO) combined with a VAE and Bayesian optimization (BO). (f) A Vision-Transformer-based VAE encodes $256\times128$ binary metasurface patterns into an eight-dimensional latent space, forming a smooth, fabrication-compatible manifold for optimization. (g) BO jointly explores the eight-dimensional latent geometry and the physical hyperparameters $\{t_{1}, t_{2}, \Lambda_{y}\}$. Each proposed candidate is decoded, binarized, and evaluated with full-wave FDTD simulations. The optimization aims to maximize the worst-case diffraction efficiency across three wavelengths (470 nm, 550 nm, and 660 nm) using the figure of merit (FoM) $\mathrm{FoM}=\min\{\eta_{470}, \eta_{550}, \eta_{660}\}$. Optimization progress is visualized through a 2D principal component analysis (PCA) projection of sampled designs. Permissions: Panels (a, b) Reproduced under the terms of the CC BY 4.0 license [kudyshev2020machine]. Copyright 2020, The Author(s). Panels (c–e) Reproduced under the terms of the CC BY 4.0 license [radford2025inverse]. Copyright 2025, The Author(s). Panels (f, g) Reproduced under the terms of the CC BY 4.0 license [marzban2025hilab]. Copyright 2025, The Author(s).
  • Figure 4: RL workflows for hybrid inverse design. (a–c) L2DO nanocavity synthesis. (a) Short L3 InP nanobeam cavity with symmetric taper and mirror holes $(x_{1},x_{2},\ldots ,x_{m})$. (b) User-specified optical targets are fed to the L2DO engine. (c) Deep-RL loop: a four-layer MLP policy (PPO/DQN) interacts with an FDTD environment; replay-buffered experience tuples $(s_{t},a_{t},r_{t},s_{t+1})$ guide optimization of hole positions, radii, and counts. (d–f) Physics-informed RL (PIRL) for metagrating optimization. (d) Binary state encoding of a Si/SiO$_2$ (Si: silicon, SiO$_2$: silicon dioxide) metagrating; the objective is to maximize the first-order TM deflection efficiency $\eta$. (e) Physics-informed pre-training: a U-Net learns sensitivity maps $\Delta\eta_{\mathrm{approx}}$ from adjoint analysis (plot compares exact, adjoint-approximate, and NN-predicted $\Delta\eta$). (f) Parallel Deep-Q stage: the pretrained agent $Q_{0}^{\omega}$ is cloned into 16 workers running full-wave simulations; trajectories populate a global replay buffer while the master network $Q^{\omega}$ is synchronously updated, enabling efficient, large-scale exploration of the metagrating design space. Permissions: Panels (a–c) Reproduced under the terms of the CC BY 4.0 license [li2023deep]. Copyright 2023, The Author(s). Panels (d–f) Reproduced under the terms of the CC BY 4.0 license [park2024sample]. Copyright 2024, The Author(s).
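
The decode, binarize, evaluate loop and the worst-case figure of merit described for Figure 3 (f, g) can be illustrated with a short Python sketch. This is a toy under stated assumptions, not the HiLAB implementation: `decode` and `toy_efficiency` are hypothetical stand-ins for the trained VAE decoder and the full-wave FDTD evaluation, the pattern is 1D with 16 pixels rather than $256\times128$, the physical hyperparameters $\{t_{1}, t_{2}, \Lambda_{y}\}$ are omitted, and plain random search is swapped in for Bayesian optimization to keep the example dependency-free.

```python
import math
import random

WAVELENGTHS_NM = (470, 550, 660)  # the three design wavelengths named in the caption


def decode(z, n_pixels=16):
    """Hypothetical decoder: map a latent vector to a soft 1D pattern in (0, 1)."""
    pattern = []
    for i in range(n_pixels):
        x = (i + 0.5) / n_pixels
        # Smooth mixing of latent coordinates; a real decoder would be a neural network.
        s = sum(zk * math.sin((k + 1) * math.pi * x) for k, zk in enumerate(z))
        pattern.append(1.0 / (1.0 + math.exp(-s)))
    return pattern


def binarize(pattern, threshold=0.5):
    """Threshold a soft pattern into a binary (0/1) layout."""
    return [1 if p >= threshold else 0 for p in pattern]


def toy_efficiency(binary_pattern, wavelength_nm):
    """Placeholder for a solver call; returns a made-up efficiency in [0, 1]."""
    fill = sum(binary_pattern) / len(binary_pattern)
    target_fill = wavelength_nm / 1100.0  # arbitrary toy wavelength dependence
    return max(0.0, 1.0 - 2.0 * abs(fill - target_fill))


def worst_case_fom(efficiencies):
    """FoM = min over wavelengths, as in the caption's formula."""
    return min(efficiencies)


def evaluate(z):
    """Decode, binarize, and score one latent candidate."""
    layout = binarize(decode(z))
    return worst_case_fom([toy_efficiency(layout, w) for w in WAVELENGTHS_NM])


def random_search(n_trials=200, latent_dim=8, seed=0):
    """Random search over the latent space (a simple stand-in for BO)."""
    rng = random.Random(seed)
    best_fom, best_z = -1.0, None
    for _ in range(n_trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        fom = evaluate(z)
        if fom > best_fom:
            best_fom, best_z = fom, z
    return best_fom, best_z


if __name__ == "__main__":
    fom, z = random_search()
    print(f"best worst-case FoM: {fom:.3f}")
```

Scoring each candidate by the minimum rather than the mean efficiency means a design cannot rank well by excelling at one wavelength while failing at another, which is the point of the worst-case objective.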