Numerical field optimization for enhanced efficiency in time-reversible gradient computation of open-source GPU-accelerated FDTD simulations

Yannik Mahlau, Lukas Berg, Bodo Rosenhahn

Abstract

Finite-difference time-domain (FDTD) simulations often involve physical quantities spanning multiple orders of magnitude, such as the speed of light or electromagnetic field amplitudes. The standard practice for maintaining numerical accuracy in many FDTD implementations is to use 32-bit or 64-bit floating-point values to represent the electric and magnetic fields. However, this approach is not always optimal when recording field values, particularly during time-reversible gradient computation where electric and magnetic field values need to be saved at the boundary of the simulation domain. Since this memory bottleneck is often the limiting factor in time-reversible inverse design for nanophotonics, we present two field optimizations for enhancing memory efficiency in FDTD simulations. Using a smaller bit-width representation of field values as well as interpolation, we achieve similar accuracy at lower memory cost. This approach is particularly beneficial for GPU-accelerated computing, where reduced-precision data types are increasingly preferred due to their computational efficiency and prevalence in machine learning frameworks. We integrate our approach into FDTDX, an open-source, differentiable FDTD solver that natively supports time-reversible gradient computation. Our approach is especially important for future developments towards large-scale open-source simulations, which are critical for advancing computational nanophotonic applications.

Paper Structure (4 sections, 2 equations, 4 figures)

Figures (4)

  • Figure 1: Compression of an original sine wave (blue) stored in 32-bit precision (a) and sampled at 136 time steps per period (b). The original wave is either converted to a smaller data type (a) or subsampled at every $k$-th time step (b). The phase offset between the curves in the plots is added for visualization purposes only and is unrelated to the compression.
  • Figure 2: Simulation setup for the inverse design of a grating coupler. A source (orange) injects light into the simulation towards the design region (pink). The design is optimized such that light is redirected into the output waveguide (blue). The ratio between input and output Poynting flux is measured through two detectors (green). The simulation volume is surrounded by PML boundary objects (grey).
  • Figure 3: Mean cosine similarity of gradient computation using different data types and subsampling factors. Using float32 with recording at every time step ($k = 1$) as the baseline, the compression factor is the memory reduction from the lower bit-width multiplied by $k$. The mean is calculated over 10 gradient computations. The best results per subsampling factor are marked in bold.
  • Figure 4: Results of inverse design optimizations of a grating coupler. In (a), the transmission attenuation of designs optimized using the data types float32 or float8_e4m3b11fnuz with different values of $k$ is shown. The five thin horizontal lines indicate the results of five runs started from random initial parameters. The best result, obtained using float8_e4m3b11fnuz and $k=16$, is visualized in (b).
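
The two compression ideas in Figures 1 and 3 can be illustrated with a minimal sketch. It is not the FDTDX implementation. It assumes linear interpolation between stored samples (the paper's interpolation scheme may differ) and uses IEEE 754 float16 from the Python standard library as a stand-in for the 8-bit types, which the standard library does not provide. All function names are hypothetical. The cosine similarity here compares a raw and a reconstructed signal, whereas Figure 3 reports it for gradients.

```python
import math
import struct

STEPS_PER_PERIOD = 136  # sampling of the sine wave described in Figure 1


def to_float16(x: float) -> float:
    """Round x to IEEE 754 half precision and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def compress(signal, k):
    """Keep every k-th sample (plus the final one) and store it as float16."""
    idx = list(range(0, len(signal), k))
    if idx[-1] != len(signal) - 1:
        idx.append(len(signal) - 1)  # keep the end point so interpolation covers it
    return idx, [to_float16(signal[i]) for i in idx]


def decompress(idx, values, n):
    """Rebuild n samples by linear interpolation (an assumed scheme)."""
    out = [0.0] * n
    out[idx[0]] = values[0]
    for j in range(len(idx) - 1):
        i0, i1 = idx[j], idx[j + 1]
        v0, v1 = values[j], values[j + 1]
        for i in range(i0, i1 + 1):
            t = (i - i0) / (i1 - i0)
            out[i] = v0 + t * (v1 - v0)
    return out


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compression_factor(baseline_bits, stored_bits, k):
    """Bit-width reduction multiplied by the subsampling factor k (Figure 3)."""
    return baseline_bits / stored_bits * k


# Two periods of a sine wave, compressed with float16 storage and k = 4.
signal = [math.sin(2 * math.pi * i / STEPS_PER_PERIOD)
          for i in range(2 * STEPS_PER_PERIOD + 1)]
indices, stored = compress(signal, k=4)
restored = decompress(indices, stored, len(signal))

print("stored samples:", len(stored), "of", len(signal))
print("compression factor:", compression_factor(32, 16, 4))
print("cosine similarity:", cosine_similarity(signal, restored))
```

Under the definition in Figure 3, an 8-bit type with $k = 16$ corresponds to `compression_factor(32, 8, 16)`, that is, a factor of 64 relative to the float32, $k = 1$ baseline.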