Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning
Haoyu Yang, Haoxing Ren
TL;DR
This work rethinks inverse lithography (ILT) by treating mask synthesis as conditional sampling: a style-aware generator is trained to propose multiple design-conditioned masks. Training has two stages, generative pretraining followed by reinforcement fine-tuning with Group Relative Policy Optimization (GRPO), so that the sampler learns a posterior over masks that accelerates ILT refinement and helps escape non-convex traps. Results on LithoBench and ICCAD13 show state-of-the-art or competitive edge placement error (EPE), including fewer EPE violations under a 3 nm tolerance on LithoBench, together with roughly 2× throughput on LithoBench and a 3× speedup on ICCAD13 relative to numerical ILT solvers. The framework generalizes to arbitrary sizes, integrates with existing ILT pipelines, and supports flexible multi-objective rewards, offering a scalable path toward practical, high-quality lithography mask synthesis.
Abstract
Inverse lithography technology (ILT) is critical for modern semiconductor manufacturing but suffers from highly non-convex objectives that often trap optimization in poor local minima. Generative AI has been explored to warm-start ILT, yet most approaches train deterministic image-to-image translators to mimic sub-optimal datasets, providing limited guidance for escaping non-convex traps during refinement. We reformulate mask synthesis as conditional sampling: a generator learns a distribution over masks conditioned on the design and proposes multiple candidates. The generator is first pretrained with a WGAN objective plus a reconstruction loss, then fine-tuned using Group Relative Policy Optimization (GRPO) with an ILT-guided imitation loss. At inference, we sample a small batch of masks, run fast batched ILT refinement, evaluate lithography metrics (e.g., EPE, process window), and select the best candidate. On the \texttt{LithoBench} dataset, the proposed hybrid framework reduces EPE violations under a 3\,nm tolerance and roughly doubles throughput versus a strong numerical ILT baseline, while improving final mask quality. We also report over 20\% EPE improvement on the \texttt{ICCAD13} contest cases with a 3$\times$ speedup over the SOTA numerical ILT solver. By learning to propose ILT-friendly initializations, our approach mitigates non-convexity and advances beyond what either traditional solvers or generative models alone can achieve.
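The inference procedure described above (sample several masks, refine them in a batch, score them, keep the best) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `sample_masks` is a hypothetical stand-in for the trained generator (design-conditioned logits plus Gaussian noise), the lithography model is a toy 3x3 box blur followed by a sigmoid resist threshold, and `score` counts mismatched pixels as a crude proxy for EPE violations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def blur(x):
    # Toy optics (assumption): 3x3 box filter with periodic boundaries,
    # applied over the last two axes so it works on a batch of masks.
    out = np.zeros_like(x)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(x, (dy, dx), axis=(-2, -1))
    return out / 9.0

def print_image(theta, steep=20.0, thr=0.5):
    # Mask logits -> continuous mask -> blurred aerial image -> resist image.
    return sigmoid(steep * (blur(sigmoid(theta)) - thr))

def sample_masks(design, n, rng, noise=1.0):
    # Hypothetical stand-in for the generator: n noisy design-conditioned
    # mask logits with shape (n, H, W).
    base = 4.0 * (design - 0.5)
    return base[None] + noise * rng.standard_normal((n,) + design.shape)

def refine(theta, design, steps=30, lr=0.5, steep=20.0, thr=0.5):
    # Batched gradient descent on sum((printed - design)^2) w.r.t. the logits.
    theta = theta.copy()
    for _ in range(steps):
        mask = sigmoid(theta)
        printed = sigmoid(steep * (blur(mask) - thr))
        g_pre = 2.0 * (printed - design) * steep * printed * (1.0 - printed)
        g_mask = blur(g_pre)  # the symmetric periodic box blur is self-adjoint
        theta -= lr * g_mask * mask * (1.0 - mask)
    return theta

def score(theta, design):
    # Proxy metric (assumption): number of pixels where the thresholded
    # printed image disagrees with the target design. Lower is better.
    printed = print_image(theta) > 0.5
    return np.sum(printed != (design > 0.5), axis=(-2, -1))

def best_of_n(design, n=8, seed=0):
    # Sample n candidates, refine them together, and keep the best one.
    rng = np.random.default_rng(seed)
    theta = refine(sample_masks(design, n, rng), design)
    scores = score(theta, design)
    best = int(np.argmin(scores))
    return sigmoid(theta[best]) > 0.5, scores

if __name__ == "__main__":
    design = np.zeros((16, 16))
    design[4:12, 5:11] = 1.0  # a single rectangular target pattern
    mask, scores = best_of_n(design)
    print("per-candidate scores:", scores)
    print("selected candidate:", int(np.argmin(scores)))
```

Because the candidates start from different points of a non-convex objective, they may converge to different local minima; selecting by the final metric is what lets a batch of samples do better than a single deterministic warm start. In a real flow the toy blur and pixel-mismatch score would be replaced by a calibrated lithography simulator and the actual EPE and process-window metrics.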
