MadNCL: A GPU Implementation of Algorithm NCL for Large-Scale, Degenerate Nonlinear Programs

Alexis Montoison; François Pacaud; Michael Saunders; Sungho Shin; Dominique Orban

TL;DR

This work develops MadNCL, a GPU-accelerated implementation of Algorithm NCL for large-scale, degenerate nonlinear programs. By expressing the subproblem KKT systems in stabilized ($K_{2r}$) or condensed ($K_{1s}$) form, MadNCL can compute Newton steps on GPUs with the sparse $LDL^\top$ factorization with static pivoting provided by NVIDIA cuDSS. The method fuses the outer and inner loops of the augmented Lagrangian method and adds an extrapolation step to attain superlinear convergence, with MadNLP serving as the interior-point subproblem solver. Numerical results on CUTEst, optimal power flow (OPF) and COPS benchmarks, and degenerate security-constrained OPF (SCOPF) and MPCC problems show robustness to degeneracy and substantial GPU speedups, supporting the practical viability of GPU-based NCL for large-scale NLPs.

Abstract

We present a GPU implementation of Algorithm NCL, an augmented Lagrangian method for solving large-scale and degenerate nonlinear programs. Although interior-point methods and sequential quadratic programming are widely used for solving nonlinear programs, the augmented Lagrangian method is known to offer superior robustness against constraint degeneracies and can rapidly detect infeasibility. We introduce several enhancements to Algorithm NCL, including fusion of the inner and outer loops and use of extrapolation steps, which improve both efficiency and convergence stability. Further, NCL has the key advantage of being well-suited for GPU architectures because of the regularity of the KKT systems provided by quadratic penalty terms. In particular, the NCL subproblem formulation allows the KKT systems to be naturally expressed as either stabilized or condensed KKT systems, whereas the interior-point approach requires aggressive reformulations or relaxations to make it suitable for GPUs. Both systems can be efficiently solved on GPUs using sparse $LDL^\top$ factorization with static pivoting, as implemented in NVIDIA cuDSS. Building on these advantages, we examine the KKT systems arising from NCL subproblems. We present an optimized GPU implementation of Algorithm NCL by leveraging MadNLP as an interior-point subproblem solver and utilizing the stabilized and condensed formulations of the KKT systems for computing Newton steps. Numerical experiments on various large-scale and degenerate NLPs, including optimal power flow, COPS benchmarks, and security-constrained optimal power flow, demonstrate that MadNCL operates efficiently on GPUs while effectively managing problem degeneracy, including MPCC constraints.
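To make the relationship between the stabilized and condensed systems concrete, here is a minimal sketch, not the MadNCL implementation. It assumes a generic stabilized system with blocks $H$ (Hessian-like), $J$ (constraint Jacobian), and $-(1/\rho) I$ (from the quadratic penalty), and a condensed system $H + \rho J^\top J$ obtained by eliminating the dual step. The exact blocks, signs, and scaling of $K_{2r}$ and $K_{1s}$ in the paper may differ; the matrices, right-hand sides, and the helper `solve` are hypothetical toy choices. Exact rational arithmetic is used so the two solutions can be compared without rounding.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Use any row with a nonzero entry in this column as the pivot row.
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Hypothetical toy data (assumptions, not taken from the paper):
# H is a symmetric positive definite Hessian-like block, J a Jacobian,
# rho the penalty parameter, r1 and r2 the two right-hand-side blocks.
H = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]
J = [[1, 2, 0], [0, 1, 1]]
rho = F(10)
r1 = [1, 0, -1]
r2 = [2, 1]
n, m = 3, 2

# Stabilized system: [[H, J^T], [J, -(1/rho) I]] [dx; dy] = [r1; r2]
K2 = [H[i] + [J[k][i] for k in range(m)] for i in range(n)]
K2 += [J[k] + [(-1 / rho) if k == l else 0 for l in range(m)] for k in range(m)]
sol = solve(K2, r1 + r2)
dx_stab, dy_stab = sol[:n], sol[n:]

# Condensed system: eliminate dy = rho * (J dx - r2), which gives
# (H + rho J^T J) dx = r1 + rho J^T r2
K1 = [[H[i][j] + rho * sum(J[k][i] * J[k][j] for k in range(m))
       for j in range(n)] for i in range(n)]
rhs = [r1[i] + rho * sum(J[k][i] * r2[k] for k in range(m)) for i in range(n)]
dx_cond = solve(K1, rhs)
dy_cond = [rho * (sum(J[k][j] * dx_cond[j] for j in range(n)) - r2[k])
           for k in range(m)]
```

In this sketch both routes give the same step. The condensed matrix is smaller and positive definite here, while the stabilized matrix is larger and indefinite but keeps the sparsity of $J$.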
