Scalable Bilevel Optimization for Generating Maximally Representative OPF Datasets

Ignasi Ventura Nadal, Samuel Chevalier

TL;DR

This work tackles the challenge of generating OPF datasets that accurately reflect the operating space near system limits, where active constraints are prevalent. It introduces RAMBO, a bilevel data-collection routine that deliberately selects OPF load inputs to maximize distance from previously collected solutions, thereby sampling near-boundary regions. The approach leverages a KKT-based reformulation with a relaxed complementary-slackness constraint and incorporates scalability and temperature controls via stochastic subsets and warm-start techniques. Empirical results on IEEE 30-, 57-, and 118-bus PGLib cases show RAMBO produces richer variable ranges and a markedly higher number of unique active-constraint sets than uniform sampling, improving data representativeness for downstream ML and validation tasks. This method, with a public repository for reproducibility, advances data-driven power-system modeling by enabling more robust learning and testing of constraint-aware behaviors.

Abstract

New generations of power systems, containing high shares of renewable energy resources, require improved data-driven tools which can swiftly adapt to changes in system operation. Many of these tools, such as ones using machine learning, rely on high-quality training datasets to construct probabilistic models. Such models should be able to accurately represent the system when operating at its limits (i.e., operating with a high degree of "active constraints"). However, generating training datasets that accurately represent the many possible combinations of these active constraints is a particularly challenging task, especially within the realm of nonlinear AC Optimal Power Flow (OPF), since most active constraints cannot be enforced explicitly. Using bilevel optimization, this paper introduces a data collection routine that sequentially solves for OPF solutions which are "optimally far" from previously acquired voltage, power, and load profile data points. The routine, termed RAMBO, samples critical data close to a system's boundaries much more effectively than a random sampling benchmark. Simulated test results are collected on the 30-, 57-, and 118-bus PGLib test cases.
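The idea of sequentially collecting points that are "optimally far" from earlier ones can be illustrated with a toy sketch. The code below is not RAMBO: it solves no OPF and no bilevel program, and the function name `farthest_point_sampling` and the candidate points are hypothetical. It assumes a finite pool of precomputed candidate points and greedily picks the one whose distance to its nearest already-selected point is largest, which is a simple discrete stand-in for the distance-maximizing objective described in the abstract.

```python
import math


def farthest_point_sampling(candidates, k, start=0):
    """Greedily pick k candidate indices that are far from each other.

    Toy stand-in (an assumption, not the paper's method): each candidate
    is a coordinate tuple, e.g. a flattened (voltage, generation, load)
    profile that has already been computed elsewhere.
    """
    points = [tuple(p) for p in candidates]
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of candidates")

    selected = [start]
    # min_dist[i] = distance from candidate i to its nearest selected point
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < k:
        # Next sample: the candidate farthest from everything collected so far
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Refresh nearest-selected distances with the newly added point
        min_dist = [
            min(m, math.dist(p, points[nxt])) for m, p in zip(min_dist, points)
        ]
    return selected


# Hypothetical 2-D candidates; the isolated point (4, 3) is picked first.
pool = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (4.0, 3.0)]
picked = farthest_point_sampling(pool, k=3)
```

In this sketch the candidate pool is fixed in advance, whereas the routine described above chooses the load inputs themselves so that the resulting OPF solution lands far from prior samples.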

Paper Structure (15 sections, 12 equations, 3 figures, 3 tables, 2 algorithms)

Figures (3)

  • Figure 1: A 3D representation of the proposed bilevel optimization. The axes correspond to the routine's decision variables: voltage magnitude, power generation, and demand. Within the feasible space (turquoise), three local optima are identified. These optima show how the OPF solution space is explored by maximizing the distances between decision variables.
  • Figure 2: Algorithmic flow of RAMBO (figure inspired by Jones:2022)
  • Figure 3: Voltage magnitude feasible space sampled by the bilevel optimization (blue) and the uniform distribution (orange). The blue profiles cover a significantly broader region of the OPF solution space.