Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow

Yinuo Ren; Tesi Xiao; Tanmay Gangwani; Anshuka Rangi; Holakou Rahmanian; Lexing Ying; Subhajit Sanyal

TL;DR

This work introduces a novel interacting particle method for multi-objective optimization (MOO), inspired by molecular dynamics simulations, that combines overdamped Langevin and birth-death dynamics and incorporates a dominance potential to steer particles toward global Pareto optimality.

Abstract

Multi-objective optimization (MOO) aims to optimize multiple, possibly conflicting objectives and has widespread applications. We introduce a novel interacting particle method for MOO inspired by molecular dynamics simulations. Our approach combines overdamped Langevin and birth-death dynamics, incorporating a "dominance potential" to steer particles toward global Pareto optimality. In contrast to previous methods, our method can relocate dominated particles, making it particularly adept at handling Pareto fronts with complicated geometries. Our method is also theoretically grounded as a Wasserstein-Fisher-Rao gradient flow with convergence guarantees. Extensive experiments confirm that our approach outperforms state-of-the-art methods on challenging synthetic and real-world datasets.
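To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the authors' algorithm: one overdamped Langevin update followed by a simplified birth-death resampling step for a population of 1-D particles. The scalar `potential`, the step size `dt`, the noise level `beta`, and the kill-and-copy rule are all illustrative assumptions; the paper's functional, including its dominance potential, is not reproduced here.

```python
import math
import random

def potential(x):
    # Hypothetical scalar potential standing in for the paper's functional.
    return (x * x - 1.0) ** 2

def grad_potential(x, h=1e-5):
    # Central finite difference; adequate for a toy 1-D example.
    return (potential(x + h) - potential(x - h)) / (2.0 * h)

def langevin_step(xs, dt, beta, rng):
    # Overdamped Langevin: x <- x - dt * V'(x) + sqrt(2 * beta * dt) * noise.
    scale = math.sqrt(2.0 * beta * dt)
    return [x - dt * grad_potential(x) + scale * rng.gauss(0.0, 1.0) for x in xs]

def birth_death_step(xs, dt, rng):
    # Simplified rule (an assumption): a particle whose potential exceeds the
    # population mean is killed with probability 1 - exp(-excess * dt) and
    # replaced by a copy of a uniformly chosen particle, so the population
    # size stays fixed.
    values = [potential(x) for x in xs]
    mean_value = sum(values) / len(values)
    out = list(xs)
    for i, v in enumerate(values):
        excess = v - mean_value
        if excess > 0 and rng.random() < 1.0 - math.exp(-excess * dt):
            out[i] = xs[rng.randrange(len(xs))]
    return out

def run(n_particles=64, n_steps=500, dt=0.01, beta=0.05, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        xs = langevin_step(xs, dt, beta, rng)
        xs = birth_death_step(xs, dt, rng)
    return xs
```

The Langevin step moves particles locally, while the birth-death step can move a poorly placed particle across the domain in a single update, which is the kind of relocation the abstract attributes to the method.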

Paper Structure (40 sections, 3 theorems, 67 equations, 9 figures, 1 table, 1 algorithm)

INTRODUCTION
Contributions.
PRELIMINARIES
Pareto Optimality
Wasserstein-Fisher-Rao Gradient Flow
ALGORITHM
Designing the Functional $\mathcal{E}[\rho]$
Objective Function Term.
Dominance Potential Term.
Entropy Term.
Repulsive Potential Term.
Theoretical Analysis
Interacting Particle Method
Overdamped Langevin Dynamics.
Birth-death Dynamics.
...and 25 more sections

Key Result

Theorem 1

Let $\rho_t$ follow the WFR gradient flow of $\mathcal{E}[\rho]$. Then the functional value $\mathcal{E}[\rho_t]$ decays in time. Furthermore, if $\beta>0$ or $\gamma>0$, the density $\rho_t$ converges to the unique minimizer $\rho^*$ of $\mathcal{E}[\rho]$ as $t\to\infty$.
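For orientation, the generic textbook form of a mass-conserving WFR gradient flow and its energy dissipation identity can be sketched as follows. This assumes unit weights on the transport and reaction parts and writes $\phi_t = \frac{\delta\mathcal{E}}{\delta\rho}[\rho_t]$ for the first variation; the paper's own flow, its constants, and the way $\beta$ and $\gamma$ enter $\mathcal{E}$ may differ and are not reproduced here.

```latex
\begin{aligned}
\partial_t \rho_t
  &= \nabla\cdot\big(\rho_t \nabla \phi_t\big)
     - \rho_t\big(\phi_t - \bar{\phi}_t\big),
  \qquad \bar{\phi}_t = \int \phi_t\,\rho_t\,\mathrm{d}\mathbf{x}, \\
\frac{\mathrm{d}}{\mathrm{d}t}\,\mathcal{E}[\rho_t]
  &= -\int \big|\nabla \phi_t\big|^2 \rho_t\,\mathrm{d}\mathbf{x}
     - \int \big(\phi_t - \bar{\phi}_t\big)^2 \rho_t\,\mathrm{d}\mathbf{x}
  \;\le\; 0 .
\end{aligned}
```

The first term is the dissipation from transport (the Wasserstein part) and the second is the variance of $\phi_t$ under $\rho_t$ (the Fisher-Rao part), so both contribute to the decrease of the functional.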

Figures (9)

Figure 1: Illustration of structural potential terms. This visualization explains the role of the dominance potential $\mathcal{F}_2$ and the repulsive potential $\mathcal{G}$ in a setting with two objective functions ($m=2$). (a) Suppose $\mathbf{y}$ is from $\mu_{\mathcal{P}}(\cdot)$ with corresponding objective function values $\mathbf{f}(\mathbf{y}) = (f_1(\mathbf{y}), f_2(\mathbf{y}))$. When a point $\mathbf{x}$ is introduced, the dominance kernel $D(\cdot, \mathbf{f}(\mathbf{y}))$ acts to shift $\mathbf{x}$ out of the region dominated by $\mathbf{y}$. (b) For two samples $\mathbf{x}$ and $\mathbf{y}$, the repulsive kernel $R(\cdot, \mathbf{f}(\mathbf{y}))$ repels the objective function values of $\mathbf{x}$ away from those of $\mathbf{y}$.
Figure 2: Performance comparison of different methods on the ZDT3 problem. The Pareto front is shown in red, and the solutions found by different methods are shown in blue. Our method (Particle-WFR) perfectly captures the complicated geometry of the Pareto front.
Figure 3: Evolution of the particle population by Particle-WFR on the ZDT3 problem. The Pareto front is shown in red, and the current population is shown in blue.
Figure 4: Performance comparison of different methods on the DTLZ7 problem. The Pareto front is shown in red, and the solutions found by different methods are shown in blue. Alongside the 3D visualization, a bird's-eye perspective is also provided for each problem. Our method (Particle-WFR) achieves the best coverage of the Pareto front across all methods.
Figure 5: Performance comparison of different methods on the MSLR-WEB30K dataset. Our method achieves the best HV value on test NDCG@10, and its performance improves as the particle count $N$ increases from 8 to 16.
...and 4 more figures
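
Figure 5 reports HV (hypervolume) values. As a minimal sketch of that indicator for two objectives, assuming both are minimized and that a user-chosen reference point `ref` bounds the region from above (the function name and example values are illustrative, not from the paper):

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that strictly improve on the reference in both objectives.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest second objective seen so far in the sweep
    for f1, f2 in pts:  # sweep in increasing first objective
        if f2 < best_f2:  # non-dominated among the points swept so far
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

front = [(1, 3), (2, 2), (3, 1)]
hv = hypervolume_2d(front, ref=(4, 4))
```

Dominated points add no area, so a larger value indicates a front that is both closer to the ideal point and better spread. For maximized metrics such as NDCG, the same sweep applies after negating the objectives and the reference point.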

Theorems & Definitions (9)

Definition 1: Pareto optimality
Theorem 1
Theorem 2
Remark 1
Remark 2
Remark 3
Proof of Theorem \ref{thm:convergence}
Lemma 3 (lu2023birth)
Proof of Theorem \ref{thm:exponential}
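
Definition 1 in the list above concerns Pareto optimality. A minimal sketch of the standard dominance test and a brute-force non-dominated filter, assuming all objectives are minimized (the function names are illustrative, not the paper's):

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Return the non-dominated points in input order (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

candidates = [(1, 3), (2, 2), (3, 1), (3, 3)]
front = pareto_front(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is at least as good in both objectives and strictly better in one; the remaining three points are mutually non-dominated.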
