ADMM for Structured Fractional Minimization

Ganzhao Yuan

TL;DR

This paper introduces FADMM, the first Alternating Direction Method of Multipliers (ADMM) tailored for fractional minimization problems, and establishes that it converges to $\epsilon$-approximate critical points of the problem within an oracle complexity of $\mathcal{O}(1/\epsilon^{3})$.

Abstract

This paper considers a class of structured fractional minimization problems. The numerator consists of a differentiable function, a simple nonconvex nonsmooth function, a concave nonsmooth function, and a convex nonsmooth function composed with a linear operator. The denominator is a continuous function that is either weakly convex or has a weakly convex square root. These problems are prevalent in various important applications in machine learning and data science. Existing methods, primarily based on subgradient methods and smoothing proximal gradient methods, often suffer from slow convergence and numerical stability issues. In this paper, we introduce FADMM, the first Alternating Direction Method of Multipliers tailored for this class of problems. FADMM decouples the original problem into linearized proximal subproblems, featuring two variants: one using Dinkelbach's parametric method (FADMM-D) and the other using the quadratic transform method (FADMM-Q). By introducing a novel Lyapunov function, we establish that FADMM converges to $\epsilon$-approximate critical points of the problem within an oracle complexity of $\mathcal{O}(1/\epsilon^{3})$. Extensive experiments on synthetic and real-world datasets, including sparse Fisher discriminant analysis, robust Sharpe ratio minimization, and robust sparse recovery, demonstrate the effectiveness of our approach.

Keywords: Fractional Minimization, Nonconvex Optimization, Proximal Linearized ADMM, Nonsmooth Optimization, Convergence Analysis
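
The abstract mentions Dinkelbach's parametric method as the basis for one of the two variants. The sketch below illustrates only the classical Dinkelbach iteration on a toy one-dimensional ratio; it is not the paper's FADMM-D algorithm. The toy functions `f` and `g`, the interval $[0, 3]$, and the helper names `dinkelbach` and `solve_parametric` are assumptions made purely for illustration. The idea is to replace $\min_x f(x)/g(x)$ with a sequence of subproblems $\min_x f(x) - \lambda_t g(x)$, where $\lambda_t = f(x_t)/g(x_t)$.

```python
import math

def dinkelbach(f, g, solve_parametric, x0, tol=1e-12, max_iter=100):
    """Classical Dinkelbach iteration for min f(x)/g(x), assuming g(x) > 0.

    solve_parametric(lam) must return a minimizer of f(x) - lam * g(x)
    over the feasible set (a hypothetical user-supplied oracle).
    """
    x = x0
    lam = f(x) / g(x)
    for _ in range(max_iter):
        x = solve_parametric(lam)      # parametric subproblem
        new_lam = f(x) / g(x)          # update the ratio estimate
        converged = abs(new_lam - lam) <= tol
        lam = new_lam
        if converged:
            break
    return x, lam

# Toy problem (illustrative only): minimize (x^2 + 1) / (x + 2) on [0, 3].
f = lambda x: x * x + 1.0
g = lambda x: x + 2.0

# For this toy problem, x^2 + 1 - lam * (x + 2) is minimized at x = lam / 2,
# clipped to the interval [0, 3].
solve_parametric = lambda lam: min(max(lam / 2.0, 0.0), 3.0)

x_star, lam_star = dinkelbach(f, g, solve_parametric, x0=3.0)

# Closed-form minimizer of the toy ratio, from x^2 + 4x - 1 = 0.
x_ref = math.sqrt(5.0) - 2.0
print(x_star, lam_star, x_ref)
```

In this toy case the parametric subproblem has a closed-form solution. In the setting described in the abstract, the numerator is nonsmooth and involves a linear operator, which is why the paper decouples the problem into linearized proximal subproblems instead.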

Paper Structure

This paper contains 47 sections, 21 theorems, 106 equations, 10 figures, 2 algorithms.

Key Result

Lemma 3.9

(Proof in the appendix.) Assume that $h(\mathbf{y})$ is $C_h$-Lipschitz continuous. Let $\mu>0$ and $0< \mu_2\leq \mu_1$. We have the following results:

Figures (10)

  • Figure 1: Results on sparse FDA on different datasets with $\rho=10$.
  • Figure 2: Results on sparse FDA on different datasets with $\rho=1000$.
  • Figure 3: Experimental results on sparse FDA on different datasets with $\rho=100$.
  • Figure 4: Experimental results on sparse FDA on different datasets with $\rho=10000$.
  • Figure 5: Results on Sharpe ratio maximization on different datasets.
  • ...and 5 more figures

Theorems & Definitions (55)

  • Remark 3.5
  • Definition 3.6
  • Remark 3.7
  • Definition 3.8
  • Lemma 3.9
  • Lemma 3.10
  • Remark 3.11
  • Remark 4.1
  • Remark 5.2
  • Lemma 5.3
  • ...and 45 more