Practical Topics in Optimization

Jun Lu

TL;DR

Practical Topics in Optimization surveys core mathematical tools and a broad range of optimization methods, bridging theory and practice. It consolidates linear algebra, normed spaces, and fundamental inequalities to underpin first- and second-order methods, stochastic optimization, and constrained techniques. The book details essential matrix decompositions (Cholesky, QR, SVD, and eigendecomposition) and convergence concepts, equipping readers to analyze and implement robust optimization algorithms. It also clarifies gradient-based strategies (including SGD and its role in large models) and touches on advanced topics such as ADMM, proximal methods, and trust-region frameworks for large-scale problems. Overall, the work serves as a self-contained reference for students, researchers, and practitioners seeking rigorous yet accessible optimization foundations and techniques.

Abstract

In an era where data-driven decision-making and computational efficiency are paramount, optimization plays a foundational role in advancing fields such as mathematics, computer science, operations research, machine learning, and beyond. From refining machine learning models to improving resource allocation and designing efficient algorithms, optimization techniques serve as essential tools for tackling complex problems. This book aims to provide both an introductory guide and a comprehensive reference, equipping readers with the necessary knowledge to understand and apply optimization methods within their respective fields. Our primary goal is to demystify the inner workings of optimization algorithms, including black-box and stochastic optimizers, by offering both formal and intuitive explanations. Starting from fundamental mathematical principles, we derive key results to ensure that readers not only learn how these techniques work but also understand when and why to apply them effectively. By striking a careful balance between theoretical depth and practical application, this book serves a broad audience, from students and researchers to practitioners seeking robust optimization strategies.

Paper Structure

This paper contains 433 sections, 1383 equations, 57 figures, 4 tables, and 49 algorithms.

Figures (57)

  • Figure 1: Demonstration of Young's inequality for different cases.
  • Figure 2: Unit ball of $\ell_p$ norms in three-dimensional space. When $p<1$, the metric does not qualify as a norm since it does not satisfy the third axiom of the norm in Definition \ref{definition:matrix-norm}.
  • Figure 3: Loss surfaces for different quadratic forms, providing the surface plots and contour plots (blue=low, yellow=high), where the upper graphs are the surface plots, and the lower ones are their projection (i.e., contours).
  • Figure 4: Plot of the function $f(x, y) = \sqrt{x^2+y^2}$, for which the directional derivative at the point $[0,0]^\top$ exists for any direction $\bm{d}=[a,b]^\top$ with $a\neq 0$ and $b\neq 0$. However, the partial derivatives at this point do not exist.
  • Figure 5: Two pairs of orthogonal subspaces in $\mathbb{R}^n$ and $\mathbb{R}^m$. $\dim(\mathcal{C}(\bm{A}^\top)) + \dim(\mathcal{N}(\bm{A}))=n$ and $\dim(\mathcal{N}(\bm{A}^\top)) + \dim(\mathcal{C}(\bm{A}))=m$. The null space component maps to zero as $\bm{A}\bm{x}_n = \mathbf{0} \in \mathbb{R}^m$. The row space component maps to the column space as $\bm{A}\bm{x}_r = \bm{A}(\bm{x}_r+\bm{x}_n)=\bm{b} \in \mathcal{C}(\bm{A})$.
  • ...and 52 more figures
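
Two of the captions above (Figures 2 and 4) make claims that can be checked numerically. The following is a minimal sketch, not code from the book: the function names are hypothetical, and it assumes that the norm axiom violated for $p<1$ is the triangle inequality and that the directional derivative in Figure 4 is meant as a one-sided limit ($t \to 0^+$).

```python
import math

def lp_measure(x, p):
    """Return (sum_i |x_i|^p)^(1/p); this is a norm only when p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def triangle_holds(x, y, p, tol=1e-12):
    """Check the triangle inequality for one pair of vectors."""
    s = [xi + yi for xi, yi in zip(x, y)]
    return lp_measure(s, p) <= lp_measure(x, p) + lp_measure(y, p) + tol

def f(x, y):
    """The cone function from Figure 4: f(x, y) = sqrt(x^2 + y^2)."""
    return math.hypot(x, y)

def difference_quotient(d, t):
    """(f(t*d) - f(0)) / t at the origin; t may be positive or negative."""
    return (f(t * d[0], t * d[1]) - f(0.0, 0.0)) / t

# Figure 2: two standard basis vectors in R^3 break the inequality for p = 0.5.
e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
for p in (0.5, 1.0, 2.0):
    print(f"p={p}: triangle inequality holds for (e1, e2)? {triangle_holds(e1, e2, p)}")

# Figure 4: the one-sided quotient along d = [3, 4] approaches ||d|| = 5,
# while along the x-axis the quotients from the right and left disagree,
# so the (two-sided) partial derivative at the origin does not exist.
print(difference_quotient((3.0, 4.0), 1e-6))
print(difference_quotient((1.0, 0.0), 1e-6), difference_quotient((1.0, 0.0), -1e-6))
```

For $p = 0.5$ the sum of the two basis vectors has measure $(1+1)^{2} = 4$, which exceeds $1 + 1 = 2$, so the inequality fails; for $p = 1$ and $p = 2$ it holds on this pair. A single counterexample is enough to disqualify a candidate norm, whereas passing on one pair proves nothing.
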

Theorems & Definitions (117)

  • Definition 1.1: Matlab Notation
  • Definition 1.2: Nonnegative Orthant, Positive Orthant, and Unit-Simplex
  • Definition 1.3: Eigenvalue, Eigenvector
  • Definition 1.4: Subspace
  • Definition 1.5: Span
  • Definition 1.6: Linearly Independent
  • Definition 1.7: Basis and Dimension
  • Definition 1.8: Column Space (Range)
  • Definition 1.9: Null Space (Nullspace, Kernel)
  • Definition 1.10: Rank
  • ...and 107 more