Unsupervised Machine Learning Hybrid Approach Integrating Linear Programming in Loss Function: A Robust Optimization Technique

Andrew Kiruluta; Andreas Lemos

TL;DR

The paper tackles enforcing linear-programming feasibility inside unsupervised representation learning by embedding LP constraints and objectives into the loss of an autoencoder. It introduces LP-AE, a differentiable, penalty-based framework that propagates constraint violations through a squared-hinge barrier, enabling end-to-end training without inner solvers or labelled optimal solutions. The authors prove coercivity, asymptotic feasibility, and an LP-gap bound, and demonstrate a 3× inference speedup and substantial throughput gains on real hospital scheduling data with high feasibility and small objective gaps. Empirically, LP-AE shows robustness to noise and missing features, and ablation studies illustrate the importance of penalty scheduling and latent dimensionality. The approach offers a practical, scalable, solver-free alternative for constraint-aware learning applicable to logistics, healthcare, and energy domains, with reproducible code and data provided.

Abstract

This paper presents a novel hybrid approach that integrates linear programming (LP) within the loss function of an unsupervised machine learning model. By leveraging the strengths of both optimization techniques and machine learning, this method introduces a robust framework for solving complex optimization problems where traditional methods may fall short. The proposed approach encapsulates the constraints and objectives of a linear programming problem directly into the loss function, guiding the learning process to adhere to these constraints while optimizing the desired outcomes. This technique not only preserves the interpretability of linear programming but also benefits from the flexibility and adaptability of machine learning, making it particularly well-suited for unsupervised or semi-supervised learning scenarios.

Paper Structure (54 sections, 2 theorems, 18 equations, 1 figure, 1 table, 1 algorithm)

This paper contains 54 sections, 2 theorems, 18 equations, 1 figure, 1 table, 1 algorithm.

Introduction
Contributions.
Related Work
Background
Linear Programming Essentials
Canonical primal form.
Lagrangian dual and strong duality.
Karush–Kuhn–Tucker (KKT) conditions.
Geometric intuition.
Complexity and scalability.
Sensitivity and parametric analysis.
Autoencoders for Unsupervised Representation
Deterministic formulation.
Probabilistic view and variational relaxation.
Regularized variants.
...and 39 more sections

Key Result

Lemma 1

Fix $\lambda>0$. The map $h(\boldsymbol z)= \lambda\,\phi(A\boldsymbol z-\boldsymbol b)- \mu\,\boldsymbol c^{\!\top}\boldsymbol z$ is coercive on $\mathbb{R}^{n}$; i.e. $h(\boldsymbol z)\!\to\!\infty$ as $\lVert\boldsymbol z\rVert\!\to\!\infty$. Consequently the empirical risk $\mathcal{J}(\theta)$

Figures (1)

Figure 1: Architecture of the LP-aware Autoencoder (LP-AE). Blue nodes depict the autoencoder core: an encoder $f_{\theta_E}\colon \mathbb{R}^{d}\to\mathbb{R}^{n}$ maps an input sample $\boldsymbol{x}$ to a latent decision vector $\hat{\boldsymbol{z}}=f_{\theta_E}(\boldsymbol{x})$, which the decoder $g_{\theta_D}$ reconstructs to $\hat{\boldsymbol{x}}$. Yellow nodes form the constraint branch. The affine map $A\hat{\boldsymbol{z}}-\boldsymbol{b}$ evaluates all $m$ linear constraints, and the squared-hinge barrier $\phi(\boldsymbol{u})=\sum_{j=1}^{m}\max\{0,u_j\}^{2}$ produces the violation loss $\lambda\,\phi(A\hat{\boldsymbol{z}}-\boldsymbol{b})$ that quadratically penalizes any component with $u_j>0$. Lavender nodes constitute the objective-bias branch: the linear objective $\boldsymbol{c}^{\top}\hat{\boldsymbol{z}}$ is scaled by a small factor $-\mu$ so that higher LP value reduces the total loss. Green nodes lie in data space; the reconstruction loss $\lVert\boldsymbol{x}-\hat{\boldsymbol{x}}\rVert_{2}^{2}$ (light green) and the two LP terms sum to the global objective $\mathcal{L}(\boldsymbol{x};\theta)= \lVert\boldsymbol{x}-g_{\theta_D}(f_{\theta_E}(\boldsymbol{x}))\rVert_{2}^{2} + \lambda\,\phi\bigl(A\hat{\boldsymbol{z}}-\boldsymbol{b}\bigr) - \mu\,\boldsymbol{c}^{\top}\hat{\boldsymbol{z}}.$ The red octagon aggregates these three components. Dashed grey arrows indicate gradients propagated by back-propagation; their cost is dominated by two dense products $A\hat{\boldsymbol{z}}$ and $A^{\top}\sigma(\cdot)$, giving per-sample complexity $\mathcal{O}(mn)$. After training, the latent code $\hat{\boldsymbol{z}}^{\star}$ is emitted (dashed black edge) as a feasible, near-optimal decision: as $\lambda\to\infty$, Proposition 1 guarantees $A\hat{\boldsymbol{z}}^{\star}\le\boldsymbol{b}$ and the objective gap is bounded by $(\lambda/\mu)\,\phi(\cdot)\to 0$.
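The loss in the caption can be sketched numerically. The snippet below is a minimal NumPy illustration, not the authors' implementation: the function names (`squared_hinge`, `lp_ae_loss`, `lp_terms_grad`) and the toy values of `A`, `b`, `c`, `x`, `x_hat`, and `z_hat` are assumptions made for the example, and the encoder and decoder are replaced by fixed vectors. It evaluates the three loss terms and the gradient of the two LP terms with respect to the latent vector, which uses the two dense products $A\hat{\boldsymbol{z}}$ and $A^{\top}(\cdot)$ mentioned in the caption.

```python
import numpy as np

def squared_hinge(u):
    """phi(u) = sum_j max(0, u_j)^2, the squared-hinge barrier from the caption."""
    return float(np.sum(np.maximum(0.0, u) ** 2))

def lp_ae_loss(x, x_hat, z_hat, A, b, c, lam, mu):
    """Reconstruction loss + lam * phi(A z - b) - mu * c^T z for one sample."""
    recon = float(np.sum((x - x_hat) ** 2))
    violation = lam * squared_hinge(A @ z_hat - b)
    objective_bias = -mu * float(c @ z_hat)
    return recon + violation + objective_bias

def lp_terms_grad(z_hat, A, b, c, lam, mu):
    """Gradient of lam * phi(A z - b) - mu * c^T z with respect to z.

    Costs one product A @ z and one product A.T @ (...), i.e. O(m n).
    """
    residual = np.maximum(0.0, A @ z_hat - b)  # active violations only
    return 2.0 * lam * (A.T @ residual) - mu * c

# Toy data (hypothetical values, chosen so both constraints are violated).
A = np.array([[1.0, 1.0], [1.0, 0.0]])
b = np.array([1.0, 0.5])
c = np.array([1.0, 2.0])
z_hat = np.array([0.8, 0.6])       # stand-in for the encoder output
x = np.array([1.0, 0.0, 2.0])      # input sample
x_hat = np.array([1.0, 0.5, 2.0])  # stand-in for the decoder output

loss = lp_ae_loss(x, x_hat, z_hat, A, b, c, lam=10.0, mu=0.1)
grad = lp_terms_grad(z_hat, A, b, c, lam=10.0, mu=0.1)
print(loss, grad)
```

In a full model, an automatic-differentiation framework would chain this gradient through the encoder parameters; the closed form is shown here only to make the $\mathcal{O}(mn)$ cost visible.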

Theorems & Definitions (4)

Lemma 1: Coercivity of the penalized objective
Proof
Proposition 1: Asymptotic feasibility
Proof (sketch)
