A Unified Framework for Lifted Training and Inversion Approaches

Xiaoyu Wang, Alexandra Valavanis, Azhir Mahmood, Andreas Mang, Martin Benning, Audrey Repetti

TL;DR

The paper addresses the challenges of training deep neural networks with backpropagation and the ill-posedness of inverse problems by introducing a unified lifted framework that replaces nested optimisation with higher-dimensional constrained formulations. It integrates MAC-QP, Fenchel Lifted, Lifted Contrastive, and Lifted Bregman approaches to enable layer-wise decoupling, support non-differentiable proximal activations, and facilitate distributed optimisation via block-coordinate methods and implicit stochastic gradient updates. The authors provide a comprehensive treatment across architecture families (perceptrons, MLPs, ResNets, proximal networks) and demonstrate applicability to inverse problems, including stable inversion and learning within unrolled/proximal networks, supported by numerical results on imaging tasks. The framework offers practical advantages in conditioning, robustness, and scalability, and yields rigorous convergence insights for single-layer network inversion while highlighting open questions for deep networks. Overall, it presents a principled, versatile pathway to combine convex-analytic lifting with modern neural architectures for robust learning and inverse problem solving.

Abstract

The training of deep neural networks predominantly relies on a combination of gradient-based optimisation and back-propagation for the computation of the gradient. While incredibly successful, this approach faces challenges such as vanishing or exploding gradients, difficulties with non-smooth activations, and an inherently sequential structure that limits parallelisation. Lifted training methods offer an alternative by reformulating the nested optimisation problem into a higher-dimensional, constrained optimisation problem where the constraints are no longer enforced directly but penalised with penalty terms. This chapter introduces a unified framework that encapsulates various lifted training strategies, including the Method of Auxiliary Coordinates, Fenchel Lifted Networks, and Lifted Bregman Training, and demonstrates how diverse architectures, such as Multi-Layer Perceptrons, Residual Neural Networks, and Proximal Neural Networks fit within this structure. By leveraging tools from convex optimisation, particularly Bregman distances, the framework facilitates distributed optimisation, accommodates non-differentiable proximal activations, and can improve the conditioning of the training landscape. We discuss the implementation of these methods using block-coordinate descent strategies, including deterministic implementations enhanced by accelerated and adaptive optimisation techniques, as well as implicit stochastic gradient methods. Furthermore, we explore the application of this framework to inverse problems, detailing methodologies for both the training of specialised networks (e.g., unrolled architectures) and the stable inversion of pre-trained networks. Numerical results on standard imaging tasks validate the effectiveness and stability of the lifted Bregman approach compared to conventional training, particularly for architectures employing proximal activations.
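
The lifting idea described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes a quadratic penalty in the spirit of the Method of Auxiliary Coordinates, a tiny two-layer ReLU network, and a squared-error loss. All function names, weights, and the penalty weight `lam` are hypothetical choices made for illustration. It shows that the lifted objective treats each layer output as a free auxiliary variable, and that it coincides with the conventional nested loss when those variables equal the forward-pass activations.

```python
# Sketch of a quadratic-penalty lifted objective for a tiny ReLU MLP.
# Assumption: penalty 0.5 * ||x_l - relu(W_l x_{l-1} + b_l)||^2 per layer;
# all names and numbers below are illustrative, not taken from the paper.

def relu(v):
    return [max(0.0, t) for t in v]

def affine(W, b, x):
    # Computes W x + b for a matrix W stored as a list of rows.
    return [sum(w * t for w, t in zip(row, x)) + bi for row, bi in zip(W, b)]

def sq_dist(u, v):
    return sum((a - c) ** 2 for a, c in zip(u, v))

def forward(params, x):
    # Conventional nested evaluation; returns the output of every layer.
    outputs = []
    for W, b in params:
        x = relu(affine(W, b, x))
        outputs.append(x)
    return outputs

def lifted_objective(params, aux, x0, y, lam):
    # Data term on the last auxiliary variable plus one penalty per layer.
    # aux[l] stands in for the output of layer l and is a free variable,
    # so each penalty term only couples two neighbouring layers.
    loss = 0.5 * sq_dist(aux[-1], y)
    penalty = 0.0
    prev = x0
    for (W, b), x_l in zip(params, aux):
        penalty += 0.5 * sq_dist(x_l, relu(affine(W, b, prev)))
        prev = x_l
    return loss + lam * penalty

params = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]),  # layer 1: 2 -> 2
    ([[1.0, 1.0]], [0.5]),                     # layer 2: 2 -> 1
]
x0 = [1.0, 2.0]
y = [3.0]

# Auxiliary variables taken from the forward pass: all penalties vanish.
aux_consistent = forward(params, x0)
nested_loss = 0.5 * sq_dist(aux_consistent[-1], y)
lifted_at_consistent = lifted_objective(params, aux_consistent, x0, y, lam=10.0)

# Auxiliary variables that fit the data but violate the layer constraints.
aux_perturbed = [[0.0, 3.0], [3.0]]
lifted_at_perturbed = lifted_objective(params, aux_perturbed, x0, y, lam=10.0)
```

Because each penalty term involves only the parameters of one layer and the auxiliary variables on either side of it, the terms can be minimised block by block, which is the layer-wise decoupling the abstract refers to. The Bregman variant discussed in the chapter replaces this quadratic penalty with a different penalty function; that variant is not shown here.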

Paper Structure

This paper contains 37 sections, 104 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Left: Lifted Bregman Objective decay curves for different optimisers at the gradient step. Right: PSNR curves for the reconstructed images during training. (Both plotted every 100 steps.)
  • Figure 2: (Deblurring $\ell_2$-error and PSNR curves) Left: $\ell_2$ reconstruction error decay tracked along training steps. Right: PSNR value tracked along training steps. (Results are shown for $\lambda = 0.02$ and $\lambda = 0.2$, respectively.)
  • Figure 3: (Denoising $\ell_2$-error and PSNR curves) Left: $\ell_2$ reconstruction error decay tracked along training steps. Right: PSNR value tracked along training steps. (Results are shown for $\lambda = 0.02$ and $\lambda = 0.2$, respectively.)
  • Figure 4: (Inpainting $\ell_2$-error and PSNR curves) Left: $\ell_2$ reconstruction error decay tracked along training steps. Right: PSNR value tracked along training steps. (Results are shown for $\lambda = 0.02$ and $\lambda = 0.2$, respectively.)
  • Figure 5: (Deblurred image visualisations) Left: Deblurred sample images from the training dataset and error maps for both strategies. Right: Deblurred sample images from the validation dataset and error maps for both strategies. (Results are shown for $\lambda = 0.02$ and $\lambda = 0.2$, respectively.)
  • ...and 5 more figures

Theorems & Definitions (6)

  • Remark 1: Conventional Training
  • Example 1: MAC-QP for MLP
  • Example 2: Classical Lifted for MLP
  • Example 3: Fenchel Lifted for MLP
  • Example 4: Lifted Bregman for MLP
  • Remark 2: Connection to Fenchel Lifted Networks