Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization

Ahmet Kaplan

Abstract

This study combines automated machine learning (AutoML) with model-based deep unfolding (DU) to optimize wireless beamforming and waveforms. We convert the iterative proximal gradient descent (PGD) algorithm into a deep neural network in which the parameters of each layer are learned rather than predetermined. We further extend the architecture with a hybrid layer that applies a learnable linear transformation to the gradient before the proximal projection. Hyperparameter optimization (HPO) uses AutoGluon with a tree-structured Parzen estimator (TPE) over an expanded search space that includes network depth, step-size initialization, optimizer, learning rate scheduler, layer type, and post-gradient activation. The resulting auto-unrolled PGD (Auto-PGD) achieves 98.8% of the spectral efficiency of a traditional 200-iteration PGD solver using only five unrolled layers and only 100 training samples. We also address a gradient normalization issue to ensure consistent performance between training and evaluation, and we illustrate per-layer sum-rate logging as a tool for transparency. Together, these contributions notably reduce the training data and inference cost required while maintaining high interpretability relative to conventional black-box architectures.
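To make the unrolling idea concrete, the following is a minimal NumPy sketch of the forward pass of an unrolled PGD network with per-layer step sizes, an optional per-layer linear gradient transformation, and per-layer sum-rate logging. It is an illustration under stated assumptions, not the paper's implementation: the multi-user MISO sum-rate objective, the total-power projection, and all names (`sum_rate`, `sum_rate_grad`, `project_power`, `unrolled_pgd`, `transforms`) and numeric values are hypothetical choices made here. Training of the per-layer parameters, the AutoGluon search, and the gradient normalization fix are not shown.

```python
import numpy as np

def sum_rate(H, W, sigma2=1.0):
    """Sum rate in bits/s/Hz; row k of H is h_k^H, column j of W is w_j."""
    S = np.abs(H @ W) ** 2                      # S[k, j] = |h_k^H w_j|^2
    sig = np.diag(S)                            # desired-signal powers
    interf = S.sum(axis=1) - sig + sigma2       # interference plus noise
    return float(np.sum(np.log2(1.0 + sig / interf)))

def sum_rate_grad(H, W, sigma2=1.0):
    """Wirtinger gradient dR/dW^* (the ascent direction for the sum rate)."""
    A = H @ W
    S = np.abs(A) ** 2
    tot = S.sum(axis=1, keepdims=True) + sigma2
    interf = tot - np.diag(S)[:, None]
    off_diag = 1.0 - np.eye(A.shape[0])
    G = A * (1.0 / tot - off_diag / interf) / np.log(2.0)
    return H.conj().T @ G

def project_power(W, P):
    """Projection onto {W : ||W||_F^2 <= P}, i.e. the proximal operator
    of the indicator function of the total-power constraint."""
    norm2 = np.sum(np.abs(W) ** 2)
    return W if norm2 <= P else W * np.sqrt(P / norm2)

def unrolled_pgd(H, W0, step_sizes, transforms=None, P=1.0, sigma2=1.0):
    """Forward pass of an unrolled PGD network with one layer per step size.

    step_sizes[k] plays the role of the learnable step size of layer k.
    transforms[k] (optional, M x M) is an assumed form of the hybrid layer:
    a linear map applied to the gradient before the projection.
    """
    W = project_power(W0, P)
    log = [sum_rate(H, W, sigma2)]              # per-layer sum-rate log
    for k, alpha in enumerate(step_sizes):
        g = sum_rate_grad(H, W, sigma2)
        if transforms is not None:
            g = transforms[k] @ g
        W = project_power(W + alpha * g, P)     # ascent step, then projection
        log.append(sum_rate(H, W, sigma2))
    return W, log

# Hypothetical example: 4 users, 8 transmit antennas, 5 unrolled layers.
rng = np.random.default_rng(0)
K, M = 4, 8
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
W0 = H.conj().T                                 # matched-filter initialization
W, rates = unrolled_pgd(H, W0, step_sizes=[0.1] * 5, P=10.0)
```

Here `rates[0]` is the sum rate of the projected initialization and `rates[k]` is the sum rate after layer `k`, which is the kind of per-layer trace that makes an unrolled network easier to inspect than a black-box model. In a trained network the entries of `step_sizes` and `transforms` would be learned parameters rather than fixed constants.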

Paper Structure (25 sections, 5 equations, 4 figures, 3 tables)


Figures (4)

  • Figure 1: Abstract workflow of Auto-Unrolled Proximal Gradient Descent: classical PGD is transformed into a deep-unfolded network, AutoGluon automates architecture search and hyperparameter optimization, and the system outputs optimized beamforming vectors for wireless waveform optimization.
  • Figure 2: Sum-rate vs. training set size for all methods. Auto-PGD peaks at $N=10^3$ ($14.63$ bits/s/Hz, within $1.2\%$ of Classical PGD) and maintains strong performance down to $N=10^2$, while black-box baselines plateau well below the PGD-based methods.
  • Figure 3: AutoGluon HPO search history across 50 trials for each training size. Each point represents a candidate architecture evaluated during the search; the best configuration per size is highlighted.
  • Figure 4: Training loss curves over 200 epochs for all learned methods at $N=100$ and $N=1000$. Auto-PGD converges faster and to a lower loss than PGD-Net and the black-box baselines in both data regimes.