Kernel-Level Energy-Efficient Neural Architecture Search for Tabular Dataset

Hoang-Loc La, Phuong Hoai Ha

TL;DR

The paper addresses the problem of reducing energy consumption in neural networks for tabular data by introducing a kernel-level energy-predictive NAS framework. It extends the nn-Meter latency-prediction paradigm to energy estimation, accounts for parallel kernel execution on NVIDIA GPUs, and uses a weight-entanglement-based one-shot NAS across three tabular-focused search spaces (MLP, ResNet, FTTransformer) guided by a policy-gradient objective that trades off energy and accuracy. Empirical results on TabZilla datasets and real-world data show substantial energy savings (up to ~92% over conventional NAS) with comparable accuracy, across edge and desktop NVIDIA devices, while addressing practical profiling challenges. The work concludes with a plan to incorporate meta-learning to reduce hardware-specific data collection and improve adaptability to new devices and search spaces, highlighting practical implications for energy-efficient deployment of tabular neural models.
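The kernel-level idea summarized above can be illustrated with a minimal sketch: a model is decomposed into kernels, each kernel's energy is predicted separately, and the sum feeds a reward that trades accuracy against energy. Everything named here is an assumption made for illustration and does not come from the paper: the `Kernel` type, the toy predictors in `PREDICTORS`, their coefficients, the arbitrary energy units, and the linear form of `reward`. The sketch also ignores parallel kernel execution, which the paper explicitly accounts for.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Kernel:
    """One execution unit of a model (hypothetical type for this sketch)."""
    kind: str          # kernel type, used to select a predictor
    in_features: int
    out_features: int


# Hypothetical per-kernel-type predictors with made-up coefficients.
# Real predictors would be regressors fitted on energy measurements
# profiled on the target device.
PREDICTORS: Dict[str, Callable[[Kernel], float]] = {
    "linear": lambda k: 1e-6 * k.in_features * k.out_features,
    "linear_relu": lambda k: 1.1e-6 * k.in_features * k.out_features,
}


def estimate_energy(kernels: List[Kernel]) -> float:
    """Kernel-level estimate: sum the predicted energy of every kernel."""
    return sum(PREDICTORS[k.kind](k) for k in kernels)


def reward(accuracy: float, energy: float, trade_off: float = 0.5) -> float:
    """Assumed scalar reward: accuracy minus a weighted energy penalty."""
    return accuracy - trade_off * energy


# Usage: a two-layer MLP described as a list of kernels.
mlp = [Kernel("linear_relu", 10, 100), Kernel("linear", 100, 2)]
mlp_energy = estimate_energy(mlp)
mlp_reward = reward(accuracy=0.9, energy=mlp_energy)
```

A search procedure would compare such rewards across candidate architectures; the weight `trade_off` controls how strongly energy is penalized relative to accuracy.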

Abstract

Many studies estimate energy consumption using proxy metrics like memory usage, FLOPs, and inference latency, with the assumption that reducing these metrics will also lower energy consumption in neural networks. This paper, however, takes a different approach by introducing an energy-efficient Neural Architecture Search (NAS) method that directly focuses on identifying architectures that minimize energy consumption while maintaining acceptable accuracy. Unlike previous methods that primarily target vision and language tasks, the approach proposed here specifically addresses tabular datasets. Remarkably, the optimal architecture suggested by this method can reduce energy consumption by up to 92% compared to architectures recommended by conventional NAS.

Paper Structure

This paper contains 20 sections, 2 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Kernel-based Energy Model.
  • Figure 2: Examples of parallelizable kernels from GoogLeNet (Szegedy et al., 2015). The light red rectangles denote parallelizable kernels. The red rectangle denotes the newly generated kernels.
  • Figure 3: Measured power consumption of common CNNs on the Jetson AGX Orin before and after adjustments, with lines ending when inference stops.
  • Figure 4: Left: The difference between the weight-entanglement supernet and weight-sharing supernet for MLP search space. Right: Overall architecture of the three supernets for tabular search spaces, showing selected components in solid lines and unselected components in dashed lines. All supernets share the same macro backbone, with differences in block configurations.
  • Figure 5: Overview of the multi-convolution network and the single merged convolution network used in our micro-benchmark.
  • ...and 1 more figure