LPZero: Language Model Zero-cost Proxy Search from Zero

Peijie Dong, Lujun Li, Xiang Liu, Zhenheng Tang, Xuebo Liu, Qiang Wang, Xiaowen Chu

TL;DR

The paper tackles the high computational cost of Neural Architecture Search by introducing LPZero, a framework that automatically designs zero-cost proxies for language models. Proxies are modeled as symbolic expressions within a unified search space, and genetic programming plus a Rule-based Pruning Strategy are used to maximize ranking fidelity to ground-truth performance. LPZero demonstrates superior ranking correlations on FlexiBERT and GPT-2 benchmarks and yields competitive, cost-efficient sub-networks for LLaMA when integrated with LoNAS. This approach offers a practical, training-free estimator for guiding NAS in large NLP models, reducing compute while preserving ranking quality and downstream performance.

Abstract

Despite its outstanding performance, Neural Architecture Search (NAS) is criticized for its massive computational cost. Recently, Zero-shot NAS has emerged as a promising approach by exploiting Zero-cost (ZC) proxies, which markedly reduce computational demands. However, existing ZC proxies rely heavily on expert knowledge and incur significant trial-and-error costs. In NLP tasks in particular, most existing ZC proxies fail to surpass the naive baseline. To address these challenges, we introduce a novel framework, LPZero, which is the first to automatically design ZC proxies for various tasks, achieving higher ranking consistency than human-designed proxies. Specifically, we model the ZC proxy as a symbolic equation and define a unified proxy search space, built from a predefined set of mathematical symbols, that encompasses existing ZC proxies. To search heuristically for the best ZC proxy, LPZero uses genetic programming to find the optimal symbolic composition. We propose a Rule-based Pruning Strategy (RPS), which preemptively eliminates unpromising proxies, thereby mitigating the risk of proxy degradation. Extensive experiments on FlexiBERT, GPT-2, and LLaMA-7B demonstrate LPZero's superior ranking ability and downstream-task performance compared to current approaches.
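To make the search loop described in the abstract concrete, here is a minimal sketch of a genetic search over symbolic proxies with a pruning step. It is an illustration under stated assumptions, not the authors' implementation: the primitive sets (`UNARY`, `BINARY`), the input statistics (`weight_norm`, `grad_norm`), the five-slot proxy encoding, the pruning rule in `violates_rules`, and the synthetic data are all hypothetical. Fitness is Kendall's tau between proxy scores and a made-up ground truth.

```python
import math
import random

# Hypothetical primitive sets; the paper's actual search space is not reproduced here.
UNARY = {
    "id": lambda v: v,
    "log": lambda v: math.log(abs(v) + 1e-8),
    "sqrt": lambda v: math.sqrt(abs(v)),
    "neg": lambda v: -v,
    "zero": lambda v: 0.0,  # degenerate on purpose, to exercise the pruning rule
}
BINARY = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
INPUTS = ("weight_norm", "grad_norm")  # made-up per-architecture statistics


def random_proxy(rng):
    """A proxy is the tuple (unary_a, input_a, binary, unary_b, input_b)."""
    return (rng.choice(list(UNARY)), rng.choice(INPUTS), rng.choice(list(BINARY)),
            rng.choice(list(UNARY)), rng.choice(INPUTS))


def score(proxy, arch):
    """Evaluate the symbolic expression on one architecture's statistics."""
    ua, ia, b, ub, ib = proxy
    return BINARY[b](UNARY[ua](arch[ia]), UNARY[ub](arch[ib]))


def violates_rules(proxy, archs):
    """Assumed pruning rule: reject proxies with non-finite or constant scores."""
    scores = [score(proxy, a) for a in archs]
    return any(not math.isfinite(s) for s in scores) or len(set(scores)) < 2


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n, total = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


def fitness(proxy, archs, truth):
    return kendall_tau([score(proxy, a) for a in archs], truth)


def crossover(p1, p2, rng):
    """Single-point crossover; slot types line up because both parents share one layout."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def mutate(proxy, rng):
    """Resample one slot from a freshly drawn proxy (keeps the slot's type)."""
    idx = rng.randrange(len(proxy))
    return proxy[:idx] + (random_proxy(rng)[idx],) + proxy[idx + 1:]


def search(archs, truth, generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    population = []
    while len(population) < pop_size:
        cand = random_proxy(rng)
        if not violates_rules(cand, archs):
            population.append(cand)
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, archs, truth), reverse=True)
        child = mutate(crossover(population[0], population[1], rng), rng)
        if violates_rules(child, archs):
            continue  # pruned before its fitness is computed
        population[-1] = child  # replace the weakest individual
    return max(population, key=lambda p: fitness(p, archs, truth))


# Synthetic data: 20 fake architectures and a fake "ground truth" ranking target.
data_rng = random.Random(1)
archs = [{"weight_norm": data_rng.uniform(1, 10), "grad_norm": data_rng.uniform(0.1, 2)}
         for _ in range(20)]
truth = [a["weight_norm"] * a["grad_norm"] for a in archs]
best = search(archs, truth)
print(best, fitness(best, archs, truth))
```

In this toy setup the pruning check is much cheaper than anything it guards, so it only shows where such a filter sits in the loop. With real models, the ground-truth side of the fitness comes from trained architectures, which is where discarding candidates early would save effort.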

Paper Structure

This paper contains 33 sections, 18 equations, 12 figures, 15 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Proxy Search space of LPZero framework.
  • Figure 2: Genetic programming process of LPZero.
  • Figure 3: Illustration of Crossover and Mutation.
  • Figure 4: Spearman's $\rho$ and Kendall's $\tau$ Correlation of training-free proxies with GLUE Score across 500 architectures randomly sampled from FlexiBERT benchmark.
  • Figure 5: Performance comparison of evolution search with and without the Rule-based Pruning Strategy (RPS) and random search across iterations.
  • ...and 7 more figures
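
The two ranking metrics named in the Figure 4 caption, Spearman's $\rho$ and Kendall's $\tau$, can be computed as in the sketch below. It uses only the Python standard library. The proxy scores and GLUE scores are invented for illustration and are not results from the paper, and the Kendall variant shown is tau-a, which may differ from the variant the authors report when ties occur.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            concordant += 1
        elif prod < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


# Invented example: five architectures, with the top two swapped by the proxy.
proxy_scores = [0.2, 0.5, 0.1, 0.7, 0.9]
glue_scores = [71.0, 74.5, 70.2, 78.1, 75.0]
rho = spearman_rho(proxy_scores, glue_scores)
tau = kendall_tau(proxy_scores, glue_scores)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

Both metrics depend only on the ordering of the scores, not their magnitudes. That is the property a zero-cost proxy needs, since it only has to rank candidate architectures correctly.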