LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions

Hadi Askari, Shivanshu Gupta, Fei Wang, Anshuman Chhabra, Muhao Chen

TL;DR

LayerIF addresses the problem of variable training quality across layers in pretrained LLMs by introducing a data-driven, layer-specific Influence Function framework. It computes per-layer influence scores $I^{(l)}(z_i)$ for training examples $z_i$, aggregates the positive scores into a per-layer value $S^{(l)}$ that quantifies layer quality, and then maps these values to two practical adaptations: non-uniform LoRA-MoE expert allocation and layer-wise pruning budgets. Across Mistral-7B-v0.1 and Gemma-7b, LayerIF yields consistent improvements over strong baselines in both LoRA-MoE expert allocation and pruning, demonstrating dataset-specific layer specialization and model-agnostic applicability. The approach provides a principled, data-aware alternative to model-centric heuristics, with potential to improve efficiency and interpretability in large-scale deployment of LLMs.

Abstract

Pretrained Large Language Models (LLMs) achieve strong performance across a wide range of tasks, yet their layers vary substantially in training quality with respect to specific downstream applications, which limits downstream performance. It is therefore critical to estimate layer-wise training quality in a manner that accounts for both model architecture and training data. However, existing approaches predominantly rely on model-centric heuristics (such as spectral statistics, outlier detection, or uniform allocation) while overlooking the influence of data. To address these limitations, we propose LayerIF, a data-driven framework that leverages Influence Functions to quantify the training quality of individual layers in a principled and task-sensitive manner. By isolating each layer's gradients and computing layer-wise influences, which measure the sensitivity of the validation loss to individual training examples, we derive data-driven estimates of layer importance. Notably, our method produces task-specific layer importance estimates for the same LLM, revealing how layers specialize for different test-time evaluation tasks. We demonstrate the utility of our scores by leveraging them for two downstream applications: (a) expert allocation in LoRA-MoE architectures and (b) layer-wise sparsity distribution for LLM pruning. Experiments across multiple LLM architectures demonstrate that our model-agnostic, influence-guided allocation leads to consistent gains in task performance.
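
As a rough illustration of the layer-wise influence idea in the abstract, the sketch below scores each training example for a single layer using only that layer's gradients. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `layer_influences` and `aggregate_positive` are hypothetical, the curvature term is a damped diagonal approximation chosen for brevity, and the sign follows the classical influence-function form (negative inner product). The excerpt does not state the paper's sign convention or its inverse-Hessian approximation.

```python
from typing import List, Sequence


def layer_influences(
    train_grads: Sequence[Sequence[float]],
    val_grad: Sequence[float],
    damping: float = 1e-2,
) -> List[float]:
    """Influence of each training example on the validation loss for ONE layer.

    train_grads: per-example loss gradients w.r.t. this layer's parameters only.
    val_grad:    validation-loss gradient w.r.t. the same parameters.
    """
    n = len(train_grads)
    d = len(val_grad)
    # Assumption: diagonal curvature proxy (mean squared gradient plus damping)
    # stands in for the layer's Hessian.
    curvature = [
        sum(g[j] ** 2 for g in train_grads) / n + damping for j in range(d)
    ]
    # Inverse-curvature-vector product with the validation gradient.
    ihvp = [val_grad[j] / curvature[j] for j in range(d)]
    # Classical form: I(z_i) = -grad_val^T H^{-1} grad_i, restricted to the layer.
    return [-sum(g[j] * ihvp[j] for j in range(d)) for g in train_grads]


def aggregate_positive(scores: Sequence[float]) -> float:
    """Hypothetical per-layer summary: the sum of the positive influence scores."""
    return sum(s for s in scores if s > 0)


# Toy gradients for one layer with two parameters and three training examples.
toy_train_grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
toy_val_grad = [1.0, 0.0]
toy_scores = layer_influences(toy_train_grads, toy_val_grad, damping=1.0)
toy_layer_score = aggregate_positive(toy_scores)
```

Repeating this per Transformer block, with gradients taken only with respect to that block's parameters, would give one summary value per layer. For a real LLM the per-layer gradients would come from an autodiff framework, and a more faithful inverse-Hessian approximation would likely replace the diagonal proxy.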

Paper Structure

This paper contains 23 sections, 4 equations, 4 figures, and 11 tables.

Figures (4)

  • Figure 1: Overview of the LayerIF pipeline (green arrows). We quantify per-layer quality in a pretrained LLM using Influence Functions (IFs), demonstrating that the same model produces distinct layer-wise IF scores across different datasets, revealing dataset-specific specialization. The pipeline begins by extracting gradients at each Transformer block of the LLM, computed separately for each dataset. These gradients and the dataset are then used to estimate layer-wise IF scores that serve as data-driven proxies for layer quality. To demonstrate the utility of these scores, we consider two downstream applications: (a) the allocation of optimal experts per layer in a LoRA-MoE architecture, and (b) the computation of structured layer-wise sparsity ratios for model pruning. For each task, we apply a dedicated mapping function to transform the raw IF scores into task-specific layer importance measures. In contrast, pre-existing methods use only model information or heuristics to compute these metrics, as indicated by the red arrows. This figure is a conceptual illustration contrasting our method with the baselines; the shown values are illustrative, not experimental.
  • Figure 2: Comparison of AlphaLoRA, MoLA, and LayerIF for varying numbers of total experts (80, 160, 224).
  • Figure 3: Mean accuracy across 4 sparsity levels (20%–50%) for Mistral-7B-v0.1 pruned using SparseGPT.
  • Figure 4: Heatmap showing expert allocations across Transformer layers for Mistral-7B-v0.1 at a total of 160 experts, comparing our dataset-specific approach LayerIF to AlphaLoRA and MoLA. Unlike AlphaLoRA and MoLA, which apply the same allocations to every test task, LayerIF adapts allocations based on the dataset. This is reflected in the diverse allocation patterns across the LayerIF rows, where darker shades indicate higher expert allocation to that particular layer.
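
The captions for Figures 1 and 4 describe a mapping from per-layer scores to a per-layer expert count under a fixed total budget. The sketch below shows one way such a mapping could look. It is an assumption for illustration only: the excerpt does not specify the paper's mapping function, and the name `allocate_experts`, the proportional rule, the `min_per_layer` floor, and the largest-remainder rounding are all hypothetical choices.

```python
from typing import List, Sequence


def allocate_experts(
    layer_scores: Sequence[float],
    total_experts: int,
    min_per_layer: int = 1,
) -> List[int]:
    """Split a fixed expert budget across layers in proportion to their scores.

    Hypothetical mapping: every layer first receives `min_per_layer` experts,
    and the remaining budget is shared proportionally to the non-negative scores.
    """
    n = len(layer_scores)
    spare = total_experts - n * min_per_layer
    if spare < 0:
        raise ValueError("total_experts is too small for min_per_layer")

    clipped = [max(s, 0.0) for s in layer_scores]
    total = sum(clipped)
    # Fall back to a uniform split when no layer has a positive score.
    weights = [c / total for c in clipped] if total > 0 else [1.0 / n] * n

    ideal = [w * spare for w in weights]
    alloc = [min_per_layer + int(x) for x in ideal]  # int() floors non-negatives

    # Largest-remainder rounding: hand leftover experts to the layers whose
    # ideal share was truncated the most, so the total matches the budget.
    leftover = total_experts - sum(alloc)
    by_remainder = sorted(range(n), key=lambda i: ideal[i] - int(ideal[i]), reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


# Toy example: four layers sharing a budget of twelve experts.
toy_allocation = allocate_experts([1.0, 3.0, 0.0, 4.0], total_experts=12)
```

Because the scores are computed per dataset, feeding a different dataset's scores into the same function yields a different allocation, which is the behaviour the Figure 4 heatmap depicts for the LayerIF rows.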