LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions
Hadi Askari, Shivanshu Gupta, Fei Wang, Anshuman Chhabra, Muhao Chen
TL;DR
LayerIF addresses the problem that training quality varies across the layers of a pretrained LLM by introducing a data-driven, layer-specific Influence Function framework. It computes per-layer influence scores $I^{(l)}(z_i)$ for each training example $z_i$, aggregates the positive scores into a per-layer value $S^{(l)}$ that quantifies layer quality, and maps these values to two practical adaptations: non-uniform LoRA-MoE expert allocation and layer-wise pruning budgets. On Mistral-7b-v0.1 and Gemma-7b, LayerIF yields consistent improvements over strong baselines in both the expert-allocation and pruning settings, and its scores differ by dataset for the same model, indicating dataset-specific layer specialization. The approach is model-agnostic and offers a principled, data-aware alternative to model-centric heuristics, with potential to improve efficiency and interpretability in large-scale LLM deployment.
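To make the mapping from per-layer scores to a non-uniform budget concrete, the following is a minimal sketch only. The summary above does not state the mapping itself, so the proportional rule, the largest-remainder rounding, the per-layer minimum, and the name `allocate_experts` are all assumptions made for illustration. Whether a higher score should receive more or fewer experts is also an assumption here (more).

```python
def allocate_experts(scores, total_experts, min_per_layer=1):
    """Split an expert budget across layers in proportion to per-layer scores.

    Hypothetical helper: the proportional rule and the rounding scheme are
    illustrative assumptions, not the mapping used by LayerIF.
    """
    n = len(scores)
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    if total_experts < n * min_per_layer:
        raise ValueError("budget too small for the per-layer minimum")

    # Give every layer its minimum, then share out what is left.
    remaining = total_experts - n * min_per_layer
    total = sum(scores)
    weights = [1.0 / n] * n if total == 0 else [s / total for s in scores]
    ideal = [w * remaining for w in weights]

    # Largest-remainder rounding keeps the integer counts summing to the budget.
    alloc = [min_per_layer + int(x) for x in ideal]
    leftover = total_experts - sum(alloc)
    by_fraction = sorted(range(n), key=lambda i: ideal[i] - int(ideal[i]), reverse=True)
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc


# Toy scores for a 4-layer model and a budget of 12 experts.
example_alloc = allocate_experts([1.0, 3.0, 0.0, 4.0], total_experts=12)
```

The same shape of rule could turn scores into per-layer sparsity ratios for pruning by normalizing them around a target average sparsity, again as an assumption rather than the documented procedure.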
Abstract
Pretrained Large Language Models (LLMs) achieve strong performance across a wide range of tasks, yet the training quality of their individual layers varies substantially with respect to specific downstream applications, which limits downstream performance. It is therefore critical to estimate layer-wise training quality in a manner that accounts for both model architecture and training data. However, existing approaches predominantly rely on model-centric heuristics (such as spectral statistics, outlier detection, or uniform allocation) and overlook the influence of data. To address these limitations, we propose LayerIF, a data-driven framework that leverages Influence Functions to quantify the training quality of individual layers in a principled and task-sensitive manner. By isolating each layer's gradients and computing layer-wise influences, which measure the sensitivity of the validation loss to individual training examples, we derive data-driven estimates of layer importance. Notably, our method produces task-specific layer importance estimates for the same LLM, revealing how layers specialize for different test-time evaluation tasks. We demonstrate the utility of our scores in two downstream applications: (a) expert allocation in LoRA-MoE architectures and (b) layer-wise sparsity distribution for LLM pruning. Experiments across multiple LLM architectures demonstrate that our model-agnostic, influence-guided allocation leads to consistent gains in task performance.
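As a rough illustration of what a layer-wise influence computation can look like, the sketch below restricts the gradients to one layer at a time and scores each training example against the validation gradient. It assumes the classical influence form $-\nabla L_{\text{val}}^\top H^{-1} \nabla L(z_i)$ with the Hessian replaced by a scaled identity (`damping`). That approximation, the sign convention, the toy gradients, and the function names are assumptions for illustration, not the paper's estimator. The last step sums the positive scores per layer; how a positive score should be interpreted depends on the sign convention chosen.

```python
def dot(u, v):
    """Inner product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def layer_influences(train_grads, val_grad, damping=1.0):
    """Per-layer influence of each training example on the validation loss.

    train_grads: list (one entry per training example) of {layer: gradient}.
    val_grad:    {layer: gradient} of the validation loss.
    Assumption: the Hessian is approximated by damping * identity, so the
    score reduces to a scaled, negated gradient inner product per layer.
    """
    return {
        layer: [-dot(g_val, g[layer]) / damping for g in train_grads]
        for layer, g_val in val_grad.items()
    }


def aggregate_positive(influences):
    """Sum the positive influence scores within each layer."""
    return {layer: sum(s for s in vals if s > 0) for layer, vals in influences.items()}


# Toy gradients for two layers and two training examples (made-up numbers).
val_grad = {"layer0": [1.0, 0.0], "layer1": [0.0, 2.0]}
train_grads = [
    {"layer0": [1.0, 1.0], "layer1": [0.0, -1.0]},
    {"layer0": [-3.0, 0.0], "layer1": [1.0, 1.0]},
]

influences = layer_influences(train_grads, val_grad)
layer_scores = aggregate_positive(influences)
```

In practice the per-layer gradients would come from backpropagation through the model rather than hand-written lists, and the inverse-Hessian term would typically be estimated with a dedicated approximation instead of an identity matrix.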
