The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Pratyusha Sharma; Jordan T. Ash; Dipendra Misra

TL;DR

The paper shows that selective, post-training rank reductions in Transformer weight matrices—especially in late-layer MLPs—can surprisingly improve reasoning and factual accuracy without additional data or training. By replacing selected matrices with their low-rank approximations, LASER acts as a denoising mechanism that suppresses noisy higher-order components while preserving useful low-order information. The approach yields substantial gains on CounterFact and related NLP benchmarks, generalizes across models and even extends to non-text domains like reinforcement learning tasks, though it can modestly worsen language modeling perplexity. These findings challenge the notion that more parameters and data are always beneficial and offer a training-free path to enhance reasoning in large language models. They also raise questions about how higher-order components encode information and why later-layer MLPs are particularly amenable to improvement, pointing to future work on model internals and cross-architecture effects.

Abstract

Transformer-based Large Language Models (LLMs) have become a fixture in modern machine learning. Correspondingly, significant resources are allocated towards research that aims to further advance this technology, typically resulting in models of increasing size that are trained on increasing amounts of data. This work, however, demonstrates the surprising result that it is often possible to significantly improve the performance of LLMs by selectively removing higher-order components of their weight matrices. This simple intervention, which we call LAyer-SElective Rank reduction (LASER), can be done on a model after training has completed, and requires no additional parameters or data. We show extensive experiments demonstrating the generality of this finding across language models and datasets, and provide in-depth analyses offering insights into both when LASER is effective and the mechanism by which it operates.
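
The core operation described above, replacing a weight matrix with a low-rank approximation, can be sketched with a truncated SVD. This is a minimal illustration in NumPy, not the authors' implementation: the function name `rank_k_approximation`, the random stand-in matrix `W`, and the choice `k=2` are assumptions made for the example, and the selection of which layer and matrix to reduce is not shown.

```python
import numpy as np


def rank_k_approximation(W: np.ndarray, k: int) -> np.ndarray:
    """Return the best rank-k approximation of W (in Frobenius norm) via truncated SVD."""
    if not 1 <= k <= min(W.shape):
        raise ValueError("k must be between 1 and min(W.shape)")
    # Compact SVD: U is (m, p), S is (p,), Vt is (p, n), with p = min(m, n).
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Singular values come back in descending order, so the first k entries
    # are the largest ones; everything after them is discarded.
    return (U[:, :k] * S[:k]) @ Vt[:k, :]


# Hypothetical stand-in for one weight matrix of a trained model.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
W_lr = rank_k_approximation(W, k=2)

print("rank of W:   ", np.linalg.matrix_rank(W))
print("rank of W_lr:", np.linalg.matrix_rank(W_lr))
```

The approximation has the same shape as the original matrix, so it can be swapped in without changing the surrounding architecture, which is consistent with the claim that the intervention needs no additional parameters or data.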

Paper Structure (64 sections, 10 figures, 5 tables)

Introduction
Related work
How facts are stored.
Model compression.
Low-rank approximations of weight matrices.
Model distillation and low-rank training.
Preliminaries
Maths Notation.
Transformer Architecture.
Rank-$r$ Approximation and SVD.
LAyer SElective Rank Reduction (LASER)
Experiments
GPT-J, CounterFact and PILE.
A Thorough Analysis with GPT-J on the CounterFact Dataset
Improved accuracy and robustness to paraphrases.
...and 49 more sections

Figures (10)

Figure 1: LAyer SElective Rank reduction (${\tt LASER}$) replaces a specific weight matrix $W$ of the Transformer model by its rank-$k$ approximation, $W_{\textrm{LR}}$, and observes the change in the model's behavior. We find that this rank approximation, especially for MLP weights in the later layers of the model, often offers surprising benefits to model performance.
Figure 2: The effect of rank reduction across different layer types is not uniform. Here we show the effect of rank reduction for GPT-J as studied on the CounterFact dataset. The dashed line is the unmodified network's loss. In the attention layers (key, query, value, out matrices), while it is clear matrices could be significantly rank-reduced without damaging the learned hypothesis, there is very little performance increase. However, for the multi-layer perceptron (MLP) layers, rank reduction goes from uniformly harming to improving the model's performance (around layer 20).
Figure 3: Which datapoints benefit from ${\tt LASER}$? We analyze how frequently in the training data "corrected" facts occur. GPT-J is an ideal test bed for such analysis since its training data ($\mathcal{D}_{Train}$), the PILE dataset, is publicly available. (a) For GPT-J evaluated on CounterFact ($\mathcal{D}_{QA}$) we retrieve all the datapoints in $\mathcal{D}_{Train}$ that contain a mention of both the entity of interest and the answer that correspond to each sample in $\mathcal{D}_{QA}$. (b) A plot depicting the cumulative top-10 accuracy of the model on all datapoints that occur in the training data less than or equal to the frequency indicated on the x-axis. Here we show accuracy with and without ${\tt LASER}$. (c) The largest boost in performance occurs for low-frequency samples. This bar chart displays the amount of boost offered by ${\tt LASER}$ for data binned by the frequency with which corresponding facts occur in $\mathcal{D}_{Train}$. Maximal improvements in accuracy are from datapoints that have less-frequent occurrences in training data.
Figure 4: Composing ${\tt LASER}$ operations across multiple layers further enhances model performance. Here we show how accuracy improves under a simple composing strategy, on both the validation data, which was used to identify each ($\tau, \ell, \rho$), and the test data.
Figure 5: (a) [Left] ${\tt LASER}$ approximates learned matrices by their lower-order components. We find that for datapoints where the model's predictions improve after ${\tt LASER}$, if we instead use the entire matrix (including higher-order components), the model often predicts only "generic" words. (a) [Right] To understand what these higher-order components encode, we approximate the learned weight matrix with the higher-order components instead. We find that these higher-order components sometimes encode the correct semantic type of the answer but the incorrect response. (b) Analytically, computing the semantic similarity (cosine distance between the true answer and the answers generated by the bottom k% of the singular vectors) shows that on average the answer computed by the higher-order components is more similar to the real answer. (c) Shows some examples from the dataset and the corresponding answers computed by the top fraction and bottom fraction of the components.
...and 5 more figures
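
The Figure 5 caption contrasts what the lower-order and higher-order components of a weight matrix compute. One way to picture that split is the sketch below, which divides a matrix into the part carried by its largest singular values and the part carried by the rest. It is an illustration under stated assumptions, not the authors' analysis code: `split_by_singular_order`, the random matrix `W`, the input vector `x`, and `k=2` are all hypothetical.

```python
import numpy as np


def split_by_singular_order(W: np.ndarray, k: int):
    """Split W into W_low (k largest singular values) and W_high (the remainder)."""
    if not 0 <= k <= min(W.shape):
        raise ValueError("k must be between 0 and min(W.shape)")
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Lower-order components: the k largest singular values and their vectors.
    W_low = (U[:, :k] * S[:k]) @ Vt[:k, :]
    # Higher-order components: everything that a rank-k truncation would discard.
    W_high = (U[:, k:] * S[k:]) @ Vt[k:, :]
    return W_low, W_high


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))   # hypothetical stand-in for a weight matrix
x = rng.standard_normal(6)        # hypothetical input activation

W_low, W_high = split_by_singular_order(W, k=2)

# The two parts sum back to the original matrix, so the layer's output
# decomposes into a lower-order and a higher-order contribution.
print(np.allclose(W_low + W_high, W))
print(np.allclose(W @ x, W_low @ x + W_high @ x))
```

Keeping only `W_low` corresponds to a rank-`k` approximation, while keeping only `W_high` corresponds to the complementary case the caption describes, where the matrix is approximated by its higher-order components instead.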
