Parameter-Efficient Quality Estimation via Frozen Recursive Models

Umar Abubacar; Roman Bauer; Diptesh Kanojia

Abstract

Tiny Recursive Models (TRM) achieve strong results on reasoning tasks through iterative refinement of a shared network. We investigate whether these recursive mechanisms transfer to Quality Estimation (QE) for low-resource languages using a three-phase methodology. Experiments on $8$ language pairs from a low-resource QE dataset reveal three findings. First, TRM's recursive mechanisms do not transfer to QE: external iteration hurts performance, and internal recursion offers only narrow benefits. Second, representation quality dominates architectural choices. Third, frozen pretrained embeddings match fine-tuned performance while reducing trainable parameters by 37$\times$ (7M vs 262M). TRM-QE with frozen XLM-R embeddings achieves a Spearman's correlation of 0.370, matching fine-tuned variants (0.369) and outperforming an equivalent-depth standard transformer (0.336). On Hindi and Tamil, frozen TRM-QE outperforms MonoTransQuest (560M parameters) with 80$\times$ fewer trainable parameters, suggesting that weight sharing combined with frozen embeddings enables parameter efficiency for QE. Code is available at https://github.com/surrey-nlp/TRMQE.
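
The reduction factors and the evaluation metric quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the released code: the helper names (`reduction_factor`, `average_ranks`, `spearman`) are hypothetical, the parameter counts are the rounded figures from the abstract, and the rank correlation is a plain standard-library implementation of Spearman's coefficient (Pearson correlation of average ranks).

```python
def reduction_factor(baseline_params_m, trainable_params_m):
    """How many times fewer trainable parameters the smaller model has."""
    return baseline_params_m / trainable_params_m


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Rounded parameter counts from the abstract, in millions.
frozen_vs_fine_tuned = reduction_factor(262.0, 7.0)       # about 37.4
frozen_vs_monotransquest = reduction_factor(560.0, 7.0)   # 80.0

# Made-up scores, only to show how predictions are compared with human labels.
toy_predicted = [0.10, 0.40, 0.35, 0.80]
toy_human = [0.20, 0.50, 0.30, 0.90]
toy_rho = spearman(toy_predicted, toy_human)
```

The toy scores are invented for the example and say nothing about the reported 0.370; they only show that the metric depends on the ordering of the scores, not their absolute values.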

Paper Structure (18 sections, 3 figures, 5 tables)

Introduction
Background
QE as reasoning.
TRM's architecture
Experimental Setup
Dataset
Model Architecture
Three-Phase Methodology
Results
Phase 1: Recursion Effects
External Iteration
Internal Recursion
Phase 2: Representation Effects
Phase 3: Frozen vs Fine-tuned Embeddings
Comparison with TransQuest
...and 3 more sections

Figures (3)

Figure 1: TRM-QE architecture. Source-translation pairs are marked with special tokens and encoded by a pretrained model (frozen or fine-tuned). The TRM applies L weight-shared cycles, then the Q head outputs a quality score via sigmoid activation.
Figure 2: External iteration ablation: single-step models outperform multi-step variants across all step counts tested (1 to 16).
Figure 3: L-cycle ablation: performance peaks at L=4 (8 effective layers), with degradation at both shallower (L=1, L=2) and deeper (L=6) configurations.
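
The captions above describe a pipeline in which a frozen encoder feeds a block that is reused for L cycles before a sigmoid head produces the score. A toy sketch of that control flow follows, under heavy assumptions: the real block is a transformer, whereas `shared_block` here is a hypothetical single residual tanh layer; the pooled embedding is a random stand-in for XLM-R output; and `LAYERS_PER_CYCLE = 2` is only inferred from the "L=4 (8 effective layers)" wording in Figure 3. The point it illustrates is that the trainable parameter count stays fixed as L grows, because every cycle reuses the same weights.

```python
import math
import random

LAYERS_PER_CYCLE = 2  # assumption inferred from "L=4 (8 effective layers)"


def make_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for trained weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def shared_block(state, weights):
    """One residual update; the same weights are reused on every cycle."""
    dim = len(state)
    update = [
        math.tanh(sum(weights[i][j] * state[j] for j in range(dim)))
        for i in range(dim)
    ]
    return [s + u for s, u in zip(state, update)]


def trm_qe_score(embedding, block_weights, head_weights, cycles):
    """Apply the shared block `cycles` times, then a sigmoid quality head."""
    state = list(embedding)  # frozen encoder output: never updated in training
    for _ in range(cycles):
        state = shared_block(state, block_weights)
    logit = sum(w * s for w, s in zip(head_weights, state))
    return 1.0 / (1.0 + math.exp(-logit))


def trainable_parameter_count(block_weights, head_weights):
    """Only the shared block and the head are trainable in this toy setup."""
    return sum(len(row) for row in block_weights) + len(head_weights)


def effective_depth(cycles, layers_per_cycle=LAYERS_PER_CYCLE):
    """Number of layer applications seen by the input after all cycles."""
    return cycles * layers_per_cycle


rng = random.Random(0)
dim = 8
pooled_embedding = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # stand-in
block_weights = make_matrix(dim, dim, rng)
head_weights = [rng.uniform(-0.1, 0.1) for _ in range(dim)]

score_l1 = trm_qe_score(pooled_embedding, block_weights, head_weights, cycles=1)
score_l4 = trm_qe_score(pooled_embedding, block_weights, head_weights, cycles=4)
param_count = trainable_parameter_count(block_weights, head_weights)
```

Changing `cycles` changes the effective depth and the score, but `param_count` is the same for L=1 and L=4, which is the sense in which weight sharing keeps the model small.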
