Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning

Yu Fu; Jie He; Yifan Yang; Qun Liu; Deyi Xiong

TL;DR

Meta-RTL introduces a reinforcement-driven, target-aware weighting mechanism for multi-source meta-transfer learning to bolster low-resource commonsense reasoning. By framing source-task weighting as a REINFORCE-guided, long-term decision process implemented with an LSTM policy, it dynamically prioritizes source tasks based on their contribution to the target task across meta-training iterations. The approach combines a PLM-based commonsense reasoning backbone with Reptile-style meta-learning and a target-aware transfer phase, achieving consistent improvements over strong baselines on three benchmarks and showing robustness in extremely low-resource settings. The results highlight the practical impact of dynamically weighting cross-task knowledge transfer for improving generalization when target data are scarce.

Abstract

Meta learning has been widely used to exploit resource-rich source tasks to improve the performance of low-resource target tasks. Unfortunately, most existing meta learning approaches treat different source tasks equally, ignoring the relatedness of source tasks to the target task in knowledge transfer. To mitigate this issue, we propose a reinforcement-based multi-source meta-transfer learning framework (Meta-RTL) for low-resource commonsense reasoning. In this framework, we present a reinforcement-based approach to dynamically estimating source task weights that measure the contribution of the corresponding tasks to the target task in meta-transfer learning. The differences between the general loss of the meta model and the task-specific losses of source-specific temporal meta models on sampled target data are fed into the policy network of the reinforcement learning module as rewards. The policy network is built upon LSTMs that capture long-term dependencies in source task weight estimation across meta learning iterations. We evaluate the proposed Meta-RTL using both BERT and ALBERT as the backbone of the meta model on three commonsense reasoning benchmark datasets. Experimental results demonstrate that Meta-RTL substantially outperforms strong baselines and previous task selection strategies, and achieves larger improvements in extremely low-resource settings.
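
The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: parameters are plain lists of floats rather than a BERT or ALBERT model, the function names `weighted_reptile_step` and `loss_difference_rewards` are hypothetical, and both the Reptile-style interpolation rule and the sign of the reward (meta-model loss minus temporal-model loss) are assumptions inferred from the description above.

```python
def weighted_reptile_step(theta, temporal_thetas, weights, meta_lr=1.0):
    """Move the meta parameters toward each source-specific temporal model.

    theta            -- meta-model parameters (list of floats)
    temporal_thetas  -- one parameter list per source task, obtained by
                        adapting a copy of theta on that task (assumed)
    weights          -- one weight per source task (assumed to sum to 1)
    """
    new_theta = list(theta)
    for w, temporal in zip(weights, temporal_thetas):
        for j, (t_j, m_j) in enumerate(zip(temporal, theta)):
            # Reptile-style interpolation, scaled by the task weight.
            new_theta[j] += meta_lr * w * (t_j - m_j)
    return new_theta


def loss_difference_rewards(meta_loss, temporal_losses):
    """Reward per source task: how much lower its temporal model's loss is
    than the meta model's loss on the same sampled target data.
    The sign convention here is an assumption."""
    return [meta_loss - loss for loss in temporal_losses]


# Toy usage with two source tasks and made-up numbers.
theta = [0.0, 0.0]
temporal = [[1.0, 0.0], [0.0, 1.0]]
updated = weighted_reptile_step(theta, temporal, weights=[0.75, 0.25])
rewards = loss_difference_rewards(1.0, [0.5, 1.5])
```

With uniform weights this reduces to an unweighted multi-source Reptile-style update; the weights are what let a more helpful source task pull the meta model further.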

Paper Structure (25 sections, 10 equations, 3 figures, 7 tables)

Introduction
Related Work
Meta Learning
Commonsense Reasoning and Datasets
Meta-RTL
PLM-Based Commonsense Reasoning Model
Meta-Transfer Learning Algorithm
Meta Learning over Multiple Source Tasks
Transfer Learning to the Target Task
Reinforcement-Based Target-Aware Weight Estimation Strategy
Experiments
Main Results
Evaluation with Different Meta-Learning Algorithms
Ablation Study on the Weight Estimation Approach
Evaluation on Extremely Low-Resource Commonsense Reasoning
...and 10 more sections

Figures (3)

Figure 1: Illustration of Meta-RTL. An LSTM-based policy network dynamically estimates target-aware weights for source tasks. The estimated weights are used to combine the source-specific temporal meta models into the meta model in the meta-transfer learning algorithm. The loss differences between the meta model and the temporal meta models on sampled target task data are fed into the policy network as rewards.
Figure 2: Meta-Transfer Learning Algorithm
Figure 3: Comparison of our model with the transferability-based method on Creak. "C": CommonsenseQA. "R": RiddleSense. "W": Winogrande. "Co": Com2sense.
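
The reward feedback loop described in the Figure 1 caption can be sketched with a plain REINFORCE update. This is a simplified stand-in, not the paper's method: the LSTM policy network is replaced by a bare vector of logits (one per source task), a single source task is sampled as the action, and the per-task rewards are made-up numbers. The names `softmax` and `reinforce_update` are hypothetical helpers.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """One REINFORCE step for a categorical softmax policy.

    For a softmax policy, d log pi(action) / d logit_i equals
    (1 if i == action else 0) - pi_i, so each logit moves by
    lr * reward * that gradient.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Toy loop: three source tasks with fixed, hypothetical rewards.
rng = random.Random(0)
hypothetical_rewards = [0.5, -0.2, 0.1]
logits = [0.0, 0.0, 0.0]
for _ in range(200):
    probs = softmax(logits)
    action = rng.choices(range(3), weights=probs)[0]
    logits = reinforce_update(logits, action, hypothetical_rewards[action])
task_weights = softmax(logits)  # usable as source-task weights
```

A recurrent policy, as in the paper, would additionally condition each step's weights on the history of earlier meta-learning iterations, which this stateless sketch does not attempt.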
