Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning

Yu Fu, Jie He, Yifan Yang, Qun Liu, Deyi Xiong

TL;DR

Meta-RTL introduces a reinforcement-driven, target-aware weighting mechanism for multi-source meta-transfer learning to bolster low-resource commonsense reasoning. By framing source-task weighting as a REINFORCE-guided, long-term decision process implemented with an LSTM policy, it dynamically prioritizes source tasks based on their contribution to the target task across meta-training iterations. The approach combines a PLM-based commonsense reasoning backbone with Reptile-style meta-learning and a target-aware transfer phase, achieving consistent improvements over strong baselines on three benchmarks and showing robustness in extremely low-resource settings. The results highlight the practical impact of dynamically weighting cross-task knowledge transfer for improving generalization when target data are scarce.

Abstract

Meta learning has been widely used to exploit rich-resource source tasks to improve the performance of low-resource target tasks. Unfortunately, most existing meta learning approaches treat different source tasks equally, ignoring the relatedness of source tasks to the target task in knowledge transfer. To mitigate this issue, we propose a reinforcement-based multi-source meta-transfer learning framework (Meta-RTL) for low-resource commonsense reasoning. In this framework, we present a reinforcement-based approach to dynamically estimating source task weights that measure the contribution of the corresponding tasks to the target task in meta-transfer learning. The differences between the general loss of the meta model and task-specific losses of source-specific temporal meta models on sampled target data are fed into the policy network of the reinforcement learning module as rewards. The policy network is built upon LSTMs that capture long-term dependencies on source task weight estimation across meta learning iterations. We evaluate the proposed Meta-RTL using both BERT and ALBERT as the backbone of the meta model on three commonsense reasoning benchmark datasets. Experimental results demonstrate that Meta-RTL substantially outperforms strong baselines and previous task selection strategies and achieves larger improvements in extremely low-resource settings.
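
The reward signal and the weighted merge described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the sign convention of the reward (general loss minus task-specific loss, so a temporal model that beats the meta model on target data gets a positive reward), the normalization of the weights, and the Reptile-style interpolation with a `meta_lr` step size are all assumptions made for illustration.

```python
import numpy as np

def rewards_from_losses(general_loss, task_losses):
    """Assumed reward: loss of the meta model minus the loss of each
    source-specific temporal meta model, measured on the same sampled
    target batch. Positive means the source task helped."""
    return general_loss - np.asarray(task_losses, dtype=float)

def weighted_reptile_step(meta_params, temporal_params, weights, meta_lr=1.0):
    """Reptile-style update (assumed form): move the meta parameters toward
    a weighted combination of the temporal meta models."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # One row per source task: how far its temporal model moved from the meta model.
    deltas = np.stack([p - meta_params for p in temporal_params])
    # Weighted sum over the task axis, then a step of size meta_lr.
    return meta_params + meta_lr * np.tensordot(weights, deltas, axes=1)

# Toy usage with two hypothetical source tasks and a 2-parameter model.
meta = np.zeros(2)
temporal = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
new_meta = weighted_reptile_step(meta, temporal, weights=[3.0, 1.0])
task_rewards = rewards_from_losses(1.0, [0.5, 1.5])
```

With uniform weights this reduces to an unweighted average of the temporal models, which corresponds to treating all source tasks equally, the behavior the abstract argues against.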

Paper Structure

This paper contains 25 sections, 10 equations, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: Illustration of Meta-RTL. An LSTM-based policy network dynamically estimates target-aware weights for source tasks. The estimated weights are used to merge the source-specific temporal meta models into the meta model in the meta-transfer learning algorithm. The loss differences between the meta model and the temporal meta models on the sampled target task data are fed into the policy network as rewards.
  • Figure 2: Meta-Transfer Learning Algorithm
  • Figure 3: Comparison results of our model vs. the transferability-based method on Creak. "C": CommonsenseQA. "R": RiddleSense. "W": Winogrande. "Co": Com2sense.
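
The policy-update loop summarized in the Figure 1 caption relies on REINFORCE. The sketch below shows one REINFORCE step for a categorical policy over source tasks. It deliberately replaces the paper's LSTM policy with a stateless vector of softmax logits to stay short, so it does not capture dependencies across iterations; the sampling scheme, the `baseline` argument, the learning rate, and the toy reward values are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(logits, action, reward, baseline=0.0, lr=0.1):
    """One REINFORCE update for a softmax policy.

    The gradient of log pi(action) with respect to the logits is
    onehot(action) - softmax(logits); it is scaled by (reward - baseline).
    """
    probs = softmax(logits)
    grad_log_pi = -probs          # new array, so probs is not modified
    grad_log_pi[action] += 1.0
    return logits + lr * (reward - baseline) * grad_log_pi

# Toy loop: three hypothetical source tasks with made-up fixed rewards.
rng = np.random.default_rng(0)
logits = np.zeros(3)
toy_rewards = np.array([0.5, -0.2, 0.1])
for _ in range(200):
    a = rng.choice(3, p=softmax(logits))   # sample a source task
    logits = reinforce_step(logits, a, toy_rewards[a])
source_weights = softmax(logits)           # read the policy as task weights
```

In Meta-RTL the rewards come from loss differences on sampled target data rather than from a fixed table, and the LSTM lets the weight estimate at one iteration depend on earlier ones.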