Automated Concatenation of Embeddings for Structured Prediction

Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

TL;DR

ACE automates the selection of embedding concatenations for structured prediction by reframing the problem as a NAS‑style search guided by a reinforcement‑learning controller. A compact search space with binary embedding masks and a history‑aware reward enables efficient discovery of strong word representations without retraining from scratch. Empirical results across six tasks and 21 datasets show ACE consistently outperforms baselines and, when combined with fine‑tuned embeddings, achieves state‑of‑the‑art accuracy. The approach demonstrates practical applicability, reducing computational cost while delivering strong performance gains, and offers insights into when specific embeddings contribute most in syntactic vs. semantic tasks and in monolingual vs. multilingual settings.

Abstract

Pretrained contextualized embeddings are powerful word representations for structured prediction tasks. Recent work found that better word representations can be obtained by concatenating different types of embeddings. However, the selection of embeddings to form the best concatenated representation usually varies depending on the task and the collection of candidate embeddings, and the ever-increasing number of embedding types makes it a more difficult problem. In this paper, we propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks, based on a formulation inspired by recent progress on neural architecture search. Specifically, a controller alternately samples a concatenation of embeddings, according to its current belief of the effectiveness of individual embedding types in consideration for a task, and updates the belief based on a reward. We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model, which is fed with the sampled concatenation as input and trained on a task dataset. Empirical results on 6 tasks and 21 datasets show that our approach outperforms strong baselines and achieves state-of-the-art performance with fine-tuned embeddings in all the evaluations.
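The sample-then-update loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes one logit per candidate embedding, independent Bernoulli sampling of a binary mask, and a plain REINFORCE update with a moving-average baseline, which is simpler than the history-aware reward mentioned in the TL;DR. The function `toy_accuracy` and its per-embedding contributions are hypothetical stand-ins for training a task model on the sampled concatenation and scoring it on a development set.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Sample a binary mask; bit i = 1 keeps embedding i in the concatenation."""
    mask = [1 if rng.random() < sigmoid(t) else 0 for t in logits]
    if not any(mask):
        # Guard (an assumption of this sketch): never return an empty concatenation.
        mask[rng.randrange(len(mask))] = 1
    return mask


def toy_accuracy(mask):
    """Hypothetical reward: stands in for training and evaluating a task model."""
    contribution = [0.30, 0.25, 0.0, -0.05]  # made-up per-embedding effects
    return 0.4 + sum(c * m for c, m in zip(contribution, mask))


def search(num_embeddings=4, steps=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * num_embeddings  # the controller's "belief" per embedding
    baseline = None                  # moving average of past rewards
    best_mask, best_acc = None, float("-inf")
    for _ in range(steps):
        mask = sample_mask(logits, rng)
        acc = toy_accuracy(mask)
        if acc > best_acc:
            best_mask, best_acc = mask, acc
        advantage = 0.0 if baseline is None else acc - baseline
        baseline = acc if baseline is None else 0.9 * baseline + 0.1 * acc
        # REINFORCE: d/d logit of log Bernoulli(m; sigmoid(logit)) = m - sigmoid(logit)
        for i, (t, m) in enumerate(zip(logits, mask)):
            logits[i] = t + lr * advantage * (m - sigmoid(t))
    return best_mask, best_acc, logits


if __name__ == "__main__":
    mask, acc, logits = search()
    print("best mask:", mask, "accuracy:", round(acc, 3))
    print("controller logits:", [round(t, 2) for t in logits])
```

In this toy setting the search space has only 2^4 masks, so exhaustive enumeration would also work. The point of the sketch is only to show how the controller's parameters shift toward embeddings that tend to appear in higher-reward concatenations.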

Paper Structure

This paper contains 32 sections, 10 equations, 2 figures, 13 tables.

Figures (2)

  • Figure 1: The main paradigm of our approach is shown in the middle, with an example of the reward function on the left and an example of a concatenation action on the right.
  • Figure 2: Comparing the efficiency of random search (Random) and ACE. The x-axis is the number of time steps. The left y-axis is the averaged best validation accuracy on the CoNLL English NER dataset. The right y-axis is the averaged validation accuracy of the current selection.