Representation Learning for Weakly Supervised Relation Extraction

Zhuang Li

TL;DR

This work tackles relation extraction under scarce labeled data by using unsupervised pre-training to learn distributed text representations that capture syntactic-semantic patterns in relation expressions. The authors develop a Shortest Dependency Path LSTM (Tree LSTM) based approach that combines these learned representations with hand-crafted features in a single classifier, together with entity-prediction and word-prediction pre-training losses. Experiments on Stanford and Google datasets show that pre-training improves macro-precision, macro-recall, and macro-F1, especially for low-frequency relations, although accuracy can be dominated by the high-frequency 'no_relation' class. The study also identifies which factors matter most when pre-training and fine-tuning in low-resource settings, such as which losses to use and which vectors to update.
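The contrast between accuracy and macro-averaged metrics can be illustrated with a small sketch. This is not the authors' evaluation code: the toy labels `born_in` and `works_for` are hypothetical, and including 'no_relation' in the macro average is an assumption made only for illustration.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of instances whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_prf(gold, pred):
    """Unweighted mean of per-class precision, recall, and F1.

    Assumption: every label seen in gold or pred counts as a class,
    including 'no_relation'. Undefined ratios are treated as 0.0.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[g] += 1  # gold g was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        prec = tp[label] / p_den if p_den else 0.0
        rec = tp[label] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy data: 8 of 10 instances are 'no_relation'.
gold = ["no_relation"] * 8 + ["born_in", "works_for"]
# A degenerate classifier that always predicts the majority class.
pred = ["no_relation"] * 10

acc = accuracy(gold, pred)
macro_p, macro_r, macro_f1 = macro_prf(gold, pred)
print(f"accuracy={acc:.3f} macro-P={macro_p:.3f} "
      f"macro-R={macro_r:.3f} macro-F1={macro_f1:.3f}")
```

In this toy setup the majority-class predictor reaches high accuracy while its macro scores stay low, because each rare relation contributes a zero to the average. That is why macro metrics are the more informative measure for low-frequency relations.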

Abstract

Recent years have seen rapid development in Information Extraction and in its subtask, Relation Extraction. Relation Extraction detects semantic relations between entities in sentences. Many efficient approaches have been applied to relation extraction tasks, and supervised learning approaches perform especially well. However, difficult challenges remain. One of the most serious is that manually labeled data is hard to acquire, and in most cases limited data leads to poor performance for supervised approaches. In this thesis, we therefore focus on how to improve the performance of our supervised baseline system with unsupervised pre-training when only limited training data is available. Features are one of the key components in improving supervised approaches. Traditional approaches usually apply hand-crafted features, which require expert knowledge and expensive human labor. Moreover, this type of feature can suffer from data sparsity: when the training set is small, the model parameters may be poorly estimated. We present several novel unsupervised pre-training models that learn distributed text representation features encoding rich syntactic-semantic patterns of relation expressions. Our experiments demonstrate that this type of feature, combined with traditional hand-crafted features, can improve the performance of a logistic classification model for relation extraction, especially on relations with only a few training instances.

Paper Structure

This paper contains 120 sections, 32 equations, 20 figures, 11 tables, 1 algorithm.

Figures (20)

  • Figure 1: A snippet of a Freebase dump.
  • Figure 2: A dependency tree of a sentence generated by the Stanford CoreNLP online demo.
  • Figure 3: Left: Input processing part of a recurrent neural network. Right: The same network seen as an unfolded graph (Bengio et al., 2015).
  • Figure 4: A simple LSTM cell with input, output, and forget gates (Hochreiter and Schmidhuber, 1997).
  • Figure 5: Architecture of Baseline System with Hand-crafted Features.
  • ...and 15 more figures

Theorems & Definitions (2)

  • Definition 3.1.1
  • Definition 3.1.2