
An Empirical Study of Causal Relation Extraction Transfer: Design and Data

Sydney Anuyah, Jack Vanschaik, Palak Jain, Sawyer Lehman, Sunandan Chakraborty

TL;DR

This work addresses open-domain causal relation extraction by evaluating cross-dataset transfer across six datasets using neural models. It finds that BioBERT-BiGRU generalizes robustly, and it introduces $F1_{phrase}$, a metric that emphasizes noun phrase localization during transfer. The study shows that augmenting training data across diverse domains and annotation styles improves transfer performance, and that the mix of implicit and explicit causality in the training data often matters more than the amount of data. These findings support building scalable open-domain causal knowledge extraction systems from diverse annotated data and domain-specialized embeddings.

Abstract

We conduct an empirical analysis of neural network architectures and data transfer strategies for causal relation extraction. Through experiments with various contextual embedding layers and architectural components, we show that a relatively straightforward BioBERT-BiGRU relation extraction model generalizes better than other architectures across varying web-based sources and annotation strategies. Furthermore, we introduce a metric for evaluating transfer performance, $F1_{phrase}$, that emphasizes noun phrase localization rather than direct matching of target tags. Using this metric, we conduct data transfer experiments, which reveal that augmenting the training set with data from other domains and annotation styles can improve performance. Data augmentation is especially beneficial when an adequate proportion of implicitly and explicitly causal sentences is included.

Paper Structure

This paper contains 17 sections, 3 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: The $F1_{phrase}$ metric measures a model's ability to locate noun phrases. If a candidate model were to predict "hammer" as the only cause token, the phrase "water hammer pressure" would be counted as entirely correct, because the whole phrase can be recovered from "hammer" using the dependency tree.
  • Figure 2: Part-of-speech distributions for CauseNet and SemEval after dependency parsing. The difference in annotation schemes between CauseNet and SemEval is clear: disproportionately more adjectives fall under the "C" (cause) and "E" (effect) labels in CauseNet. The $F1_{phrase}$ metric accounts for this discrepancy.
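
The idea in the Figure 1 caption can be illustrated with a small sketch. This is not the paper's implementation: the exact definition of $F1_{phrase}$ is not given in this section, so the code below assumes that each predicted or gold token is expanded to its enclosing noun phrase through the dependency tree and that F1 is then computed over the resulting sets of phrases. The toy parse, the `NP_INTERNAL` relation set, and the function names `phrase_of` and `f1_phrase` are all hypothetical.

```python
from typing import FrozenSet, List, Set

# Hand-written toy parse (an assumption, not parser output) for:
#   "water hammer pressure causes pipe damage"
TOKENS = ["water", "hammer", "pressure", "causes", "pipe", "damage"]
HEADS = [1, 2, 3, 3, 5, 3]  # index of each token's head; the root points to itself
DEPRELS = ["compound", "compound", "nsubj", "ROOT", "compound", "dobj"]

# Relations assumed to link tokens inside a single noun phrase.
NP_INTERNAL = {"compound", "amod"}


def phrase_of(i: int, heads: List[int], deprels: List[str]) -> FrozenSet[int]:
    """Expand token i to the noun phrase that contains it."""
    # Climb to the head of the phrase.
    while deprels[i] in NP_INTERNAL:
        i = heads[i]
    # Collect the head plus everything attached to it by NP-internal relations.
    phrase: Set[int] = {i}
    changed = True
    while changed:
        changed = False
        for j, (h, rel) in enumerate(zip(heads, deprels)):
            if j not in phrase and h in phrase and rel in NP_INTERNAL:
                phrase.add(j)
                changed = True
    return frozenset(phrase)


def f1_phrase(pred: List[int], gold: List[int],
              heads: List[int], deprels: List[str]) -> float:
    """F1 over recovered phrases instead of over individual tagged tokens."""
    pred_phrases = {phrase_of(i, heads, deprels) for i in pred}
    gold_phrases = {phrase_of(i, heads, deprels) for i in gold}
    true_pos = len(pred_phrases & gold_phrases)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_phrases)
    recall = true_pos / len(gold_phrases)
    return 2 * precision * recall / (precision + recall)


gold_cause = [0, 1, 2]   # "water hammer pressure"
pred_cause = [1]         # the model tags only "hammer"
print(f1_phrase(pred_cause, gold_cause, HEADS, DEPRELS))  # 1.0
```

Under these assumptions, tagging only "hammer" recovers the full phrase and scores 1.0, whereas a strict token-level comparison of the same prediction would give precision 1, recall 1/3, and an F1 of 0.5. Expanding both sides to phrases is also what would let such a metric tolerate annotation schemes that disagree on which tokens inside a phrase carry the label.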