Empirical Analysis for Unsupervised Universal Dependency Parse Tree Aggregation

Adithya Kulkarni, Oliver Eulenstein, Qi Li

TL;DR

The paper addresses the instability of dependency parsing performance across domains and languages by developing unsupervised post-processing aggregation of dependency tree structures (DTS). It reframes DTS aggregation as an edge-level binary labeling task and compares three frameworks (MST, CRH, and CIM) on 71 UD test treebanks across 49 languages. CIM, which models correlations among the input parsers via majority voting and a learned probabilistic joint distribution, performs best, outperforming strong LLM-based parsers and previous baselines. The result is a language- and domain-agnostic method for stabilizing dependency parsing without labeled data, with extension to relation labels left for future work.

Abstract

Dependency parsing is an essential task in NLP, and the quality of dependency parsers is crucial for many downstream tasks. Parser quality often varies with the domain and the language involved, so it is essential to counter this variation to achieve stable performance. In various NLP tasks, post-processing aggregation methods have been shown to mitigate varying quality. However, such methods have not been sufficiently studied for dependency parsing. In an extensive empirical study, we compare different unsupervised post-processing aggregation methods to identify the most suitable one for aggregating dependency tree structures.
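
To make the edge-level view of aggregation concrete, the following is a minimal sketch of plain per-token majority voting over the head assignments proposed by several parsers. It is not the paper's CIM, CRH, or MST method. The function names (`edge_votes`, `majority_heads`), the head-array encoding (entry i holds the head of token i+1, with 0 denoting the root), and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import List


def edge_votes(parses: List[List[int]]) -> Counter:
    """Count how many parsers propose each (head, dependent) edge.

    parses[k][i] is the head of token i+1 according to parser k;
    head 0 denotes the artificial root (an assumed encoding).
    """
    votes: Counter = Counter()
    for heads in parses:
        for dep, head in enumerate(heads, start=1):
            votes[(head, dep)] += 1
    return votes


def majority_heads(parses: List[List[int]]) -> List[int]:
    """Pick, for every token, the head proposed by the most parsers.

    Ties are broken by the smallest head index, an arbitrary choice.
    The result is NOT guaranteed to be a well-formed tree.
    """
    if not parses:
        raise ValueError("need at least one parse")
    n = len(parses[0])
    if any(len(heads) != n for heads in parses):
        raise ValueError("all parses must cover the same tokens")

    votes = edge_votes(parses)
    result = []
    for dep in range(1, n + 1):
        # Collect the vote count of every head proposed for this token.
        candidates = {h: c for (h, d), c in votes.items() if d == dep}
        best = min(candidates, key=lambda h: (-candidates[h], h))
        result.append(best)
    return result


# Three hypothetical parsers over a three-token sentence.
parses = [
    [2, 0, 2],
    [2, 0, 1],
    [3, 0, 2],
]
aggregated = majority_heads(parses)
```

Because each token's head is chosen independently, the output can contain cycles or multiple roots. A tree-constrained decoder, such as a maximum spanning tree search over the vote counts, is one common way to enforce a valid tree; whether that corresponds to the MST framework compared in the paper is not stated in this excerpt.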

Paper Structure (21 sections, 8 equations, 3 figures, 5 tables)


Figures (3)

  • Figure 1: Difference between CIM and ensemble baselines
  • Figure 2: Difference between CIM and non-ensemble baselines
  • Figure 3: Difference between CIM and LLM-based baselines