Auxiliary Tasks to Boost Biaffine Semantic Dependency Parsing

Marie Candito

TL;DR

The paper tackles the lack of inter-arc interdependence in a biaffine SDP parser by introducing auxiliary tasks that predict token-level properties such as the number of heads and incoming label sets. Through multi-task learning with uncertainty-weighted losses and stack propagation, these tasks influence arc and label scoring without sacrificing $O(n^2)$ complexity. Experiments on English SemEval2015 Task 18 and French deep syntactic graphs show modest but statistically significant improvements, with consistent gains across in-domain and out-of-domain settings. The method remains robust across languages and graph types, providing a simple, effective boost to SDP performance that can complement transformer-based representations.
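To make the phrase "uncertainty-weighted losses" concrete, here is a minimal sketch of one commonly used simplified form, in which each task loss $L_i$ is scaled by $\exp(-s_i)$ and regularized by $s_i$, with $s_i$ a learned log-variance. Whether the paper uses exactly this form is an assumption, and the function name `combine_losses` and its arguments are hypothetical.

```python
import math


def combine_losses(losses, log_vars):
    """Combine per-task losses with uncertainty-style weights.

    Assumed form (not confirmed by the source): task i contributes
    exp(-s_i) * L_i + s_i, where s_i is a log-variance that would be
    a trainable scalar in a real model. Here it is a plain float.
    """
    if len(losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


# Example: an arc loss and one auxiliary loss (made-up values).
total = combine_losses([2.0, 4.0], [0.0, math.log(2.0)])
```

With $s_i = 0$ every task has weight 1; raising $s_i$ down-weights a task, while the additive $s_i$ term keeps the model from inflating it without bound.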

Abstract

The biaffine parser of Dozat and Manning (2017) was successfully extended to semantic dependency parsing (SDP) (Dozat and Manning, 2018). Its performance on graphs is surprisingly high given that, without the constraint of producing a tree, all arcs for a given sentence are predicted independently from each other (modulo a shared representation of tokens). To circumvent such an independence of decision, while retaining the $O(n^2)$ complexity and highly parallelizable architecture, we propose to use simple auxiliary tasks that introduce some form of interdependence between arcs. Experiments on the three English acyclic datasets of SemEval 2015 Task 18 (Oepen et al., 2015), and on French deep syntactic cyclic graphs (Ribeyre et al., 2014) show modest but systematic performance gains on a near state-of-the-art baseline using transformer-based contextualized representations. This provides a simple and robust method to boost SDP performance.
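The independence the abstract describes can be illustrated with a simplified sketch: each (head, dependent) pair gets its own score, and each arc is kept or dropped without looking at any other arc. This is only the bilinear term plus a scalar bias, not the full biaffine scorer of Dozat and Manning; the function names, the tiny vectors, and the matrix `U` are hypothetical illustration values.

```python
def biaffine_arc_scores(heads, deps, U, bias=0.0):
    """Return scores[i][j] = heads[i]^T U deps[j] + bias for all n*n pairs."""
    n = len(heads)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row vector heads[i]^T U, computed once per candidate head.
        hU = [sum(heads[i][a] * U[a][b] for a in range(len(U)))
              for b in range(len(U[0]))]
        for j in range(n):
            scores[i][j] = sum(hU[b] * deps[j][b] for b in range(len(hU))) + bias
    return scores


def predict_arcs(scores):
    """Keep every arc whose score is positive (sigmoid probability > 0.5).

    Each decision depends only on its own score, so arcs are
    predicted independently of one another.
    """
    return {(i, j)
            for i, row in enumerate(scores)
            for j, s in enumerate(row)
            if s > 0.0}


# Two tokens with 2-dimensional head and dependent representations.
heads = [[1, 0], [0, 1]]
deps = [[1, 0], [0, 1]]
U = [[1, -1], [-1, 1]]
scores = biaffine_arc_scores(heads, deps, U)
arcs = predict_arcs(scores)
```

The score table has $n^2$ entries, which is where the $O(n^2)$ cost comes from; nothing in `predict_arcs` stops a token from receiving zero heads or several, which is the kind of token-level property the auxiliary tasks are meant to predict.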

Paper Structure (19 sections, 4 equations, 2 figures, 5 tables)

Introduction and related work
The baseline biaffine graph parser
Auxiliary tasks targeting sets of arcs
Auxiliary tasks
Combining sublosses
Stack propagation
Experiments and discussion
Datasets
Experimental protocol
Results on French deep syntactic graphs
Results on English semantic graphs
Conclusion
Training details
For all settings:
BERT$_{\text{tuned}}$ setting
...and 4 more sections

Figures (2)

Figure 1: Top: English semantic graph in the DM format, as part of the SemEval2015-Task18 dataset (Oepen et al., 2015). Bottom: French deep syntactic graph as defined by Candito et al. (2014).
Figure 2: Example of competition for the sequence rule of thumb. Above arcs: correct MWE analysis (rule and of attached to the last MWE component thumb, and thumb being the head of the sequence). Below arcs: incorrect compositional analysis, in which rule is the head, e.g. attached wrongly as ARG1 of good (in red).
