Scaling Graph-Based Dependency Parsing with Arc Vectorization and Attention-Based Refinement
Nicolas Floquet, Joseph Le Roux, Nadi Tomeh, Thierry Charnois
TL;DR
The paper tackles scalability in graph-based dependency parsing by replacing separate arc and label scoring with an arc-centric vector representation that unifies scoring within a single network. It introduces a Transformer-based refinement over arc vectors to emulate higher-order dependencies while a filtering mechanism keeps attention memory tractable. Experiments on PTB and UD demonstrate improved accuracy and competitive speed, with state-of-the-art results on PTB and strong gains across most UD languages. This arc-vector framework enables better parameter sharing and can extend to other structured prediction tasks.
Abstract
We propose a novel architecture for graph-based dependency parsing that explicitly constructs arc vectors, from which both arcs and labels are scored. Our method addresses key limitations of the standard two-pipeline approach by unifying arc scoring and labeling in a single network, reducing the scalability issues caused by the information bottleneck and the lack of parameter sharing. Additionally, our architecture applies Transformer layers to the arc vectors, which overcomes the limited interaction between arcs and efficiently simulates higher-order dependencies. Experiments on PTB and UD show that our model outperforms state-of-the-art parsers in both accuracy and efficiency.
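To make the arc-vector idea concrete, the following NumPy sketch builds one vector per (head, modifier) pair, scores arcs and labels from that same vector, filters candidate arcs, and refines the survivors with one self-attention layer. It is an illustration only, not the authors' implementation: the tanh feed-forward combination, the per-modifier top-k filter, the single-head attention with a residual connection, and all names (`arc_vectors`, `filter_arcs`, `W_head`, `keep`, and so on) are assumptions made here, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def arc_vectors(H, W_head, W_mod, b):
    """One vector per (head, modifier) pair; result has shape (n, n, k)."""
    heads = H @ W_head                      # (n, k) head-side projection
    mods = H @ W_mod                        # (n, k) modifier-side projection
    return np.tanh(heads[:, None, :] + mods[None, :, :] + b)

def score(V, w_arc, W_lab):
    """Arc and label scores are both read off the same arc vectors."""
    return V @ w_arc, V @ W_lab             # shapes (...,) and (..., n_labels)

def filter_arcs(arc_scores, keep):
    """For each modifier (column), keep the `keep` best-scoring heads."""
    n = arc_scores.shape[0]
    top_heads = np.argsort(-arc_scores, axis=0)[:keep]      # (keep, n)
    mods = np.broadcast_to(np.arange(n), top_heads.shape)   # (keep, n)
    return top_heads.ravel(), mods.ravel()

def self_attention(X, W_q, W_k, W_v):
    """Single-head self-attention with a residual connection."""
    Q, K, Vv = X @ W_q, X @ W_k, X @ W_v
    logits = Q @ K.T / np.sqrt(X.shape[1])
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(logits)
    A /= A.sum(axis=1, keepdims=True)             # row-wise softmax
    return X + A @ Vv

# Toy sizes (hypothetical): 6 words, 8-dim word vectors, 16-dim arc vectors.
n, d, k, n_labels, keep = 6, 8, 16, 4, 2
H = rng.normal(size=(n, d))                       # stand-in for encoder output
W_head, W_mod = rng.normal(size=(d, k)), rng.normal(size=(d, k))
b = np.zeros(k)
w_arc = rng.normal(size=k)
W_lab = rng.normal(size=(k, n_labels))
W_q, W_k, W_v = (rng.normal(size=(k, k)) * 0.1 for _ in range(3))

V = arc_vectors(H, W_head, W_mod, b)              # (n, n, k)
arc_scores, _ = score(V, w_arc, W_lab)            # first-pass arc scores (n, n)
h_idx, m_idx = filter_arcs(arc_scores, keep)      # keep * n surviving arcs
kept = V[h_idx, m_idx]                            # (keep * n, k)
refined = self_attention(kept, W_q, W_k, W_v)     # arcs attend to other arcs
ref_arc, ref_lab = score(refined, w_arc, W_lab)   # rescored arcs and labels
```

In this sketch the attention matrix has (keep * n)^2 entries, whereas attending over all n^2 candidate arcs would need n^4; that difference is the reason a filtering step keeps attention memory tractable.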
