Faster Weighted and Unweighted Tree Edit Distance and APSP Equivalence

Jakob Nogler; Adam Polak; Barna Saha; Virginia Vassilevska Williams; Yinzhan Xu; Christopher Ye

TL;DR

The paper settles the long-standing question of whether tree edit distance (TED) is fine-grained equivalent to All-Pairs Shortest Paths (APSP). It gives a tight reduction from TED (and its forest variants) to Min-Plus matrix multiplication; combined with Williams’ APSP algorithm, this yields a TED algorithm running in $n^3/2^{\Omega(\sqrt{\log n})}$ time. It introduces a unified alignment-graph framework that maps TED to border-to-border distances in forest alignment graphs and leverages structured min-plus products (MonMUL) to obtain $\tilde{O}(n^{(3+\omega)/2})$ time for unweighted TED. Central to the approach are Spine Edit Distance (SED) and Forest Edit Distance (FED) as intermediate problems, with tight reductions that connect SED/FED to APSP, including unbalanced instances, and divide-and-conquer strategies (including DISED/UDISED) that preserve subquadratic factors. The results place TED alongside APSP in the fine-grained landscape, deliver the fastest known algorithm for unweighted TED, and clarify how the unweighted and weighted problems differ computationally. Collectively, the work advances the understanding of TED’s complexity and provides subcubic algorithms grounded in state-of-the-art min-plus product techniques, with implications for related string and structure similarity problems.
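As a point of reference, the min-plus product of matrices $A$ and $B$ is $C[i][j] = \min_k (A[i][k] + B[k][j])$, and APSP on an $n$-node graph reduces to $O(\log n)$ such products by repeated squaring. The sketch below is a naive cubic illustration of that standard connection only. The function names `min_plus` and `apsp_by_squaring` are hypothetical, and the code is neither the paper's reduction nor Williams' algorithm.

```python
from math import inf


def min_plus(A, B):
    """Naive min-plus product: C[i][j] = min over k of A[i][k] + B[k][j]."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[inf] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            if a == inf:
                continue  # no path through k from i
            for j in range(p):
                s = a + B[k][j]
                if s < C[i][j]:
                    C[i][j] = s
    return C


def apsp_by_squaring(W):
    """All-pairs shortest paths via repeated min-plus squaring.

    Assumes W[i][i] == 0, W[i][j] == inf for missing edges,
    and no negative cycles.
    """
    n = len(W)
    D = [row[:] for row in W]
    covered = 1  # D currently accounts for paths with at most `covered` edges
    while covered < n - 1:
        D = min_plus(D, D)
        covered *= 2
    return D


if __name__ == "__main__":
    W = [[0, 3, inf],
         [inf, 0, 1],
         [2, inf, 0]]
    print(apsp_by_squaring(W))
```

Each squaring doubles the number of edges a path may use, so about $\log_2 n$ products suffice. Any speed-up for the product therefore carries over to APSP up to a logarithmic factor.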

Abstract

The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions, and relabelings of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. Years of steady improvements have led to an $O(n^3)$ algorithm [DMRW 2010]. Fine-grained complexity sheds light on the hardness of TED, showing that a truly subcubic time algorithm for TED implies a truly subcubic time algorithm for All-Pairs Shortest Paths (APSP) [BGMW 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this. Not only do we show that TED is fine-grained equivalent to APSP, but our reduction is also tight enough that, combined with the fastest APSP algorithm to date [Williams 2018], it gives the first-ever subcubic time algorithm for TED, running in $n^3/2^{\Omega(\sqrt{\log{n}})}$ time. We also consider the unweighted tree edit distance problem, in which the cost of each edit is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao 2022], later improved slightly by Dürr [Dürr 2023] to run in $O(n^{2.9148})$ time. Their algorithm uses bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{O}(n^{\frac{3+\omega}{2}})\leq O(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication). In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{O}(n^{\frac{3+\omega}{2}})$ time.
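To make the definition concrete, here is a minimal sketch of unit-cost TED using the classical rightmost-root recursion over ordered forests, with memoization. It is illustrative only: the tree encoding `(label, children)` and the names `forest_dist` and `tree_edit_distance` are assumptions made for this example, and the code is neither the $O(n^3)$ algorithm of [DMRW 2010] nor any algorithm from this paper.

```python
from functools import lru_cache

# Assumed encoding for this sketch: a tree is (label, children), where
# children is a tuple of trees; a forest is a tuple of trees.


def size(forest):
    """Number of nodes in an ordered forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_dist(F, G):
    """Unit-cost edit distance between ordered forests F and G."""
    if not F:
        return size(G)  # insert every node of G
    if not G:
        return size(F)  # delete every node of F
    (v_label, v_kids), (w_label, w_kids) = F[-1], G[-1]
    # Delete the rightmost root of F; its children take its place.
    delete_v = forest_dist(F[:-1] + v_kids, G) + 1
    # Insert the rightmost root of G; symmetric to deletion.
    insert_w = forest_dist(F, G[:-1] + w_kids) + 1
    # Map the two rightmost roots to each other (relabel if labels differ).
    match = (forest_dist(v_kids, w_kids)
             + forest_dist(F[:-1], G[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete_v, insert_w, match)


def tree_edit_distance(T1, T2):
    """Unit-cost TED between two rooted ordered trees."""
    return forest_dist((T1,), (T2,))


if __name__ == "__main__":
    T1 = ("a", (("b", ()), ("c", ())))  # a(b, c)
    T2 = ("a", (("b", ()),))            # a(b)
    print(tree_edit_distance(T1, T2))
```

For instance, deleting the leaf `c` from `a(b, c)` gives `a(b)`, so the distance between those two trees is 1. When both trees are paths, the recursion collapses to the familiar string edit distance recurrence, which is the sense in which TED generalizes it.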

