Revisiting Tree Isomorphism: An Algorithmic Bric-à-Brac

Florian Ingels

TL;DR

The paper addresses the problem of tree isomorphism for unordered rooted trees by revisiting and clarifying the AHU algorithm and offering practical variants. It introduces two sorting-based and one prime-multiplication approach, and extends these ideas to DAG compression via global color classes. Despite worst-case superlinear complexity, empirical tests on trees up to $10^6$ nodes show competitive performance with the original method, with the timsort-based variant often easiest to implement and fastest in practice. The results also show that prime multiplication can match sorting performance in Python, suggesting a viable alternative for large-scale tree analysis and DAG compression.
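The sorting idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's algorithm or the treex API: it assumes a hypothetical representation where a tree is a list of its child subtrees (a leaf is `[]`), it recurses instead of working level by level, and it shares one colour table across both trees. The names `colour` and `are_isomorphic` are made up for this example. Python's built-in `sorted` uses timsort.

```python
def colour(tree, palette):
    """Return an integer colour for `tree` (a list of child subtrees).

    Two subtrees get the same colour exactly when they are isomorphic
    as unordered rooted trees.
    """
    # Sort the children's colours so that sibling order does not matter.
    key = tuple(sorted(colour(child, palette) for child in tree))
    # The first time a multiset of child colours is seen, give it a fresh colour.
    return palette.setdefault(key, len(palette))


def are_isomorphic(t1, t2):
    """Compare two trees through a colour table shared by both."""
    palette = {}  # shared, so colours are comparable across the two trees
    return colour(t1, palette) == colour(t2, palette)


# Same shape with the children of the root swapped, and a different shape.
left = [[], [[], []]]
right = [[[], []], []]
other = [[[]], [[]]]
print(are_isomorphic(left, right), are_isomorphic(left, other))
```

Because the sketch is recursive, very deep trees would hit Python's recursion limit; an iterative, level-by-level traversal avoids that.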

Abstract

The Aho, Hopcroft and Ullman (AHU) algorithm has been the state of the art since the 1970s for determining in linear time whether two unordered rooted trees are isomorphic. However, it has been criticized (by Campbell and Radford) for the way it is written, which requires several (re)readings to be understood and does not facilitate its analysis. In this article, we propose a different, more intuitive formulation of the algorithm, together with three implementations, two using sorting algorithms and one using prime multiplication. Although none of these three variants admits linear complexity, we show that in practice two of them are competitive with the original algorithm, while being straightforward to implement. Surprisingly, the algorithm that uses multiplications of prime numbers (which are also generated during the execution) is competitive with the fastest variants using sorts, despite having a worse theoretical complexity. We also adapt our formulation of AHU to tackle the compression of trees into directed acyclic graphs (DAGs). This algorithm is also available in three versions, two with sorting and one with prime number multiplication. Our experiments are carried out on trees of size at most $10^6$, consistent with the actual datasets we are aware of, and done in Python with the library treex, dedicated to tree algorithms.
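To make the prime-multiplication idea concrete, here is a minimal Python sketch under stated assumptions; it is not the paper's exact procedure nor the treex API. A tree is assumed to be a list of its child subtrees (a leaf is `[]`), and the helper names `next_prime`, `prime_label` and `are_isomorphic_primes` are hypothetical. Each distinct product of child labels receives a fresh prime, generated on demand by trial division. Multiplication is commutative, so sibling order is irrelevant, and unique factorisation guarantees that a product identifies the multiset of child labels.

```python
from math import prod


def next_prime(primes):
    """Append the next prime to `primes` (kept in increasing order) and return it."""
    candidate = primes[-1] + 1 if primes else 2
    # `primes` holds every prime below `candidate`, so trial division suffices.
    while any(candidate % p == 0 for p in primes):
        candidate += 1
    primes.append(candidate)
    return candidate


def prime_label(tree, labels, primes):
    """Return a prime that identifies the isomorphism class of `tree`."""
    # The product over no children is 1, so every leaf shares the same label.
    product = prod(prime_label(child, labels, primes) for child in tree)
    if product not in labels:
        labels[product] = next_prime(primes)
    return labels[product]


def are_isomorphic_primes(t1, t2):
    """Compare two trees through a label table shared by both."""
    labels, primes = {}, []
    return prime_label(t1, labels, primes) == prime_label(t2, labels, primes)


left = [[], [[], []]]
right = [[[], []], []]
other = [[[]], [[]]]
print(are_isomorphic_primes(left, right), are_isomorphic_primes(left, other))
```

Python integers have arbitrary precision, so the products cannot overflow, although they can grow large for nodes with many children.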

Paper Structure (19 sections, 13 theorems, 9 equations, 6 figures, 2 tables, 4 algorithms)

This paper contains 19 sections, 13 theorems, 9 equations, 6 figures, 2 tables, 4 algorithms.

Key Result

theorem 1

The AHU algorithm runs in $O(n)$ time, where $n=\#T_1=\#T_2$ is the common number of nodes of the two trees.

Figures (6)

  • Figure 1: Two isomorphic trees.
  • Figure 2: Assigning colours to nodes in the AHU algorithm. $\star$: We could have reused an existing colour here, because colours are assigned level by level and not globally; we chose another colour for the sake of clarity. In this example, the colours correspond exactly to the equivalence classes of the nodes.
  • Figure 3: A tree $T$ (left) and its DAG compression $\mathop{\mathrm{\mathfrak{R}}}\nolimits(T)$ (right). Nodes are colored according to their equivalence class under $\simeq$.
  • Figure 4: Computation times (in log scale) for tree isomorphism using different algorithms, according to the size of the trees, with 100 replicates for each size.
  • Figure 5: Computation times (in log scale) for DAG compression using different algorithms, according to the size of the trees, with 100 replicates for each size.
  • ...and 1 more figure

Theorems & Definitions (20)

  • definition 1
  • theorem 1: Aho, Hopcroft & Ullman
  • remark 1
  • proposition 1
  • lemma 1
  • remark 2
  • remark 3
  • proposition 2
  • lemma 2
  • theorem 2
  • ...and 10 more