Hierarchical Linkage Clustering Beyond Binary Trees and Ultrametrics

Maximilien Dreveton, Matthias Grossglauser, Daichi Kuroda, Patrick Thiran

TL;DR

This work tackles fundamental limitations of traditional hierarchical clustering—binary constraints, forced hierarchies when none exist, and linkage sensitivity—by introducing the finest valid hierarchy $T_*(\mathcal{X},s)$, the unique greatest element in the set of hierarchies compatible with a similarity function $s$. It proposes a practical two-step procedure that first builds a binary tree using a linkage rule and then prunes invalid splits to enforce hierarchy validity; under two necessary and sufficient conditions on the linkage update function $f$, the pruning recovers $T_*(\mathcal{X},s)$ exactly, with all such linkages yielding the same pruned hierarchy. The framework generalizes the ultrametric-dendrogram correspondence to arbitrary similarities and clarifies when classical linkage rules (single, complete, and average) recover the finest hierarchy while Ward’s linkage may fail. It also connects to broader topics including axiomatic characterizations, optimization-based objectives, and probabilistic tree inference, and discusses potential extensions to symbolic ultrametrics and graph-based planted hierarchies with empirical validation. Overall, the approach provides a robust, non-binary, validity-driven foundation for hierarchical clustering and establishes conditions under which linkage-based methods can be made hierarchy-faithful regardless of the specific merging rule.

Abstract

Hierarchical clustering seeks to uncover nested structures in data by constructing a tree of clusters, where deeper levels reveal finer-grained relationships. Traditional methods, including linkage approaches, face three major limitations: (i) they always return a hierarchy, even if none exists, (ii) they are restricted to binary trees, even if the true hierarchy is non-binary, and (iii) they are highly sensitive to the choice of linkage function. In this paper, we address these issues by introducing the notion of a valid hierarchy and defining a partial order over the set of valid hierarchies. We prove the existence of a finest valid hierarchy, that is, the hierarchy that encodes the maximum information consistent with the similarity structure of the data set. In particular, the finest valid hierarchy is not constrained to binary structures and, when no hierarchical relationships exist, collapses to a star tree. We propose a simple two-step algorithm that first constructs a binary tree via a linkage method and then prunes it to enforce validity. We establish necessary and sufficient conditions on the linkage function under which this procedure exactly recovers the finest valid hierarchy, and we show that all linkage functions satisfying these conditions yield the same hierarchy after pruning. Notably, classical linkage rules such as single, complete, and average satisfy these conditions, whereas Ward's linkage fails to do so.
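The two-step procedure described in the abstract can be illustrated with a small sketch. The validity test used here is an assumption made for illustration only, since the formal definition is not reproduced in this summary: a cluster is treated as valid when every pair of its members is strictly more similar to each other than either is to any item outside the cluster. The helper names (`make_similarity`, `is_valid`, `single_linkage_tree`, `prune`) and the toy similarity values are hypothetical, and the naive merge loop is not meant to be an efficient implementation.

```python
from itertools import combinations

def make_similarity(pairs, default):
    """Symmetric similarity lookup built from a dict of unordered pairs."""
    table = {frozenset(p): v for p, v in pairs.items()}
    return lambda x, y: table.get(frozenset((x, y)), default)

def is_valid(cluster, items, s):
    """ASSUMED validity test (not taken from the paper's text): each
    within-cluster pair must be strictly more similar than any pair that
    joins one of its two members to an outside item."""
    outside = [z for z in items if z not in cluster]
    for x, y in combinations(cluster, 2):
        for z in outside:
            if s(x, y) <= max(s(x, z), s(y, z)):
                return False
    return True  # singletons and the root pass vacuously

def single_linkage_tree(items, s):
    """Step 1: greedy binary merges; returns every cluster ever created."""
    active = [frozenset([x]) for x in items]
    tree = set(active)
    while len(active) > 1:
        # Single linkage on similarities: the similarity of two clusters
        # is the largest pairwise similarity between their members.
        a, b = max(combinations(active, 2),
                   key=lambda p: max(s(x, y) for x in p[0] for y in p[1]))
        merged = a | b
        active = [c for c in active if c not in (a, b)] + [merged]
        tree.add(merged)
    return tree

def prune(tree, items, s):
    """Step 2: drop every cluster that fails the validity test."""
    return {c for c in tree if is_valid(c, items, s)}

items = ["a", "b", "c", "d"]

# Two tight pairs: both binary splits survive pruning.
s_nested = make_similarity({("a", "b"): 0.9, ("c", "d"): 0.8}, default=0.2)
nested_tree = prune(single_linkage_tree(items, s_nested), items, s_nested)

# All similarities equal: every intermediate merge is pruned (star tree).
s_flat = make_similarity({}, default=0.5)
flat_tree = prune(single_linkage_tree(items, s_flat), items, s_flat)
```

Under the assumed test, `nested_tree` keeps the clusters `{a, b}` and `{c, d}` alongside the singletons and the root, while `flat_tree` retains only the singletons and the root. The second case mirrors the abstract's remark that the hierarchy collapses to a star tree when no hierarchical relationships exist.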

Paper Structure

This paper contains 48 sections, 16 theorems, 84 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Lemma 1

Let $C_1, C_2 \subseteq \mathcal{X}$ be two valid clusters with respect to $s$. Then $C_1 \cap C_2 \in \{ \emptyset, C_1, C_2 \}$.
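The property stated in Lemma 1, that any two clusters are either disjoint or nested, is the usual laminarity condition on a family of sets. As a minimal sketch, the check below tests that condition on an explicit family; the function name `is_laminar` is hypothetical, and the example family is the tree $T_2$ from Figure 1 written in its set representation.

```python
from itertools import combinations

def is_laminar(family):
    """Return True if every pair of sets is disjoint or nested,
    i.e. their intersection is empty or equal to one of the two sets."""
    sets = [frozenset(c) for c in family]
    return all((a & b) in (frozenset(), a, b)
               for a, b in combinations(sets, 2))

# T_2 from Figure 1: singletons, {x1,x2,x3}, {x4,x5}, and the root.
T2 = [{"x1"}, {"x2"}, {"x3"}, {"x4"}, {"x5"},
      {"x1", "x2", "x3"}, {"x4", "x5"},
      {"x1", "x2", "x3", "x4", "x5"}]

# Adding {x3,x4} breaks laminarity: it overlaps {x1,x2,x3} in {x3} only.
not_a_tree = T2 + [{"x3", "x4"}]

t2_ok = is_laminar(T2)
overlap_ok = is_laminar(not_a_tree)
```

Here `t2_ok` is `True` and `overlap_ok` is `False`. By Lemma 1, a family consisting only of valid clusters would always pass this check.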

Figures (4)

  • Figure 1: Three trees $T_1$, $T_2$ and $T_3$ belonging to $\mathcal{T}(\{x_1,x_2,x_3,x_4,x_5\})$. In terms of Definition \ref{definition:set_representation_tree}, these three trees are explicitly written as (a) $T_1 = \{ \{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_1,x_2,x_3\}, \{x_1,x_2,x_3,x_4,x_5\} \}$, (b) $T_2 = \{ \{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_1,x_2,x_3\}, \{x_4,x_5\}, \{x_1,x_2,x_3,x_4,x_5\} \}$, and (c) $T_3 = \{ \{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_1,x_2\}, \{x_3,x_4,x_5\}, \{x_1,x_2,x_3,x_4,x_5\} \}$.
  • Figure 2: Ultrametric and non-ultrametric similarities whose true hierarchy is the star tree $T_0$ given in Figure \ref{fig:ultrametric_similarity_star_tree_tree}. Figures \ref{fig:ultrametric_similarity_star_tree_ultrametric}, \ref{fig:ultrametric_similarity_star_tree_nonultrametric}, and \ref{fig:ultrametric_similarity_star_tree_nonultrametric} provide three similarity functions $s_1$, $s_2$, and $s_3$ such that $T_* (\mathcal{X},s_1) = T_* (\mathcal{X},s_2) = T_* (\mathcal{X},s_3) = T_0$. The similarity function $s_1$ is an ultrametric, whereas $s_2$ and $s_3$ are not.
  • Figure 3: Ultrametric and non-ultrametric similarities whose true hierarchy is the tree $T$ given in Figure \ref{fig:ultrametric_similarity_nonstar_tree}. Figures \ref{fig:ultrametric_similarity_nonstar_tree_ultrametric} and \ref{fig:ultrametric_similarity_nonstar_tree_nonultrametric} provide two similarity functions $s_1$ and $s_2$ such that $T_* (\mathcal{X},s_1) = T_* (\mathcal{X},s_2) = T$. The similarity function $s_1$ is an ultrametric, whereas $s_2$ is not.
  • Figure 4: Figure \ref{fig:ward_fails_dissimilarity} gives a dissimilarity function over a set of 5 items. The valid hierarchy is provided in Figure \ref{fig:ward_fails_trueHierarchy}, and Figure \ref{fig:ward_fails_tree_ward} shows the tree constructed by Ward linkage.

Theorems & Definitions (35)

  • Definition 1
  • Definition 2: Valid Cluster
  • Lemma 1: Validity implies laminarity
  • proof
  • Definition 3
  • Example 1
  • Definition 4: Partial order on $\mathcal{T}(\mathcal{X})$
  • Theorem 2: Existence of the Finest Valid Hierarchy
  • proof
  • Lemma 3
  • ...and 25 more