Many-to-Many Matching via Sparsity Controlled Optimal Transport

Weijie Liu, Han Bao, Makoto Yamada, Zenan Huang, Nenggan Zheng, Hui Qian

TL;DR

A novel many-to-many matching method that explicitly encodes many-to-many constraints while preventing degeneration into one-to-one matching, and that achieves good performance by gleaning meaningful many-to-many matchings.

Abstract

Many-to-many matching seeks to match multiple points in one set with multiple points in another set, and it underlies a wide range of data mining problems. It can be naturally recast in the framework of Optimal Transport (OT). However, existing OT methods either cannot accomplish many-to-many matching or require careful tuning of a regularization parameter to achieve satisfactory results. This paper proposes a novel many-to-many matching method that explicitly encodes many-to-many constraints while preventing degeneration into one-to-one matching. The proposed method consists of two components. The first is a set of matching budget constraints on each row and column of a transport plan, which specify the maximum number of points that can be matched to a given point. The second is the deformed $q$-entropy regularization, which encourages each point to use its matching budget as fully as possible. While the deformed $q$-entropy was initially proposed to sparsify a transport plan, we employ it to avoid degeneration into one-to-one matching. We optimize the objective via a penalty algorithm, which is efficient and theoretically guaranteed to converge. Experimental results on various tasks demonstrate that the proposed method achieves good performance by gleaning meaningful many-to-many matchings.
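To make the first component concrete, the following is a minimal sketch of what a matching budget check on a transport plan could look like. It is not the paper's algorithm: it assumes that "how many points can be matched to a point" is read as the number of non-zero entries in a row or column, and the function name `within_budget`, its parameters, and the round-robin plan are all hypothetical. The sizes (12 rows, 11 columns, budgets 4 and 6) are borrowed from the student course allocation illustration in Figure 2.

```python
import numpy as np


def within_budget(plan, row_budget, col_budget, a=None, b=None, tol=1e-9):
    """Return True if `plan` is non-negative, matches the optional marginals
    `a` (row sums) and `b` (column sums), and has at most `row_budget`
    non-zero entries per row and `col_budget` non-zero entries per column.

    Assumption: a "match" is any entry larger than `tol`.
    """
    plan = np.asarray(plan, dtype=float)
    if (plan < -tol).any():
        return False
    if a is not None and not np.allclose(plan.sum(axis=1), a, atol=1e-8):
        return False
    if b is not None and not np.allclose(plan.sum(axis=0), b, atol=1e-8):
        return False
    support = plan > tol  # boolean mask of matched pairs
    rows_ok = (support.sum(axis=1) <= row_budget).all()
    cols_ok = (support.sum(axis=0) <= col_budget).all()
    return bool(rows_ok and cols_ok)


# Hypothetical plan: 12 students, 11 courses, each student spreads a mass
# of 1/12 evenly over 4 consecutive courses (round-robin, wrapping around).
n_students, n_courses, per_student = 12, 11, 4
plan = np.zeros((n_students, n_courses))
for i in range(n_students):
    for k in range(per_student):
        j = (per_student * i + k) % n_courses
        plan[i, j] = 1.0 / (n_students * per_student)

a = np.full(n_students, 1.0 / n_students)  # uniform source marginal
b = plan.sum(axis=0)                       # target marginal induced by the plan

print(within_budget(plan, row_budget=4, col_budget=6, a=a, b=b))
print(within_budget(plan, row_budget=3, col_budget=6))
```

With a row budget of 4 the hypothetical plan is feasible, while tightening the row budget to 3 makes the same plan infeasible. A one-to-one plan such as a scaled identity matrix also passes the check with budgets of 1, which is the degenerate case the regularizer is meant to discourage.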

Paper Structure

This paper contains 44 sections, 3 theorems, 27 equations, 3 figures, 12 tables, 1 algorithm.

Key Result

Theorem 3.1

Let $\mathcal{J}'(h)$ (resp. $\mathcal{J}(h)$) be a subset of $\llbracket n\rrbracket$ (resp. $\llbracket m\rrbracket$) with cardinality $h$. When the following conditions hold: and the feasible domain $\Pi(\mathbf{a},\mathbf{b})\cap\Omega_{\rho^{\operatorname{s}},\rho^{\operatorname{t}}}$ is non-empty.

Figures (3)

  • Figure 1: The matching results (left) and the transport plans (right) between two uniform discrete measures obtained by the network simplex solver for OT (top) and our method (bottom), respectively. The red +'s and the blue $\times$'s denote the support points of the two measures, respectively. The darker the color is, the larger the value of the corresponding entry in the transport plan is. Since the support points do not have strict one-to-one correspondence, many-to-many matching is more appropriate.
  • Figure 2: A comparison between many-to-many matching (top) and cluster-to-cluster matching (bottom) in the student course allocation problem. The sets of students and courses are denoted as $\{1,2,\dots,12\}$ and $\{a,b,\dots,k\}$, respectively. Each student can take up to 4 courses and each course can admit up to 6 students. In the cluster-to-cluster matching, the students are divided into three clusters (groups), and all students in each cluster take exactly the same courses.
  • Figure 3: A comparison of the transport plans obtained by the network simplex solver (N.S.), sparsity-constrained OT (S.C.), $q$-regularized OT ($q$-DOT), structured OT (SOT), and SCOTM. The darker the color is, the larger the value of the corresponding entry in the transport plan is.

Theorems & Definitions (5)

  • Theorem 3.1
  • Theorem 3.2
  • Example 3.3
  • Definition 3.4
  • Theorem 3.5