A Semidefinite Programming-Based Branch-and-Cut Algorithm for Biclustering

Antonio M. Sudoso

TL;DR

This work addresses the k-densest-disjoint biclique (k-DDB) biclustering problem with a tailored branch-and-cut algorithm based on semidefinite programming (SDP). Upper bounds come from an SDP relaxation (lifted to a rank-constrained form) that is strengthened with valid inequalities and solved with a cutting-plane scheme; lower bounds come from a rounding-based heuristic that produces high-quality feasible biclusters. A specialized branching strategy enforces must-link and cannot-link constraints and reduces problem size by working with lower-dimensional SDP subproblems. Computational results on synthetic and real-world gene-expression datasets show that the method solves instances with up to about 1248 vertices, roughly 20 times larger than those handled by general-purpose solvers. The solver is publicly available, which supports reproducibility and further work on global optimization for biclustering.

Abstract

Biclustering, also called co-clustering, block clustering, or two-way clustering, involves the simultaneous clustering of both the rows and columns of a data matrix into distinct groups, such that the rows and columns within a group display similar patterns. As a model problem for biclustering, we consider the $k$-densest-disjoint biclique problem, whose goal is to identify $k$ disjoint complete bipartite subgraphs (called bicliques) of a given weighted complete bipartite graph such that the sum of their densities is maximized. To address this problem, we present a tailored branch-and-cut algorithm. For the upper bound routine, we consider a semidefinite programming relaxation and propose valid inequalities to strengthen the bound. We solve this relaxation in a cutting-plane fashion using a first-order method. For the lower bound, we design a maximum weight matching rounding procedure that exploits the solution of the relaxation solved at each node. Computational results on both synthetic and real-world instances show that the proposed algorithm can solve instances approximately 20 times larger than those handled by general-purpose solvers.
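
To make the objective concrete, the following minimal sketch evaluates the sum of densities for a candidate set of disjoint bicliques. It is an illustration under stated assumptions, not the paper's implementation: the function names `biclique_weight` and `kddb_objective` are hypothetical, and the density normaliser (total edge weight divided by the number of vertices in the biclique) is an assumed convention, since the abstract does not state the precise definition. The normaliser is passed as a parameter so that another convention can be substituted.

```python
def biclique_weight(A, rows, cols):
    """Total edge weight of the biclique induced by the given rows and columns."""
    return sum(A[i][j] for i in rows for j in cols)


def kddb_objective(A, biclusters, size=lambda r, c: r + c):
    """Sum of densities of a list of disjoint bicliques.

    A          -- weight matrix of the complete bipartite graph (rows x columns)
    biclusters -- list of (row_indices, column_indices) pairs
    size       -- density normaliser; ASSUMPTION: number of vertices (r + c)
    """
    seen_rows, seen_cols = set(), set()
    total = 0.0
    for rows, cols in biclusters:
        if not rows or not cols:
            raise ValueError("each biclique needs at least one row and one column")
        if seen_rows & set(rows) or seen_cols & set(cols):
            raise ValueError("bicliques must be disjoint")
        seen_rows |= set(rows)
        seen_cols |= set(cols)
        total += biclique_weight(A, rows, cols) / size(len(rows), len(cols))
    return total


# Toy data matrix with an obvious block-diagonal structure.
A = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 3, 3],
]

# Two candidate solutions with k = 2.
block_diagonal = [([0, 1], [0, 1]), ([2], [2, 3])]
mixed = [([0, 2], [0, 1]), ([1], [2, 3])]

print(kddb_objective(A, block_diagonal))  # 20/4 + 6/3 = 7.0
print(kddb_objective(A, mixed))           # 10/4 + 0/3 = 2.5
```

Under the assumed normaliser, the block-diagonal assignment scores higher than the mixed one, which is the behaviour the problem statement describes: groups of rows and columns with consistently large weights are rewarded. Swapping in a different normaliser, for example the geometric mean of the two side sizes, only requires changing the `size` argument.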

Paper Structure (14 sections, 7 theorems, 37 equations, 3 figures, 5 tables, 2 algorithms)

Introduction
Related Work
Definitions and problem formulation
Branch-and-cut algorithm
SDP relaxation
Valid inequalities
Valid upper bounds
Heuristic
Branching subproblems
Computational results
Implementation details
Experiments on artificial instances
Experiments on real-world instances
Conclusions

Key Result

Proposition 3.4

Problems prob:origin and prob:or are equivalent.

Figures (3)

Figure 1: Different types of bicluster structures (after row and column reordering): (a) Overlapping biclusters, (b) non-overlapping biclusters with checkerboard structure, and (c) exclusive rows and columns biclusters with block-diagonal structure. The algorithm proposed in this paper identifies bicluster structures of type (c).
Figure 2: Percentage gap versus cutting-plane iterations at the root node. The “x” marker indicates that the global lower bound has been updated at the corresponding iteration.
Figure 3: Percentage gap versus cutting-plane iterations at the root node. The “x” marker indicates that the global lower bound has been updated at the corresponding iteration.

Theorems & Definitions (16)

Definition 3.1
Definition 3.2
Definition 3.3
Proposition 3.4
proof
Proposition 4.1
proof
Proposition 4.2
proof
Proposition 4.3
...and 6 more
