Can Computational Reducibility Lead to Transferable Models for Graph Combinatorial Optimization?

Semih Cantürk, Thomas Sabourin, Frederik Wenkel, Michael Perlmutter, Guy Wolf

TL;DR

The findings indicate that learning common representations across multiple graph CO problems is viable through the use of expressive message passing coupled with pretraining strategies that are informed by the polynomial reduction literature, thereby taking an important step towards enabling the development of foundational models for neural CO.

Abstract

A key challenge in deriving unified neural solvers for combinatorial optimization (CO) is efficient generalization of models from a given set of tasks to new tasks not used during the initial training process. To address this, we first establish a new model, which uses a GCON module as a form of expressive message passing together with energy-based unsupervised loss functions. This model achieves high performance (often comparable with state-of-the-art results) across multiple CO tasks when trained individually on each task. We then leverage knowledge from the computational reducibility literature to propose pretraining and fine-tuning strategies that transfer effectively (a) between MVC, MIS and MaxClique, and (b) in a multi-task learning setting that additionally incorporates MaxCut, MDS and graph coloring. Additionally, in a leave-one-out, multi-task learning setting, we observe that pretraining on all but one task almost always leads to faster convergence on the remaining task when fine-tuning while avoiding negative transfer. Our findings indicate that learning common representations across multiple graph CO problems is viable through the use of expressive message passing coupled with pretraining strategies that are informed by the polynomial reduction literature, thereby taking an important step towards enabling the development of foundational models for neural CO. We provide an open-source implementation of our work at https://github.com/semihcanturk/COPT-MT.

Paper Structure (20 sections, 1 theorem, 6 equations, 2 figures, 7 tables)

This paper contains 20 sections, 1 theorem, 6 equations, 2 figures, 7 tables.

Key Result

Lemma 2.1

For any graph $G = (V, E)$ and subset $V' \subseteq V$, the following statements are equivalent:
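
The equivalent statements are not reproduced in this listing. The classical lemma in the cited Garey and Johnson reference says that the following are equivalent: $V'$ is a vertex cover of $G$; $V \setminus V'$ is an independent set of $G$; and $V \setminus V'$ is a clique in the complement graph of $G$. Assuming Lemma 2.1 is that result, the sketch below checks the equivalence by brute force on small simple graphs. All function names are hypothetical and are not taken from the paper's codebase.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """Every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(edges, nodes):
    """No edge has both endpoints in `nodes`."""
    return not any(u in nodes and v in nodes for u, v in edges)


def complement_edges(vertices, edges):
    """Edges of the complement graph (simple, undirected, no self-loops)."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v)
        for u, v in combinations(sorted(vertices), 2)
        if frozenset((u, v)) not in present
    ]


def is_clique(edges, nodes):
    """Every pair of distinct vertices in `nodes` is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(
        frozenset((u, v)) in present
        for u, v in combinations(sorted(nodes), 2)
    )


def check_equivalence(vertices, edges):
    """Check, for every subset V', that the three statements agree.

    Exhaustive over all 2^|V| subsets, so only suitable for tiny graphs.
    """
    vertices = set(vertices)
    comp = complement_edges(vertices, edges)
    for r in range(len(vertices) + 1):
        for subset in combinations(sorted(vertices), r):
            cover = set(subset)
            rest = vertices - cover
            a = is_vertex_cover(edges, cover)
            b = is_independent_set(edges, rest)
            c = is_clique(comp, rest)
            if not (a == b == c):
                return False
    return True


# Path graph 0-1-2-3 and a 5-cycle as small test cases.
path = [(0, 1), (1, 2), (2, 3)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(check_equivalence(range(4), path))
print(check_equivalence(range(5), cycle))
```

This equivalence is what makes MVC and MIS complements on the same graph, while MaxClique has to be posed on an auxiliary (complement) graph.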

Figures (2)

  • Figure 1: (Left) Reduction between tasks for pairwise transferability section: MIS/MVC are complements, but MaxClique is based on an auxiliary graph. (Right) Multi-task learning / fine-tuning architecture (pretraining set in green, fine-tuning set in purple)
  • Figure 2: Transfer performance on MVC and MIS. We compare training from scratch against frozen and unfrozen fine-tuning on the other task (MVC on MIS and MIS on MVC), the rest of the tasks (MaxCut, MaxClique, MDS, $K$-coloring), and all tasks (rest + other) for BA-small graphs.

Theorems & Definitions (1)

  • Lemma 2.1: garey2002computers