Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation

Haoyu Lei, Kaiwen Zhou, Yinchuan Li, Zhitang Chen, Farzan Farnia

TL;DR

This work proposes DIFU-Ada, a training-free inference-time adaptation framework that enables zero-shot cross-problem transfer and cross-scale generalization for diffusion-based NCO solvers.

Abstract

Diffusion-based Neural Combinatorial Optimization (NCO) has demonstrated effectiveness in solving NP-complete (NPC) problems by learning discrete diffusion models for solution generation, eliminating the need for hand-crafted domain knowledge. Despite this success, existing NCO methods face significant challenges in both cross-scale and cross-problem generalization, as well as high training costs compared to traditional solvers. While recent studies on diffusion models have introduced training-free guidance approaches that leverage pre-defined guidance functions for conditional generation, such methodologies have not been extensively explored in combinatorial optimization. To bridge this gap, we propose a training-free inference-time adaptation framework (DIFU-Ada) that enables both zero-shot cross-problem transfer and cross-scale generalization of diffusion-based NCO solvers. We provide a theoretical analysis that helps explain the cross-problem transfer capability. Our experimental results demonstrate that a diffusion solver trained exclusively on the Traveling Salesman Problem (TSP) can, through inference-time adaptation, achieve competitive zero-shot transfer performance across different problem scales on TSP variants such as Prize Collecting TSP (PCTSP) and the Orienteering Problem (OP).

Paper Structure

This paper contains 20 sections, 1 theorem, 28 equations, 7 figures, 5 tables, 1 algorithm.

Key Result

Theorem C.2

For a non-empty subset of nodes $S\subseteq V$, let $\textup{TSP}(S)$ and $\textup{argTSP}(S)$ denote the optimal cost and the optimal tours of TSP on the subgraph specified by $S$. Under Assumptions ... (we assume simpler setups for PCTSP and OP for clearer presentation of the underlying similarities), ... and the optimal tours of OP are $\textup{argTSP}(V\setminus S_{\textup{OP}})$, where ...

Figures (7)

  • Figure 1: The proposed Inference Time Adaptation framework. This approach combines (1) energy-guided sampling, which incorporates problem-specific objectives and constraints, with (2) a recursive renoising-denoising travel for solution refinement, enabling zero-shot cross-problem transfer without training. The Optimality Gap ($\downarrow$) on PCTSP-20 is reduced from $19.21\%$ to $4.20\%$.
  • Figure 2: Overview of recursive renoising-denoising travel in Inference Time Adaptation for achieving zero-shot cross-problem generalization, sequentially shifting from pre-trained problem $G$ (TSP) solution distribution to target problem distribution $G'$ (PCTSP).
  • Figure 3: Ablation studies of the number of recursive travel steps on the trade-off between optimality gap (%) and inference time (s) for PCTSP and OP.
  • Figure 4: The Optimality Gap of performing recursive travel with or without energy-guided sampling on PCTSP and OP.
  • Figure 5: Optimality gap changes with respect to the guided temperature ($-\lambda$ as the x-axis label) on PCTSP and OP.
  • ...and 2 more figures
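
The recursive renoising-denoising travel named in Figures 1 and 2 can be illustrated with a toy control-flow sketch. This is not the paper's algorithm: the pre-trained diffusion denoiser is replaced here by a greedy bit-flip sweep, and the names `renoise`, `guided_denoise`, `recursive_travel`, the parameters `steps` and `flip_prob`, and the choice to restart each round from the best solution so far are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

Solution = List[int]  # binary vector, e.g. 1 = node or edge selected


def renoise(x: Solution, flip_prob: float, rng: random.Random) -> Solution:
    """Partially corrupt a solution by flipping each bit with probability flip_prob."""
    return [1 - b if rng.random() < flip_prob else b for b in x]


def guided_denoise(x: Solution, energy: Callable[[Solution], float]) -> Solution:
    """Stand-in for an energy-guided denoiser (NOT a diffusion model):
    one greedy sweep that keeps a single-bit flip only if it lowers the energy."""
    x = list(x)
    current = energy(x)
    for i in range(len(x)):
        x[i] = 1 - x[i]
        candidate = energy(x)
        if candidate < current:
            current = candidate
        else:
            x[i] = 1 - x[i]  # undo the flip
    return x


def recursive_travel(
    x0: Solution,
    energy: Callable[[Solution], float],
    steps: int = 5,
    flip_prob: float = 0.3,
    seed: int = 0,
) -> Tuple[Solution, float]:
    """Repeat renoise -> guided denoise, keeping the lowest-energy solution seen."""
    rng = random.Random(seed)
    best, best_e = list(x0), energy(x0)
    for _ in range(steps):
        x = guided_denoise(renoise(best, flip_prob, rng), energy)
        e = energy(x)
        if e < best_e:
            best, best_e = x, e
    return best, best_e
```

In this sketch the returned energy can never exceed the energy of the starting solution, because a candidate replaces the incumbent only when it is strictly better. In the actual framework the denoising step would be carried out by the pre-trained TSP diffusion model under energy guidance rather than by a greedy sweep.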

Theorems & Definitions (5)

  • Definition 4.1: Energy Potential for Prize Collecting TSP (PCTSP)
  • Definition 4.2: Energy Potential for Orienteering Problem (OP)
  • Definition C.1: Marginal Decrease
  • Theorem C.2
  • Definition C.3: Marginal Decrease
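
The exact energy potentials of Definitions 4.1 and 4.2 are not reproduced on this page. As a rough illustration of how problem-specific objectives and constraints can be folded into a single scalar, the sketch below uses the standard textbook formulations: PCTSP minimizes tour length plus penalties of skipped nodes subject to a minimum collected prize, and OP maximizes collected prize subject to a tour-length budget. The soft-penalty form and the `weight` parameter are assumptions and may differ from the paper's definitions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def tour_length(coords: Sequence[Point], tour: Sequence[int]) -> float:
    """Length of the closed tour visiting `tour` in order (0.0 for fewer than 2 nodes)."""
    if len(tour) < 2:
        return 0.0
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def pctsp_energy(
    coords: Sequence[Point],
    tour: Sequence[int],
    prizes: Sequence[float],
    penalties: Sequence[float],
    min_prize: float,
    weight: float = 10.0,  # hypothetical constraint weight
) -> float:
    """Tour length + penalties of skipped nodes + soft penalty for prize shortfall."""
    visited = set(tour)
    skipped = sum(p for i, p in enumerate(penalties) if i not in visited)
    collected = sum(prizes[i] for i in visited)
    shortfall = max(0.0, min_prize - collected)
    return tour_length(coords, tour) + skipped + weight * shortfall


def op_energy(
    coords: Sequence[Point],
    tour: Sequence[int],
    prizes: Sequence[float],
    max_length: float,
    weight: float = 10.0,  # hypothetical constraint weight
) -> float:
    """Negative collected prize + soft penalty for exceeding the length budget."""
    collected = sum(prizes[i] for i in set(tour))
    excess = max(0.0, tour_length(coords, tour) - max_length)
    return -collected + weight * excess
```

Lower energy corresponds to a better solution in both functions, so either one could serve as the `energy` argument of a guidance or refinement routine once solutions are decoded into node sequences.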