Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation

Haoyu Lei, Kaiwen Zhou, Yinchuan Li, Zhitang Chen, Farzan Farnia

TL;DR

This work proposes DIFU-Ada, a training-free inference time adaptation framework that enables both zero-shot cross-problem transfer and cross-scale generalization in diffusion-based NCO solvers.

Abstract

Diffusion-based Neural Combinatorial Optimization (NCO) has demonstrated effectiveness in solving NP-complete (NPC) problems by learning discrete diffusion models for solution generation, eliminating the need for hand-crafted domain knowledge. Despite this success, existing NCO methods face significant challenges in both cross-scale and cross-problem generalization, and incur high training costs compared to traditional solvers. While recent studies on diffusion models have introduced training-free guidance approaches that leverage pre-defined guidance functions for conditional generation, such methods have not been extensively explored in combinatorial optimization. To bridge this gap, we propose DIFU-Ada, a training-free inference time adaptation framework that enables both zero-shot cross-problem transfer and cross-scale generalization in diffusion-based NCO solvers. We provide a theoretical analysis that helps explain the cross-problem transfer capability. Our experimental results demonstrate that a diffusion solver trained exclusively on the Traveling Salesman Problem (TSP) can, through inference time adaptation, achieve competitive zero-shot transfer performance across different problem scales on TSP variants such as the Prize Collecting TSP (PCTSP) and the Orienteering Problem (OP).

Paper Structure (20 sections, 1 theorem, 28 equations, 7 figures, 5 tables, 1 algorithm)

Key Result

Theorem C.2

For a non-empty subset of nodes $S\subseteq V$, let $\textup{TSP}(S)$ and $\textup{argTSP}(S)$ denote the optimal cost and the optimal tours of TSP on the subgraph specified by $S$. Under Assumptions ... and the optimal tours of OP are $\textup{argTSP}(V\setminus S_{\textup{OP}})$, where ...

Note on the assumptions: We assume simpler setups for PCTSP and OP for clearer presentation of the underlying similarities. These ...

Figures (7)

  • Figure 1: The proposed Inference Time Adaptation framework. This approach combines (1) energy-guided sampling, which incorporates problem-specific objectives and constraints, with (2) a recursive renoising-denoising travel for solution refinement, enabling zero-shot cross-problem transfer without training. The Optimality Gap ($\downarrow$) on PCTSP-20 is reduced from $19.21\%$ to $4.20\%$.
  • Figure 2: Overview of the recursive renoising-denoising travel in Inference Time Adaptation for achieving zero-shot cross-problem generalization, sequentially shifting from the solution distribution of the pre-trained problem $G$ (TSP) to that of the target problem $G'$ (PCTSP).
  • Figure 3: Ablation studies of the number of recursive travel steps on the trade-off between optimality gap (%) and inference time (s) for PCTSP and OP.
  • Figure 4: The Optimality Gap of performing recursive travel with or without energy-guided sampling on PCTSP and OP.
  • Figure 5: Optimality gap changes with respect to the guided temperature ($-\lambda$ as the x-axis label) on PCTSP and OP.
  • ...and 2 more figures

Theorems & Definitions (5)

  • Definition 4.1: Energy Potential for Prize Collecting TSP (PCTSP)
  • Definition 4.2: Energy Potential for Orienteering Problem (OP)
  • Definition C.1: Marginal Decrease
  • Theorem C.2
  • Definition C.3: Marginal Decrease
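
Definitions 4.1 and 4.2 are listed here by title only, so their exact form is not reproduced on this page. As a rough illustration of what an energy potential for these problems could look like, the sketch below scores a candidate tour using the standard textbook formulations: PCTSP minimizes tour length plus penalties of skipped nodes subject to a minimum collected prize, and OP maximizes collected prize subject to a tour-length budget. The function names, the `weight` parameter, and the soft-penalty treatment of constraints are all assumptions made for this example, not the paper's definitions.

```python
import math


def tour_length(coords, tour):
    """Length of the closed tour that visits the nodes in `tour` in order."""
    if len(tour) < 2:
        return 0.0
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def pctsp_energy(coords, prizes, penalties, tour, min_prize, weight=10.0):
    """Hypothetical PCTSP energy: lower is better.

    Objective: tour length + penalties of unvisited nodes.
    Constraint (soft, an assumption): collected prize >= min_prize,
    with any shortfall scaled by `weight`.
    """
    visited = set(tour)
    skipped = sum(p for i, p in enumerate(penalties) if i not in visited)
    collected = sum(prizes[i] for i in visited)
    shortfall = max(0.0, min_prize - collected)
    return tour_length(coords, tour) + skipped + weight * shortfall


def op_energy(coords, prizes, tour, budget, weight=10.0):
    """Hypothetical OP energy: lower is better.

    Objective: negative collected prize (so more prize means less energy).
    Constraint (soft, an assumption): tour length <= budget,
    with any excess scaled by `weight`.
    """
    collected = sum(prizes[i] for i in set(tour))
    excess = max(0.0, tour_length(coords, tour) - budget)
    return -collected + weight * excess


# Toy instance: three nodes forming a 3-4-5 right triangle.
coords = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
prizes = [0.0, 1.0, 1.0]
penalties = [0.0, 2.0, 2.0]

full_tour = pctsp_energy(coords, prizes, penalties, [0, 1, 2], min_prize=1.0)
short_tour = pctsp_energy(coords, prizes, penalties, [0, 1], min_prize=1.0)
over_budget = op_energy(coords, prizes, [0, 1, 2], budget=10.0)
```

On this toy instance the shorter tour `[0, 1]` has lower PCTSP energy than the full tour, because skipping node 2 costs less in penalty than the extra distance needed to reach it. In an energy-guided sampler, a score of this kind would steer sampling toward low-energy solutions of the target problem. How the paper couples the energy to the diffusion process is not described on this page.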