Divide, Harmonize, Then Conquer It: Shooting Multi-Commodity Flow Problems with Multimodal Language Models

Xinyu Yuan; Yan Qiao; Zonghui Wang; Wenzhi Chen

TL;DR

This work tackles the scalability of multi-commodity flow optimization by partitioning the problem by sources and solving subproblems in parallel with a shared multimodal language model acting as a decision-maker. Pram combines a partition module, MLM-based agents, and a lightweight adaptation framework using counterfactual policy gradients to harmonize subproblem solutions, with theoretical convergence guarantees in the MCF setting. Empirically, Pram achieves near-optimal performance, orders-of-magnitude faster runtimes on large networks, and strong robustness to unforeseen events, outperforming several ML baselines and matching LP performance when demands are well-predicted. The approach offers a practical, objective-agnostic pathway to integrating MLMs into production network allocation systems, while acknowledging limitations in fine-tuning cost and encoding biases and suggesting avenues for future refinement.
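As a rough illustration of the "partition by sources" step, the sketch below groups commodities by their source node so that each group can be handed to a separate agent. The demand tuple layout `(source, destination, volume)` and the function name `partition_by_source` are assumptions made for this example, not the paper's interface.

```python
from collections import defaultdict


def partition_by_source(demands):
    """Group commodities into one subproblem per source node.

    `demands` is an iterable of (source, destination, volume) tuples;
    this layout is an assumption for illustration only.
    """
    subproblems = defaultdict(list)
    for src, dst, volume in demands:
        subproblems[src].append((dst, volume))
    return dict(subproblems)


# Hypothetical demand matrix with three distinct sources.
demands = [("A", "B", 4.0), ("A", "C", 2.0), ("B", "C", 1.5), ("C", "A", 3.0)]
subproblems = partition_by_source(demands)

for src, commodities in subproblems.items():
    print(src, commodities)
```

Each resulting subproblem contains only the commodities that originate at one node, so the subproblems can be processed in parallel before being reconciled against shared link capacities.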

Abstract

The multi-commodity flow (MCF) problem is a fundamental topic in network flow and combinatorial optimization, with broad applications in transportation, communication, and logistics. The rapid expansion of allocation systems has made it difficult for existing optimization engines to balance optimality and tractability. In this paper, we present Pram, the first ML-based method that leverages the reasoning power of multimodal language models (MLMs) to address this trade-off, a pressing need for service providers. As part of our proposal, Pram (i) quickly computes high-quality allocations by dividing the original problem into local subproblems, each of which is then resolved by an MLM-powered "agent", and (ii) ensures global consistency by harmonizing these subproblems via a multi-agent reinforcement learning algorithm. Theoretically, we show that Pram, which learns to perform gradient descent in context, provably converges to the optimum within the family of MCF problems. Empirically, on real-world datasets and public topologies, Pram achieves performance comparable to, and in some cases even surpassing, that of linear programming solvers (very close to the optimal solution), with substantially lower runtimes (1 to 2 orders of magnitude faster). Moreover, Pram exhibits strong robustness (<10\% performance degradation under link failures or flow bursts), demonstrating the MLM's ability to generalize to unforeseen events. Pram is objective-agnostic and integrates seamlessly with mainstream allocation systems, providing a practical and scalable solution for future networks.
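The harmonization step relies on estimating how much each agent contributes to the team's global reward. One standard way to do this in multi-agent reinforcement learning is a difference reward: the global reward minus the global reward obtained when one agent's action is replaced by a default. The sketch below is a minimal illustration under stated assumptions: the global reward is taken to be the negative maximum link utilization, and the default action is "this agent sends no flow". Neither choice is confirmed by the text above, and all names are hypothetical.

```python
def global_reward(agent_loads, capacities):
    """Negative maximum link utilization over all links (assumed objective)."""
    total = {
        link: sum(loads.get(link, 0.0) for loads in agent_loads.values())
        for link in capacities
    }
    return -max(total[link] / capacities[link] for link in capacities)


def difference_rewards(agent_loads, capacities):
    """Per-agent reward: G(all agents) - G(all agents except this one)."""
    g = global_reward(agent_loads, capacities)
    rewards = {}
    for agent in agent_loads:
        # Counterfactual: drop this agent's flow, keep everyone else's.
        others = {a: l for a, l in agent_loads.items() if a != agent}
        rewards[agent] = g - global_reward(others, capacities)
    return rewards


# Hypothetical two-link network shared by two source agents.
capacities = {"e1": 10.0, "e2": 10.0}
agent_loads = {"s1": {"e1": 8.0}, "s2": {"e1": 1.0, "e2": 4.0}}
rewards = difference_rewards(agent_loads, capacities)
print(rewards)
```

In this toy instance, agent `s1` receives the more negative difference reward because removing its flow relieves the bottleneck link `e1` far more than removing the flow of `s2` does.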

Paper Structure (64 sections, 10 theorems, 41 equations, 28 figures, 5 tables, 10 algorithms)

Introduction
Problem Formulation
Pram: Partitioned Resource Allocation with MLMs
Motivations behind Pram
Achilles’ Heel of LP-based Methods
Where ML-based Methods Fall Short
Multimodal Problem Partition
Lightweight Multi-Agent Adaptation
Communication
Adaptation
Understanding Pram: Case Study and Theory
Through the Looking Glass of MCF
What Makes Pram Tick
Main Results
Experiment on Real-World Datasets
...and 49 more sections

Key Result

Theorem 1

(Solving MCF with GD) Consider a gradient descent (GD) algorithm with update rule $\pi^{(t+1)} = \pi^{(t)} - \eta v_t$. Then there exist a step size $\eta > 0$ and a finite number of iterations $T$ such that $\mathcal{L}(\pi)$, the MCF objective function, attains the optimum up to an arbitrarily small error.
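To make the update rule concrete, the following sketch runs plain gradient descent on a toy single-commodity instance with two parallel paths. It assumes that $v_t$ is the gradient of the objective at $\pi^{(t)}$ and uses a sum of squared link utilizations as a smooth stand-in objective. Both are assumptions for illustration; the theorem statement above does not specify them. Here `p` plays the role of $\pi$ (the fraction of demand routed on path 1).

```python
def objective(p, demand, c1, c2):
    """Sum of squared utilizations of two parallel paths (assumed surrogate)."""
    return (demand * p / c1) ** 2 + (demand * (1 - p) / c2) ** 2


def gradient(p, demand, c1, c2):
    """Derivative of `objective` with respect to the split ratio p."""
    return 2 * demand**2 * p / c1**2 - 2 * demand**2 * (1 - p) / c2**2


def gradient_descent(p0, eta, steps, demand, c1, c2):
    """Iterate p <- p - eta * v_t with v_t taken to be the gradient."""
    p = p0
    for _ in range(steps):
        p = p - eta * gradient(p, demand, c1, c2)
    return p


# Toy instance: unit demand, path capacities 2 and 1.
p_final = gradient_descent(p0=0.5, eta=0.1, steps=200, demand=1.0, c1=2.0, c2=1.0)
print(p_final, objective(p_final, 1.0, 2.0, 1.0))
```

For this quadratic, the minimizer is $p^* = c_1^2 / (c_1^2 + c_2^2) = 0.8$, and each step shrinks the distance to $p^*$ by the constant factor $1 - 2.5\eta = 0.75$, so a fixed step size and a finite number of iterations bring the iterate within any chosen tolerance.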

Figures (28)

Figure 1: Various real-world examples of multi-commodity flow. From left to right: wide-area network traffic engineering, urban mobility management, delivery route optimization, regional power dispatch, and tenant-aware flow control, all of which involve a very large solution space today.
Figure 2: Tradeoff space.
Figure 3: Overview of Pram. It consists of three core components: a partition module to divide the task into smaller sub-tasks, an MLM-based agent module to generate sub-task-specific answers, and an adaptation module to efficiently learn global knowledge for MCF optimization.
Figure 3: Pretrained MLM's visual comprehension example on sub-topology with overlapping elements. The task involves extracting every visible link (edge) within the specified sub-topology and recording the two endpoint nodes $(A, B)$ and the associated numerical link capacity. The ($\checkmark$) and ($\times$) indicate the model's success or failure in identifying the corresponding link and its capacity value, demonstrating the visual comprehension level of technical diagrams.
Figure 4: Illustration of Pram's adaptation framework. In (a), Pram builds inter-agent communication through LoRA; in (b), it reprograms the context using cross-attention. In (c), the policy gradient flow is computed from each agent's difference reward to estimate the contribution of its actions to the team’s global reward.
...and 23 more figures

Theorems & Definitions (26)

Theorem 1
Lemma 1
Theorem 2
Definition 1
Definition 2
Definition 3
Definition 4
Definition 5
Definition 6
Lemma 2
...and 16 more
