Divide, Harmonize, Then Conquer It: Shooting Multi-Commodity Flow Problems with Multimodal Language Models
Xinyu Yuan, Yan Qiao, Zonghui Wang, Wenzhi Chen
TL;DR
This work tackles the scalability of multi-commodity flow optimization by partitioning the problem by sources and solving subproblems in parallel with a shared multimodal language model acting as a decision-maker. Pram combines a partition module, MLM-based agents, and a lightweight adaptation framework using counterfactual policy gradients to harmonize subproblem solutions, with theoretical convergence guarantees in the MCF setting. Empirically, Pram achieves near-optimal performance, orders-of-magnitude faster runtimes on large networks, and strong robustness to unforeseen events, outperforming several ML baselines and matching LP performance when demands are well-predicted. The approach offers a practical, objective-agnostic pathway to integrating MLMs into production network allocation systems, while acknowledging limitations in fine-tuning cost and encoding biases and suggesting avenues for future refinement.
Abstract
The multi-commodity flow (MCF) problem is a fundamental topic in network flow and combinatorial optimization, with broad applications in transportation, communication, and logistics. The rapid expansion of allocation systems has made it difficult for existing optimization engines to balance optimality and tractability. In this paper, we present Pram, the first ML-based method that leverages the reasoning power of multimodal language models (MLMs) to address this trade-off, a pressing need for service providers. Pram (i) quickly computes high-quality allocations by dividing the original problem into local subproblems, each of which is solved by an MLM-powered "agent", and (ii) ensures global consistency by harmonizing these subproblems via a multi-agent reinforcement learning algorithm. Theoretically, we show that Pram, which learns to perform gradient descent in context, provably converges to the optimum within the family of MCF problems. Empirically, on real-world datasets and public topologies, Pram achieves near-optimal performance that is comparable to, and in some cases surpasses, that of linear programming solvers, with runtimes 1 to 2 orders of magnitude lower. Moreover, Pram exhibits strong robustness (less than 10% performance degradation under link failures or flow bursts), demonstrating the ability of MLMs to generalize to unforeseen events. Pram is objective-agnostic and integrates seamlessly with mainstream allocation systems, providing a practical and scalable solution for future networks.
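To make the "divide" and "harmonize" steps concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes demands are given as a mapping from (source, destination) pairs to volumes. The helper names `partition_by_source` and `aggregate_link_loads`, the toy topology, and the hand-written per-source link loads are all hypothetical; in Pram the local decisions would come from the MLM-powered agents. The sketch only illustrates why independently solved subproblems still need to be reconciled: they share link capacities.

```python
from collections import defaultdict


def partition_by_source(demands):
    """Group {(src, dst): volume} demands into one subproblem per source node."""
    subproblems = defaultdict(dict)
    for (src, dst), volume in demands.items():
        subproblems[src][dst] = volume
    return dict(subproblems)


def aggregate_link_loads(per_source_loads):
    """Sum the link loads produced independently for each source's subproblem."""
    total = defaultdict(float)
    for loads in per_source_loads.values():
        for link, load in loads.items():
            total[link] += load
    return dict(total)


# Toy instance (hypothetical): sources "a" and "b" both reach "c" through relay "m".
demands = {("a", "c"): 4.0, ("a", "d"): 2.0, ("b", "c"): 3.0}
subproblems = partition_by_source(demands)

# Hypothetical local decisions, written by hand in place of the agents' outputs.
per_source_loads = {
    "a": {("a", "m"): 6.0, ("m", "c"): 4.0, ("m", "d"): 2.0},
    "b": {("b", "m"): 3.0, ("m", "c"): 3.0},
}
total = aggregate_link_loads(per_source_loads)

# Each subproblem is feasible on its own, but the shared link is over capacity.
capacities = {("m", "c"): 5.0}
overloaded = {link for link, cap in capacities.items() if total.get(link, 0.0) > cap}
```

In this toy instance each source's allocation fits within the capacity of link ("m", "c") when considered alone, yet the combined load exceeds it. This coupling between subproblems is what a harmonization step has to resolve.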
