DFF: Decision-Focused Fine-tuning for Smarter Predict-then-Optimize with Limited Data
Jiaqi Yang, Enming Liang, Zicheng Su, Zhichao Zou, Peng Zhen, Jiecheng Guo, Wanjing Ma, Kun An
TL;DR
This work addresses decision-focused learning (DFL) in the predict-then-optimize (PO) setting when data are limited and the predictive backbone may be non-differentiable. It introduces Decision-Focused Fine-Tuning (DFF), a modular bias-correction layer that optimizes the decision loss under a trust-region constraint, so that fine-tuned predictions stay close to those of the original model. Theoretical results bound the prediction bias and control the change in prediction direction. Empirically, DFF improves downstream decision quality on synthetic network flow and portfolio problems and on real-world resource allocation tasks, including cases where the upstream model is a non-differentiable simulation model or a non-differentiable backbone such as XGBoost. The gains are robust and stable under limited data, which indicates that DFF applies broadly to PO tasks and can be combined with diverse predictive backbones. Overall, DFF provides a principled, generalizable way to improve decision quality without sacrificing the physical meaning of the predictions or requiring a differentiable upstream model.
Abstract
Decision-focused learning (DFL) offers an end-to-end approach to the predict-then-optimize (PO) framework by training predictive models directly on the decision loss (DL), which improves decision-making performance in PO settings. However, implementing DFL poses distinct challenges. First, under limited data, training on DL can cause predictions to deviate from their physical meaning. Second, some predictive models are non-differentiable or black-box and therefore cannot be adjusted with gradient-based methods. To address these challenges, we propose Decision-Focused Fine-tuning (DFF), a framework that embeds the DFL module into the PO pipeline via a novel bias correction module. DFF is formulated as a constrained optimization problem that keeps the DL-enhanced model close to the original predictive model within a defined trust region. We prove that DFF strictly confines prediction bias within a predetermined upper bound, even with limited data, thereby substantially reducing the prediction shifts that DL would otherwise cause. Furthermore, the bias correction module can be integrated into diverse predictive models, which makes it adaptable to a broad range of PO tasks. Extensive evaluations on synthetic and real-world datasets, covering network flow, portfolio optimization, and resource allocation problems with different predictive models, demonstrate that DFF improves decision performance while adhering to the fine-tuning constraints, and that it adapts robustly across scenarios.
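To make the trust-region idea concrete, the following minimal sketch illustrates a bias correction that is kept within a fixed distance of a base prediction, together with a decision loss measured as regret. It is an illustration under stated assumptions, not the formulation used in this work: the L2-ball trust region, the toy "pick the cheapest option" optimizer, the numeric values, and all function and variable names (decide, regret, project_to_trust_region, corrected_prediction, radius) are hypothetical.

```python
import numpy as np

def decide(costs):
    """Toy downstream optimizer (assumed): pick the single cheapest option."""
    z = np.zeros_like(costs, dtype=float)
    z[np.argmin(costs)] = 1.0
    return z

def regret(pred_costs, true_costs):
    """Decision loss as regret: extra true cost incurred by acting on predictions."""
    return float(true_costs @ decide(pred_costs) - true_costs.min())

def project_to_trust_region(delta, radius):
    """Project a correction onto an L2 ball of the given radius (assumed form)."""
    norm = np.linalg.norm(delta)
    if norm <= radius:
        return delta
    return delta * (radius / norm)

def corrected_prediction(base_pred, delta, radius):
    """Base prediction plus a correction that cannot leave the trust region."""
    return base_pred + project_to_trust_region(delta, radius)

# Hypothetical numbers: the base model slightly prefers option 1,
# but option 0 is truly cheapest.
base_pred = np.array([3.0, 2.9, 5.0])    # output of a frozen, possibly black-box model
true_costs = np.array([2.0, 4.0, 5.0])
raw_delta = np.array([-1.0, 2.0, 0.0])   # an unconstrained correction
radius = 0.5                             # trust-region size

fine_tuned = corrected_prediction(base_pred, raw_delta, radius)
bias = np.linalg.norm(fine_tuned - base_pred)
regret_before = regret(base_pred, true_costs)
regret_after = regret(fine_tuned, true_costs)

print(f"bias={bias:.3f}, regret before={regret_before}, after={regret_after}")
```

In this toy case the correction is shrunk to the boundary of the trust region, so the fine-tuned prediction differs from the base prediction by at most the chosen radius, yet the small shift is enough to change the decision and remove the regret. Because the base prediction enters only as a fixed input, the base model itself never needs to be differentiated.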
