LossAgent: Towards Any Optimization Objectives for Image Processing with LLM Agents
Bingchen Li, Xin Li, Yiting Lu, Zhibo Chen
TL;DR
LossAgent addresses the challenge of optimizing low-level image processing toward arbitrary, including non-differentiable, objectives by using an LLM-based loss agent. It introduces a weighted compositional loss repository and optimization-oriented prompts to translate external, possibly textual, feedback into stage-wise loss weights, enabling end-to-end training with diverse goals. The approach is validated on three image processing tasks, showing improvements over baselines across multiple IQA metrics and objective setups, and demonstrates robustness to different backbones and prompts. This work suggests a practical path toward flexible, human-aligned image processing optimization in real-world applications.
Abstract
We present the first loss agent, dubbed LossAgent, for low-level image processing tasks, e.g., image super-resolution and restoration, which aims to achieve any customized optimization objective of low-level image processing in different practical applications. Notably, not all optimization objectives, such as complex hand-crafted perceptual metrics, text descriptions, and intricate human feedback, can be instantiated with existing low-level losses, e.g., MSE loss, which presents a crucial challenge for optimizing image processing networks in an end-to-end manner. To address this, LossAgent introduces a powerful large language model (LLM) as the loss agent, whose rich textual understanding of prior knowledge gives it the potential to understand complex optimization objectives, trajectories, and state feedback from external environments during the optimization of low-level image processing networks. In particular, we establish a loss repository by incorporating existing loss functions that support end-to-end optimization for low-level image processing. Then, we design optimization-oriented prompt engineering for the loss agent to actively and intelligently decide the compositional weights for each loss in the repository at each optimization interaction, thereby achieving the required optimization trajectory for any customized optimization objective. Extensive experiments on three typical low-level image processing tasks and multiple optimization objectives show the effectiveness and applicability of the proposed LossAgent.
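
The interaction described above (a loss repository, a prompt that carries the objective and past feedback, and agent-chosen compositional weights) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `LOSS_REPOSITORY`, `build_prompt`, `parse_weights`, and `composite_loss` are hypothetical, the two-loss repository and the JSON reply format are assumptions, normalizing the weights to sum to 1 is an assumption, and the LLM call is replaced by a hand-written reply string. Images are represented as flat lists of floats rather than tensors to keep the example dependency-free.

```python
import json
from typing import Callable, Dict, List

Image = List[float]  # flattened pixel values, a stand-in for a tensor


def mse_loss(pred: Image, target: Image) -> float:
    """Mean squared error over pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred: Image, target: Image) -> float:
    """Mean absolute error over pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical loss repository: existing losses the network can be trained with.
LOSS_REPOSITORY: Dict[str, Callable[[Image, Image], float]] = {
    "mse": mse_loss,
    "l1": l1_loss,
}


def build_prompt(objective: str, history: List[dict]) -> str:
    """Assemble a text prompt from the objective and past (weights, feedback) pairs."""
    lines = [
        f"Objective: {objective}",
        f"Available losses: {', '.join(LOSS_REPOSITORY)}",
        "History (weights -> feedback):",
    ]
    for step in history:
        lines.append(f"  {json.dumps(step['weights'])} -> {step['feedback']}")
    lines.append("Reply with a JSON object mapping each loss name to a weight.")
    return "\n".join(lines)


def parse_weights(reply: str) -> Dict[str, float]:
    """Parse the agent's JSON reply, clip negatives, and normalize to sum to 1.

    Normalization is an assumption of this sketch, not a stated detail of the method.
    """
    raw = json.loads(reply)
    clipped = {name: max(0.0, float(raw.get(name, 0.0))) for name in LOSS_REPOSITORY}
    total = sum(clipped.values())
    if total == 0.0:
        # Fall back to uniform weights if the reply is unusable.
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: w / total for name, w in clipped.items()}


def composite_loss(pred: Image, target: Image, weights: Dict[str, float]) -> float:
    """Weighted sum of every loss in the repository."""
    return sum(weights[name] * fn(pred, target) for name, fn in LOSS_REPOSITORY.items())


# One optimization interaction, with a hand-written stand-in for the LLM reply.
history = [{"weights": {"mse": 1.0, "l1": 0.0}, "feedback": "score 0.52"}]
prompt = build_prompt("maximize a perceptual quality score", history)
reply = '{"mse": 1, "l1": 3}'  # assumption: a real agent would generate this from `prompt`
weights = parse_weights(reply)
loss = composite_loss([0.0, 1.0], [0.5, 0.5], weights)
```

In a training loop, `composite_loss` would be used for the next stage of optimization, after which new external feedback is appended to `history` and the agent is queried again. The external objective never needs to be differentiable, since it only influences the weights and not the gradient computation itself.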
