An Evaluation Framework for Product Images Background Inpainting based on Human Feedback and Product Consistency

Yuqi Liang; Jun Luo; Xiaoxi Guo; Jianqi Bi

TL;DR

The paper tackles evaluating AI-based background inpainting for product images, where background appropriateness and product preservation are crucial yet poorly captured by existing metrics. It introduces HFPC, a dual-module framework with an image-referenced reward model based on BLIP to rate background quality and a product-consistency evaluator using segmentation (EfficientSAM) guided by GroundingDino to ensure products remain faithful after inpainting. A large HFPC-44k dataset (~44k image pairs) with human labels is built and used to train these components, including data-balancing across 25 product categories. Empirical results show state-of-the-art precision (96.4%) and substantial reductions in manual annotation needs, with ablations and visualizations clarifying the contribution of each module and the attention mechanisms. The work promises practical impact for e-commerce pipelines and opens up future work on online feedback and reinforcement-learning–based improvements to generative models.

Abstract

In product advertising, automatically inpainting the background of product images with AI techniques has become an important task. However, current techniques still produce images with inappropriate backgrounds or inconsistent products, and most existing approaches for evaluating generated product images disagree with human feedback, so evaluation for this task still depends on manual annotation. To address these issues, this paper proposes Human Feedback and Product Consistency (HFPC), which automatically assesses generated product images using two modules. First, to address inappropriate backgrounds, human feedback on 44,000 automatically inpainted product images is collected to train a reward model based on comparative learning and multi-modal features extracted from BLIP. Second, to filter out generated images containing inconsistent products, a fine-tuned segmentation model segments the product in both the original and the generated image, and the two segmentation results are then compared. Extensive experiments demonstrate that HFPC can effectively evaluate the quality of generated product images and significantly reduce the cost of manual annotation. Moreover, HFPC achieves state-of-the-art performance (96.4% precision) compared with other open-source visual-quality-assessment models. Dataset and code are available at: https://github.com/created-Bi/background_inpainting_products_dataset
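
The product-consistency idea in the abstract (segment the product in the original and the generated image, then compare the two results) can be illustrated with a minimal sketch. The abstract does not say how the two segmentations are compared, so the intersection-over-union score, the 0.9 threshold, and the function names `mask_iou` and `is_product_consistent` are all assumptions made for illustration, not the authors' method. The masks are taken as given; in HFPC they would come from the fine-tuned segmentation model.

```python
from typing import Sequence

# Binary mask as rows of 0/1 values: 1 = product pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def mask_iou(mask_a: Mask, mask_b: Mask) -> float:
    """Intersection-over-union of two binary masks of identical shape."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as identical.
    return 1.0 if union == 0 else intersection / union


def is_product_consistent(original: Mask, generated: Mask,
                          threshold: float = 0.9) -> bool:
    """Hypothetical filter: keep the image only if the masks overlap enough.

    The threshold value is an assumption, not a number from the paper.
    """
    return mask_iou(original, generated) >= threshold


# Toy example: the generated product mask has grown by one pixel.
original_mask = [[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]]
generated_mask = [[0, 1, 1, 1],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]]

score = mask_iou(original_mask, generated_mask)
keep = is_product_consistent(original_mask, generated_mask)
```

In this toy case the masks share 4 pixels out of a union of 5, so the score falls below the assumed threshold and the generated image would be filtered out as inconsistent. A real pipeline would choose the comparison measure and threshold from validation data.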
