RePainter: Empowering E-commerce Object Removal via Spatial-matting Reinforcement Learning

Zipeng Guo, Lichen Ma, Xiaolong Fu, Gaojing Zhou, Lan Yang, Yuchen Zhou, Linkai Liu, Yu He, Ximan Liu, Shiping Dong, Jingling Fu, Zhen Chen, Yu Shi, Junshi Huang, Jason Li, Chao Gou

TL;DR

RePainter tackles the challenge of removing intrusive advertising elements from e-commerce product images by marrying reinforcement learning with diffusion-based inpainting. It introduces spatial-matting trajectory refinement to bias sampling toward background context and a local-global composite reward to avoid artifacts and reward hacking, all within a GRPO framework. The work also provides the EcomPaint-100K dataset and EcomPaint-Bench benchmark, enabling standardized evaluation in e-commerce scenarios. Empirical results show notable improvements over state-of-the-art methods in removal quality, structural coherence, and semantic validity, with strong human and GPT-4o assessments supporting practical utility.

Abstract

In web data, product images are central to boosting user engagement and advertising efficacy on e-commerce platforms, yet intrusive elements such as watermarks and promotional text remain major obstacles to delivering clear and appealing product visuals. Although diffusion-based inpainting methods have advanced, they still face challenges in commercial settings due to unreliable object removal and limited domain-specific adaptation. To tackle these challenges, we propose RePainter, a reinforcement learning framework that integrates spatial-matting trajectory refinement with Group Relative Policy Optimization (GRPO). Our approach modulates attention mechanisms to emphasize background context, generating higher-reward samples and reducing unwanted object insertion. We also introduce a composite reward mechanism that balances global, local, and semantic constraints, effectively reducing visual artifacts and reward hacking. Additionally, we contribute EcomPaint-100K, a high-quality, large-scale e-commerce inpainting dataset, and a standardized benchmark, EcomPaint-Bench, for fair evaluation. Extensive experiments demonstrate that RePainter significantly outperforms state-of-the-art methods, especially in challenging scenes with intricate compositions. We will release our code and weights upon acceptance.
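As a rough illustration of how a composite reward could feed GRPO, the sketch below scores a group of candidate inpaintings for the same input and normalizes the rewards within that group. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the equal weights, and the toy scores are hypothetical, and the weighted sum stands in for whatever combination rule the paper actually uses. The only part taken from the standard GRPO formulation is the group-relative advantage, A_i = (r_i - mean(r)) / (std(r) + eps).

```python
from statistics import mean, pstdev


def composite_reward(global_score, local_score, semantic_score,
                     weights=(1.0, 1.0, 1.0)):
    """Combine three reward terms into one scalar.

    The weighted sum and the default weights are assumptions made for
    illustration; the source only says the reward balances global,
    local, and semantic constraints.
    """
    w_g, w_l, w_s = weights
    return w_g * global_score + w_l * local_score + w_s * semantic_score


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group (GRPO-style).

    Each advantage is the reward minus the group mean, divided by the
    group standard deviation. The population standard deviation is used
    here; that choice is an implementation detail, not a claim about
    the paper.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of four candidates: (global, local, semantic) scores.
# The numbers are made up purely to exercise the functions.
group_scores = [(0.9, 0.8, 1.0), (0.5, 0.6, 0.0),
                (0.7, 0.7, 1.0), (0.3, 0.4, 0.0)]
group_rewards = [composite_reward(*scores) for scores in group_scores]
group_advantages = group_relative_advantages(group_rewards)
```

Because advantages are computed relative to the group, a candidate is only reinforced when it beats its siblings sampled from the same input, which is why sampling higher-reward trajectories in the first place matters for the policy update.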

Paper Structure

This paper contains 16 sections, 15 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: (Left) Excessive advertising elements in product images (e.g., price tags and text) often compromise the visual appeal of the images and adversely impact the user browsing experience. (Right) Previous methods (FluxControlnetInpainting2024; gong2025onereward; BlackForestLabsFlux2024; wang2025towards) tend to generate unintended objects and struggle to remove the target object’s effects, leading to unrealistic outputs. In contrast, our RePainter seamlessly removes target objects while ensuring visual coherence in the generated images.
  • Figure 2: Overview of RePainter. We propose a novel reinforcement learning framework that integrates spatial-matting trajectory refinement with GRPO. The spatial-matting module modulates attention mechanisms to optimize the sampling trajectory during denoising, expanding the exploration space and guiding the generation of higher-reward samples. These samples are then evaluated by our local-global composite reward models, which jointly assess global structural consistency, local pixel accuracy, and semantic validity. Rewards from these trajectories feed the GRPO loss, enabling online policy updates that align the model with e-commerce-specific visual preferences.
  • Figure 3: We first apply panoptic segmentation to the image to identify foreground (negative) and background (positive) regions. The spatial-matting strategy aims to make the masked area’s generation more attentive to the background context, while suppressing interference from distracting foreground objects (e.g., price tags or text), thereby reducing the generation of unwanted objects.
  • Figure 4: Qualitative results of all comparison methods in challenging scenarios. Our RePainter demonstrates superior capability in mitigating unwanted objects and preserving structural consistency.
  • Figure 5: A comparative analysis between standard GRPO training and our spatial-matting trajectory refinement. We visualize the reward curves of our three proposed reward models. After applying spatial-matting, our model achieves the best performance across all reward models while requiring fewer training iterations.
  • ...and 4 more figures
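
The spatial-matting idea described in the Figure 3 caption can be sketched as an additive bias on attention logits: queries inside the masked area get a boost toward background (positive) keys and a penalty toward foreground (negative) keys before the softmax. This is only a minimal sketch under stated assumptions. The additive form, the `beta` strength, the label encoding (+1 / -1 / 0), and the function names are all hypothetical; the captions do not specify how the paper actually modulates attention.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matting_attention(logits, query_in_mask, key_labels, beta=1.0):
    """Bias attention for masked-area queries toward background keys.

    logits        : list of rows, logits[q][k] is the raw score of
                    query q against key k
    query_in_mask : one bool per query, True if the query token lies
                    inside the region to be inpainted
    key_labels    : one int per key, +1 for background (positive),
                    -1 for foreground (negative), 0 for neutral
    beta          : hypothetical strength of the bias
    """
    weights = []
    for q, row in enumerate(logits):
        if query_in_mask[q]:
            # Only masked-area queries are modulated; all other
            # queries keep their original attention pattern.
            row = [score + beta * label
                   for score, label in zip(row, key_labels)]
        weights.append(softmax(row))
    return weights


# Two queries over three keys with identical raw scores.
# Key 0 is background, key 1 is foreground (e.g. a price tag),
# key 2 is neutral. Only the first query is inside the mask.
example_weights = matting_attention(
    logits=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    query_in_mask=[True, False],
    key_labels=[1, -1, 0],
)
```

In this toy setup the masked-area query ends up attending most to the background key and least to the foreground key, while the unmasked query stays uniform, which mirrors the stated goal of suppressing distracting foreground content during generation.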