Preserving Product Fidelity in Large Scale Image Recontextualization with Diffusion Models

Ishaan Malhi; Praneet Dutta; Ellie Talius; Sally Ma; Brendan Driscoll; Krista Holden; Garima Pruthi; Arunachalam Narayanaswamy

TL;DR

This work tackles the fidelity gap in product recontextualization by introducing a diffusion-based framework augmented with synthetic data pipelines. It combines novel view generation, background disentanglement via outpainting, and negative counterfactuals, followed by captioning, data filtering, and LoRA-based finetuning to preserve product details across diverse contexts. Post-finetuning ranking using multimodal embeddings selects high-quality generations, achieving higher human- and metric-based fidelity than baselines and enabling realistic relighting, occlusions, and novel viewpoints at scale. The approach demonstrates strong performance on ABO and a private dataset, offering practical implications for e-commerce and virtual product showcasing without requiring extensive model surgery. Overall, the paper advances scalable, high-fidelity product recontextualization by tightly integrating data augmentation, perceptual alignment, and efficient finetuning strategies.
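The post-finetuning ranking step can be pictured as scoring each generated image against the reference product in a shared embedding space. The sketch below is a minimal illustration, not the authors' implementation: it assumes embeddings have already been computed by some multimodal encoder (not shown), and it uses cosine similarity as the score, which is an assumption since the summary does not state the actual ranking criterion. The function name `rank_by_similarity` and the toy vectors are hypothetical.

```python
import numpy as np


def rank_by_similarity(reference, candidates):
    """Rank candidate embeddings by cosine similarity to a reference embedding.

    reference:  shape (d,)   embedding of the original product image (assumed precomputed)
    candidates: shape (n, d) embeddings of generated images (assumed precomputed)
    Returns (order, scores): indices from most to least similar, and the raw scores.
    """
    # Normalize so that the dot product equals cosine similarity.
    ref = reference / np.linalg.norm(reference)
    cands = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = cands @ ref
    # Negate so argsort yields descending similarity.
    order = np.argsort(-scores)
    return order, scores


# Toy 3-dimensional embeddings standing in for real encoder outputs.
reference = np.array([1.0, 0.0, 0.0])
candidates = np.array([
    [0.0, 1.0, 0.0],  # orthogonal to the reference
    [2.0, 0.0, 0.0],  # same direction as the reference
    [1.0, 1.0, 0.0],  # 45 degrees from the reference
])
order, scores = rank_by_similarity(reference, candidates)
top_k = order[:2]  # keep the two most faithful candidates
```

In practice the top-ranked candidates would be kept and the rest discarded; the cutoff (here two) is an arbitrary choice for illustration.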

Abstract

We present a framework for high-fidelity product image recontextualization using text-to-image diffusion models and a novel data augmentation pipeline. This pipeline leverages image-to-video diffusion, inpainting and outpainting, and negative examples to create synthetic training data, addressing limitations of real-world data collection for this task. Our method improves the quality and diversity of generated images by disentangling product representations and enhancing the model's understanding of product characteristics. Evaluation on the ABO dataset and a private product dataset, using automated metrics and human assessment, demonstrates the effectiveness of our framework in generating realistic and compelling product visualizations, with implications for applications such as e-commerce and virtual product showcasing.
