Human + AI for Accelerating Ad Localization Evaluation

Harshit Rajgarhia, Shivali Dalmia, Mengyang Zhao, Mukherji Abhishek, Kiran Ganesh

TL;DR

The paper addresses ad localization, where translated text must preserve the original visual layout and typography. It introduces a modular, AI-assisted pipeline that integrates scene-text detection, background inpainting, machine translation, and typography-aware reimposition, with human-in-the-loop QA. Key contributions include a CNN- and diffusion-based inpainting workflow, a font- and layout-aware text reimposition module driven by EfficientNet-B3 and LLMs, and an end-to-end evaluation across six locales that reports low perceptual distortion. The authors expect the approach to reduce manual effort and speed deployment in real-world workflows, and they identify curved or stylized text handling and semantic alignment as future work.

Abstract

Adapting advertisements for multilingual audiences requires more than simple text translation; it demands preservation of visual consistency, spatial alignment, and stylistic integrity across diverse languages and formats. We introduce a structured framework that combines automated components with human oversight to address the complexities of advertisement localization. To the best of our knowledge, this is the first work to integrate scene text detection, inpainting, machine translation (MT), and text reimposition specifically for accelerating ad localization evaluation workflows. Qualitative results across six locales demonstrate that our approach produces semantically accurate and visually coherent localized advertisements, suitable for deployment in real-world workflows.

Paper Structure

This paper contains 9 sections, 1 equation, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Ad Localization Evaluation Process
  • Figure 2: Inpainting workflow: (1) Letterboxing. (2) Inpainting.
  • Figure 3: Postprocessing workflow: (1) Removing letterbox. (2) Selective recombination.
  • Figure 4: Example of tested Ad samples