Auto prompting without training labels: An LLM cascade for product quality assessment in e-commerce catalogs

Soham Satyadharma; Fatemeh Sheikholeslami; Swati Kaul; Aziz Umit Batur; Suleiman A. Khan

TL;DR

The paper tackles scalable product-catalog quality assessment without training labels by introducing a training-free auto-prompt cascade that automatically generates and refines PC-SA instructions. It bootstraps from a small set of human prompts and iteratively creates domain-specific instructions to guide LLMs in correctness and applicability tasks across tens of thousands of PC-SA pairs and multiple languages. The approach achieves consistent 8–10% gains in precision/recall and a 99% reduction in expert prompting effort, while generalizing across languages and several foundational LLMs. This work demonstrates a practical, scalable method for domain-adapted prompt synthesis in e-commerce, with potential extensions to other catalog tasks and a need for intrinsic instruction-quality metrics.

Abstract

We introduce a novel, training-free cascade for auto-prompting Large Language Models (LLMs) to assess product quality in e-commerce. Our system requires no training labels or model fine-tuning; instead, it automatically generates and refines prompts for evaluating attribute quality across tens of thousands of product category-attribute pairs. Starting from a seed of human-crafted prompts, the cascade progressively optimizes instructions to meet catalog-specific requirements. This approach bridges the gap between general language understanding and domain-specific knowledge at scale in complex industrial catalogs. Our extensive empirical evaluations show that the auto-prompt cascade improves precision and recall by $8$-$10\%$ over traditional chain-of-thought prompting. Notably, it achieves these gains while reducing domain expert effort from 5.1 hours to 3 minutes per attribute, a $99\%$ reduction. Additionally, the cascade generalizes effectively across five languages and multiple quality assessment tasks, consistently maintaining performance gains.
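
As a quick sanity check on the reported effort reduction, the two per-attribute figures quoted above can be compared directly. The only assumption is the unit conversion of 5.1 hours to 306 minutes; the symbols $t_{\text{manual}}$ and $t_{\text{cascade}}$ are introduced here for illustration and do not come from the paper.

$$
t_{\text{manual}} = 5.1 \times 60 = 306 \text{ min}, \qquad t_{\text{cascade}} = 3 \text{ min}, \qquad 1 - \frac{t_{\text{cascade}}}{t_{\text{manual}}} = 1 - \frac{3}{306} \approx 0.990
$$

This agrees with the stated $99\%$ reduction in domain expert effort per attribute.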
