OmniSearchSage: Multi-Task Multi-Entity Embeddings for Pinterest Search
Prabhat Agarwal, Minhazul Islam Sk, Nikil Pancha, Kurchi Subhra Hazra, Jiajing Xu, Chuck Rosenberg
TL;DR
OmniSearchSage introduces a unified, multi-task, multi-entity embedding framework that learns a single query representation aligned with pins and products. It enriches entity content with GenAI captions, board titles, and engaged queries, and it uses specialized encoders and a multi-task sampled softmax loss, yielding offline and online gains across retrieval, ranking, and ads tasks. The work reports cross-language effectiveness, scalable serving with a cache-assisted architecture, and deployment across Pinterest search, with improvements in Recall@10, organic fulfillment, and ads relevance. These findings underscore the value of a unified query embedding that is compatible with existing production embeddings and can power diverse search surfaces.
Abstract
In this paper, we present OmniSearchSage, a versatile and scalable system for understanding search queries, pins, and products for Pinterest search. We jointly learn a unified query embedding coupled with pin and product embeddings, leading to improvements of $>8\%$ in relevance, $>7\%$ in engagement, and $>5\%$ in ads CTR in Pinterest's production search system. The main contributors to these gains are improved content understanding, better multi-task learning, and real-time serving. We enrich our entity representations using diverse text derived from image captions from a generative LLM, historical engagement, and user-curated boards. Our multi-task learning setup produces a single search query embedding that lies in the same space as the pin and product embeddings and is compatible with pre-existing pin and product embeddings. We show the value of each feature through ablation studies, and show the effectiveness of a unified model compared to standalone counterparts. Finally, we share how these embeddings have been deployed across the Pinterest search stack, from retrieval to ranking, scaling to serve $300k$ requests per second at low latency. Our implementation of this work is available at https://github.com/pinterest/atg-research/tree/main/omnisearchsage.
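
To make the multi-task setup concrete, the following is a minimal sketch of a softmax loss summed over several query-entity tasks. It is an illustration under stated assumptions, not the released implementation: it uses only in-batch negatives, applies no sampling-probability correction, and takes precomputed embeddings as input. The function names, the task names ("pin", "product"), the temperature value, and the per-task weights are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def in_batch_softmax_loss(query_emb, entity_emb, temperature=0.07):
    """Softmax loss with in-batch negatives (assumed simplification).

    Row i of query_emb is the positive match for row i of entity_emb;
    every other row in the batch serves as a negative for query i.
    """
    q = l2_normalize(query_emb)
    e = l2_normalize(entity_emb)
    logits = q @ e.T / temperature
    # Subtract the row-wise max for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The diagonal holds the log-probability of each true pair.
    return float(-np.mean(np.diag(log_probs)))


def multi_task_loss(query_emb_by_task, entity_emb_by_task, weights=None):
    """Weighted sum of per-task losses; task names and weights are hypothetical."""
    total = 0.0
    for task, q in query_emb_by_task.items():
        w = 1.0 if weights is None else weights.get(task, 1.0)
        total += w * in_batch_softmax_loss(q, entity_emb_by_task[task])
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, dim = 8, 16
    # In a real system one shared query encoder would produce these.
    queries = {"pin": rng.normal(size=(batch, dim)),
               "product": rng.normal(size=(batch, dim))}
    entities = {"pin": rng.normal(size=(batch, dim)),
                "product": rng.normal(size=(batch, dim))}
    print(multi_task_loss(queries, entities))
```

Because every task scores against the same query embedding space, minimizing the summed loss pushes a single query representation toward the matching pins and products at once, which is the property the abstract describes.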
