JungleGPT: Designing and Optimizing Compound AI Systems for E-Commerce

Sherry Ruan, Tian Zhao

TL;DR

Real-world e-commerce demands scalable, low-latency AI across a globally distributed user base, but monolithic LLMs struggle with data locality, read-heavy workloads, and long-tail user needs. JungleGPT proposes a compound AI architecture with a Copilot on the critical path, edge-backed Caching Nodes, and asynchronously updated LLM Nodes to separate data access from inference. The design integrates on-path fast models, edge caches, and language-aware small LLMs with rerankers to achieve significant cost and latency benefits. The work offers a practical, scalable path for deploying AI in e-commerce and highlights a design blueprint for future compound AI systems.

Abstract

LLMs have significantly advanced the e-commerce industry by powering applications such as personalized recommendations and customer service. However, most current efforts focus solely on monolithic LLMs and fall short in addressing the complexity and scale of real-world e-commerce scenarios. In this work, we present JungleGPT, the first compound AI system tailored for real-world e-commerce applications. We outline the system's design and the techniques used to optimize its performance for practical use cases, which have been shown to reduce inference costs to less than 1% of those of a powerful monolithic LLM.
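
To make the separation of data access from inference more concrete, the following is a minimal Python sketch of the three roles named in the TL;DR. It is an illustration under stated assumptions, not the authors' implementation: the class and method names (`CachingNode`, `LLMNode`, `Copilot`, `answer`, `process_pending`), the query normalization, and the fallback message on a cache miss are all hypothetical, and the `generate` callable merely stands in for a small LLM with a reranker.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Optional


@dataclass
class CachingNode:
    """Hypothetical edge cache: a key-value store read on the critical path."""
    store: Dict[str, str] = field(default_factory=dict)

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def put(self, key: str, value: str) -> None:
        self.store[key] = value


@dataclass
class LLMNode:
    """Hypothetical off-path worker that fills the cache asynchronously."""
    generate: Callable[[str], str]  # stand-in for a small LLM plus reranker
    cache: CachingNode
    pending: Deque[str] = field(default_factory=deque)

    def enqueue(self, key: str) -> None:
        # Avoid queuing the same key twice while it is still pending.
        if key not in self.pending:
            self.pending.append(key)

    def process_pending(self) -> int:
        """Drain the queue and return the number of entries written.

        Called explicitly here so the sketch stays deterministic; a real
        deployment would presumably run this in the background.
        """
        done = 0
        while self.pending:
            key = self.pending.popleft()
            self.cache.put(key, self.generate(key))
            done += 1
        return done


@dataclass
class Copilot:
    """Hypothetical critical-path component: reads the cache, never runs the LLM."""
    cache: CachingNode
    llm_node: LLMNode
    fallback: str = "Results are being prepared."  # assumed miss behavior

    def answer(self, query: str) -> str:
        key = query.strip().lower()  # assumed normalization
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        # Miss: schedule off-path generation and respond immediately.
        self.llm_node.enqueue(key)
        return self.fallback
```

In this sketch a first request for an unseen query returns the fallback at once and queues the work, and a repeat of the same query after `process_pending()` has run is served from the cache. That is the property the design relies on for read-heavy workloads: expensive inference is paid once off the critical path and amortized over many reads.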

Paper Structure

This paper contains 5 sections, 1 figure.

Figures (1)

  • Figure 1: JungleGPT Compound AI System Design