Cortex AISQL: A Production SQL Engine for Unstructured Data

Paweł Liskowski, Benjamin Han, Paritosh Aggarwal, Bowei Chen, Boxin Jiang, Nitish Jindal, Zihan Li, Aaron Lin, Kyle Schmaus, Jay Tayade, Weicheng Zhao, Anupam Datta, Nathan Wiegand, Dimitris Tsirogiannis

TL;DR

AISQL addresses the gap between structured SQL and unstructured data by integrating LLM-driven semantic operators directly into SQL. It introduces six operators (AI_COMPLETE, AI_FILTER, AI_JOIN, AI_CLASSIFY, AI_AGG, AI_SUMMARIZE_AGG) and a Cortex-based production architecture, augmented by three techniques: AI-aware query optimization, adaptive model cascades, and query rewriting for semantic joins. Production workload analysis shows AI operators dominate costs and many queries are multi-table, motivating cost-aware planning and join-optimization techniques. Experimental results demonstrate substantial speedups and maintained quality, enabling end-to-end analytics with no data movement across hybrid data assets.

Abstract

Snowflake's Cortex AISQL is a production SQL engine that integrates native semantic operations directly into SQL. This integration allows users to write declarative queries that combine relational operations with semantic reasoning, enabling them to query both structured and unstructured data effortlessly. However, making semantic operations efficient at production scale poses fundamental challenges. Semantic operations are more expensive than traditional SQL operations, possess distinct latency and throughput characteristics, and their cost and selectivity are unknown during query compilation. Furthermore, existing query engines are not designed to optimize semantic operations. The AISQL query execution engine addresses these challenges through three novel techniques informed by production deployment data from Snowflake customers. First, AI-aware query optimization treats AI inference cost as a first-class optimization objective, reasoning about large language model (LLM) cost directly during query planning to achieve 2-8$\times$ speedups. Second, adaptive model cascades reduce inference costs by routing most rows through a fast proxy model while escalating uncertain cases to a powerful oracle model, achieving 2-6$\times$ speedups while maintaining 90-95% of oracle model quality. Third, semantic join query rewriting lowers the quadratic time complexity of join operations to linear through reformulation as multi-label classification tasks, achieving 15-70$\times$ speedups with often improved prediction quality. AISQL is deployed in production at Snowflake, where it powers diverse customer workloads across analytics, search, and content understanding.
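The cascade idea in the abstract can be illustrated with a small sketch. This is not the AISQL implementation: the function name `cascade_filter`, the fixed thresholds `low` and `high`, and the toy proxy and oracle functions are all assumptions made for illustration, and the production system is described as adaptive rather than using fixed thresholds. The sketch only shows the routing pattern: a cheap proxy scores every row, confident scores are accepted directly, and only uncertain rows are escalated to the expensive oracle.

```python
from typing import Callable, List, Sequence, Tuple


def cascade_filter(
    rows: Sequence[str],
    proxy: Callable[[str], float],
    oracle: Callable[[str], bool],
    low: float = 0.2,
    high: float = 0.8,
) -> Tuple[List[bool], int]:
    """Route rows through a proxy first and escalate uncertain ones.

    `proxy` returns a score in [0, 1]; `oracle` returns the final decision.
    The thresholds are hypothetical, fixed values chosen for this sketch.
    Returns the per-row decisions and the number of oracle calls made.
    """
    decisions: List[bool] = []
    oracle_calls = 0
    for row in rows:
        score = proxy(row)
        if score >= high:
            # Proxy is confident the predicate holds.
            decisions.append(True)
        elif score <= low:
            # Proxy is confident the predicate does not hold.
            decisions.append(False)
        else:
            # Uncertain region: pay for the oracle model.
            oracle_calls += 1
            decisions.append(oracle(row))
    return decisions, oracle_calls


# Toy stand-ins for the two models (assumptions, not real LLM calls).
def toy_proxy(text: str) -> float:
    t = text.lower()
    if "refund" in t:
        return 0.95
    if "thanks" in t:
        return 0.05
    return 0.5


def toy_oracle(text: str) -> bool:
    t = text.lower()
    return "broken" in t or "refund" in t


rows = [
    "Please refund my order",
    "Thanks, all good",
    "The device arrived broken",
    "Where is the manual?",
]
decisions, oracle_calls = cascade_filter(rows, toy_proxy, toy_oracle)
print(decisions, oracle_calls)
```

In this toy run only two of the four rows reach the oracle. The speedup in a real cascade depends on how often the proxy is confident and on how the thresholds are chosen, which this sketch does not model.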

Paper Structure

This paper contains 22 sections, 12 figures, 4 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Snowflake architecture with the Cortex Platform for AI inference. The Cortex Platform adds Inference Engines, Scheduler, and API Service components to support both interactive and batch AISQL workloads.
  • Figure 2: Percentage composition of AISQL workloads by statement type. SELECT queries constitute the majority of production workloads.
  • Figure 3: AISQL query execution time distribution by number of tables. Multi-table queries exhibit higher execution times than single-table queries.
  • Figure 4: Cost breakdown of AISQL queries. Percentage of total credit usage by statement type, showing the relative contributions of model inference (AI credits) and relational processing (Warehouse credits).
  • Figure 5: Distribution of tables used in AISQL queries. Most AISQL queries ($61\%$) involve a single table, while multi-table queries account for nearly $40\%$ of workloads.
  • ...and 7 more figures