Table of Contents
Fetching ...

Premature Dimensional Collapse and Tensor-based Execution Paths for High-Dimensional Relational Operations in Cost-Based Database Systems

Il-Sun Chang

TL;DR

This work identifies a structural failure mode in which intermediate representations are prematurely linearized under memory pressure, causing disproportionate I/O amplification and phase-transition-like latency behavior, and proposes a tensor-based execution path that delays premature linearization and preserves higher-dimensional locality through late materialization and structured intermediate layouts.

Abstract

Modern cost-based DBMSs frequently exhibit execution instability and tail-latency amplification when high-dimensional relational operations trigger memory-regime transitions such as hash-table spilling and external materialization. We identify a structural failure mode in which intermediate representations are prematurely linearized under memory pressure, causing disproportionate I/O amplification and phase-transition-like latency behavior. To mitigate this, we propose a tensor-based execution path that delays premature linearization and preserves higher-dimensional locality through late materialization and structured intermediate layouts. Using a modified PostgreSQL-based prototype and controlled microbenchmarks, we show that under constrained memory settings (e.g., work_mem=1MB) conventional execution can spill hundreds of megabytes and exceed multi-second P99 latency, while the proposed path maintains stable execution and reduces P99 latency to sub-second levels. Our results suggest that representation timing is a first-class design variable for execution stability, complementing traditional optimization efforts focused on cardinality estimation and operator throughput.

Premature Dimensional Collapse and Tensor-based Execution Paths for High-Dimensional Relational Operations in Cost-Based Database Systems

TL;DR

This work identifies a structural failure mode in which intermediate representations are prematurely linearized under memory pressure, causing disproportionate I/O amplification and phase-transition-like latency behavior, and proposes a tensor-based execution path that delays premature linearization and preserves higher-dimensional locality through late materialization and structured intermediate layouts.

Abstract

Modern cost-based DBMSs frequently exhibit execution instability and tail-latency amplification when high-dimensional relational operations trigger memory-regime transitions such as hash-table spilling and external materialization. We identify a structural failure mode in which intermediate representations are prematurely linearized under memory pressure, causing disproportionate I/O amplification and phase-transition-like latency behavior. To mitigate this, we propose a tensor-based execution path that delays premature linearization and preserves higher-dimensional locality through late materialization and structured intermediate layouts. Using a modified PostgreSQL-based prototype and controlled microbenchmarks, we show that under constrained memory settings (e.g., work_mem=1MB) conventional execution can spill hundreds of megabytes and exceed multi-second P99 latency, while the proposed path maintains stable execution and reduces P99 latency to sub-second levels. Our results suggest that representation timing is a first-class design variable for execution stability, complementing traditional optimization efforts focused on cardinality estimation and operator throughput.
Paper Structure (20 sections, 2 equations, 7 figures)

This paper contains 20 sections, 2 equations, 7 figures.

Figures (7)

  • Figure 1: Scalability collapse of CPU Hash Join
  • Figure 2: Premature dimensional collapse caused by early execution decisions
  • Figure 3: Growth of Intermediate Hash Table Due to Premature Linearization
  • Figure 4: Tail Latency Analysis: CPU Execution Stability
  • Figure 5: Impact of High-Dimensional Sorting (Single vs Multi-Key)
  • ...and 2 more figures