Partial Adaptive Indexing for Approximate Query Answering

Stavros Maroulis; Nikos Bikakis; Vassilis Stamatopoulos; George Papastefanatos

TL;DR

Problem: interactive exploration of very large raw data files requires fast response times, often at the cost of exact results. Approach: the paper proposes Partial Adaptive Indexing, which combines a hierarchical tile-based index holding per-tile aggregates with a confidence-interval framework to provide approximate query answers and guide selective index refinement, operating in situ on raw data. Contributions: a formal optimization problem for selecting which partially contained tiles to process under an error bound $\phi$; a scoring policy $s(t)=\alpha \cdot w(t) + (1-\alpha)/\text{count}(t \cap Q)$; and a preliminary evaluation demonstrating speedups. Findings: preliminary results indicate substantial speedups during early exploration, with manageable trade-offs between accuracy and latency.
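
To make the scoring policy concrete, the following is a minimal sketch of how tiles partially contained in a query $Q$ might be ranked and selected under an error bound $\phi$. Only the formula for $s(t)$ comes from the summary above. The `Tile` class, the reading of $w(t)$ as a normalized uncertainty contribution, the additive uncertainty model, and the greedy highest-score-first loop are illustrative assumptions, not the paper's algorithm.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    # Hypothetical container; field names are illustrative only.
    name: str
    width: float          # w(t): assumed normalized uncertainty contributed by the tile
    count_in_query: int   # count(t ∩ Q): objects of t inside the query window (> 0)


def score(tile: Tile, alpha: float) -> float:
    """s(t) = alpha * w(t) + (1 - alpha) / count(t ∩ Q)."""
    return alpha * tile.width + (1 - alpha) / tile.count_in_query


def select_tiles(tiles, alpha, phi):
    """Greedy sketch: process the highest-scoring tiles until the remaining
    uncertainty (assumed to be the sum of the unprocessed widths) is <= phi."""
    remaining = sum(t.width for t in tiles)
    selected = []
    for t in sorted(tiles, key=lambda t: score(t, alpha), reverse=True):
        if remaining <= phi:
            break
        selected.append(t)
        remaining -= t.width  # processing a tile removes its uncertainty
    return selected, remaining


# Three partially contained tiles with made-up statistics.
tiles = [
    Tile("t1", width=0.125, count_in_query=100),
    Tile("t2", width=0.25, count_in_query=50),
    Tile("t3", width=0.5, count_in_query=10),
]
selected, remaining = select_tiles(tiles, alpha=0.5, phi=0.4)
print([t.name for t in selected], remaining)
```

With these made-up numbers only `t3` is processed before the remaining uncertainty drops below $\phi = 0.4$. Setting $\phi = 0$ forces every partially contained tile to be processed, which corresponds to exact query answering.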

Abstract

In data exploration, users need to analyze large data files quickly, aiming to minimize data-to-analysis time. While recent adaptive indexing approaches address this need, there are cases where they demonstrate poor performance, particularly during the initial queries, in regions with a high density of objects, and on very large files over commodity hardware. This work introduces an approach for adaptive indexing driven by both query workload and user-defined accuracy constraints to support approximate query answering. The approach is based on partial index adaptation, which reduces the costs associated with reading data files and refining indexes. We leverage a hierarchical tile-based indexing scheme and its stored metadata to provide efficient query evaluation, ensuring accuracy within user-specified bounds. Our preliminary evaluation demonstrates an improvement in query evaluation time, especially during initial user exploration.
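
As an illustration of how stored tile metadata can bound an approximate answer, the sketch below computes a deterministic interval for a SUM query without reading the raw file. It assumes each partially contained tile exposes the number of its objects that fall inside the query and the minimum and maximum of the aggregated attribute. The function names and this particular choice of metadata are assumptions made for the example and are not taken from the paper.

```python
def bounds_for_sum(exact_sum, partial_tiles):
    """Return (low, high) bounds for a SUM query.

    exact_sum     -- sum over tiles fully contained in the query (known exactly)
    partial_tiles -- list of (count_in_query, min_val, max_val) per partially
                     contained tile (hypothetical metadata layout)
    """
    low = exact_sum + sum(c * mn for c, mn, _ in partial_tiles)
    high = exact_sum + sum(c * mx for c, _, mx in partial_tiles)
    return low, high


def estimate_with_error(low, high):
    """Midpoint estimate and its worst-case relative error."""
    estimate = (low + high) / 2
    half_width = (high - low) / 2
    rel_error = half_width / abs(estimate) if estimate != 0 else float("inf")
    return estimate, rel_error


# Made-up example: fully contained tiles sum to 1000; two tiles overlap partially.
low, high = bounds_for_sum(1000.0, [(10, 2.0, 6.0), (5, 0.0, 4.0)])
estimate, rel_error = estimate_with_error(low, high)
print(low, high, estimate, rel_error)
```

If the resulting relative error is already within the user's bound, no tile needs to be read from the file or split. Otherwise, some partially contained tiles would have to be processed to tighten the interval.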

Paper Structure (7 sections, 1 equation, 2 figures)

Introduction
Framework Overview
Exploration Model
Indexing Scheme
Partial Index Adaptation for Approximate Query Answering
Approach Overview
Preliminary Evaluation

Figures (2)

Figure 1: Index Adaptation Example. (a) Initial index structure; (b) exact query answering, splitting tiles $t_1$ and $t_3$; (c) approximate query answering, splitting only $t_3$ and providing results within user accuracy constraints.
Figure 2: Evaluation Time for Different Error Bounds
