Towards a Flexible Scale-out Framework for Efficient Visual Data Query Processing

Rohit Verma, Arun Raghunath

TL;DR

This paper tackles the challenge of efficiently executing visual data queries that comprise multiple compute- and I/O-intensive operations. It introduces VDMS-Async, an extension of VDMS that supports user-defined operations and remote offload through an asynchronous, event-driven pipeline. The approach yields 2–3x improvements in query time over state-of-the-art systems and achieves linear scale-out when adding remote servers, with up to a 64x reduction in execution time using 64 remotes. Extensive experiments on image (LFW) and video (Kinetics-400) datasets against PostgreSQL, VDMS, and Scanner demonstrate substantial gains, especially for compute-intensive pipelines and under high concurrency. The work lays the groundwork for scalable, end-to-end visual data management with flexible offloading and asynchronous processing, while outlining directions for distributed deployments and smarter offloading decisions.

Abstract

There is growing interest in visual data management systems that support queries with specialized operations ranging from resizing an image to running complex machine learning models. With so many operations available, the basic goal of returning query responses in minimal time suffers, especially when a client runs several such operations in a single query. Existing systems take an ad-hoc approach in which separate solutions are combined to form an end-to-end visual data management system. In contrast, the Visual Data Management System (VDMS) natively executes queries with multiple operations, thus providing an end-to-end solution. However, a fixed subset of native operations and a synchronous threading architecture limit its generality and scalability. In this paper, we develop VDMS-Async, which adds the capability to run user-defined operations with VDMS and to execute operations within a query on a remote server. VDMS-Async uses an event-driven architecture to create an efficient pipeline for executing operations within a query. Our experiments show that VDMS-Async reduces query execution time by 2-3X compared to existing state-of-the-art systems. Further, remote operations coupled with an event-driven architecture enable VDMS-Async to reduce query execution time linearly with each additional remote server. We demonstrate a 64X reduction in query execution time when adding 64 remote servers.
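To make the event-driven idea concrete, the following is a minimal Python sketch, not the VDMS-Async implementation. It assumes a query applies a fixed chain of operations to each entity and that one of them involves a wait on a remote server. The operation names (`resize`, `remote_detect`), the delays, and the use of `asyncio` are illustrative assumptions only and are not taken from the paper.

```python
import asyncio
from typing import List

# Hypothetical operations; names and delays are illustrative assumptions.
async def resize(item: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a local, native operation
    return f"resized({item})"

async def remote_detect(item: str) -> str:
    await asyncio.sleep(0.05)  # stands in for a round trip to a remote server
    return f"detected({item})"

# The operation pipeline attached to a query, applied in order to each entity.
PIPELINE = [resize, remote_detect]

async def run_pipeline(item: str) -> str:
    # Operations on a single entity still execute sequentially.
    for op in PIPELINE:
        item = await op(item)
    return item

async def run_query(items: List[str]) -> List[str]:
    # Each entity is driven by its own task, so time spent waiting on one
    # entity's remote call is used to make progress on the others.
    return await asyncio.gather(*(run_pipeline(i) for i in items))

def execute(items: List[str]) -> List[str]:
    return asyncio.run(run_query(items))
```

Calling `execute(["img0", "img1"])` returns the results in input order. In a synchronous design the per-entity waits would add up, whereas here they overlap, which is the property the abstract attributes to the event-driven pipeline.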

Paper Structure (35 sections, 29 figures)

This paper contains 35 sections, 29 figures.

Figures (29)

  • Figure 1: VDMS query with operation pipeline
  • Figure 2: Architecture for User Defined Operations
  • Figure 3: Example User-defined operation
  • Figure 4: Architecture for Remote Operations
  • Figure 5: Example Remote operation
  • ...and 24 more figures