SkyHOST: A Unified Architecture for Cross-Cloud Hybrid Object and Stream Transfer

Muhammad Arslan Tariq, Grégoire Danoy, Pascal Bouvry

Abstract

Cloud and big data workloads increasingly distribute data across multiple cloud providers and regions to support rapid decision-making and analytics. Traditional transfer tools typically specialize in a single paradigm, either stream replication or bulk transfer. This specialization forces users to deploy and manage separate systems, each with its own configuration, for each transfer pattern. This paper presents SkyHOST (Hybrid Object and Stream Transfer), a unified data movement architecture built on the Skyplane framework that bridges bulk object transfer and streaming workloads through a single control plane and CLI. SkyHOST uses URI-based routing to automatically select the appropriate transfer mechanism, supporting both record-level ingestion of structured data and chunk-based transfer of large binary objects. Through an environmental monitoring use case and an empirical evaluation, we demonstrate that SkyHOST provides operational simplicity by consolidating heterogeneous data movement patterns under a single control plane while achieving competitive throughput for cross-region transfers.
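
To make the idea of URI-based routing concrete, the following is a minimal sketch of how a source and destination URI could be mapped to a transfer mode. It is not SkyHOST's implementation: the scheme names (`s3`, `gs`, `azure`, `kafka`), the helper names `endpoint_kind` and `select_transfer`, and the mode labels are all assumptions chosen for illustration.

```python
from urllib.parse import urlparse

# Assumed scheme sets for illustration only; the schemes a real
# deployment accepts may differ.
OBJECT_SCHEMES = {"s3", "gs", "azure"}
STREAM_SCHEMES = {"kafka"}


def endpoint_kind(uri: str) -> str:
    """Classify a URI as an 'object' or 'stream' endpoint by its scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme in OBJECT_SCHEMES:
        return "object"
    if scheme in STREAM_SCHEMES:
        return "stream"
    raise ValueError(f"unsupported URI scheme: {scheme!r}")


def select_transfer(src: str, dst: str) -> str:
    """Return a hypothetical transfer-mode label such as 'object-to-stream'."""
    return f"{endpoint_kind(src)}-to-{endpoint_kind(dst)}"


if __name__ == "__main__":
    # Hypothetical endpoints; bucket, broker, and topic names are placeholders.
    print(select_transfer("s3://example-bucket/readings/", "kafka://broker:9092/readings"))
    print(select_transfer("kafka://broker-a:9092/events", "kafka://broker-b:9092/events"))
```

In this sketch the caller never names a transfer mechanism explicitly; the pair of URI schemes determines it, which is the property that lets a single CLI cover both bulk and streaming transfers.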

Paper Structure

This paper contains 33 sections, 5 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: SkyHOST Unified System Architecture for Hybrid Object and Stream Transfer
  • Figure 2: SkyHOST end-to-end data flow to support object-to-stream and stream-to-stream transfers
  • Figure 3: Comparing the analytical model estimation with actual measurements in Kafka-to-Kafka replication as message size varies from 1 KB to 1000 KB
  • Figure 4: Kafka-to-Kafka replication throughput comparison between SkyHOST and Confluent's Kafka Replicator across varying partition counts (100 KB messages, 32 MB batching)
  • Figure 5: Comparing the estimation for the analytical model with actual measurements in S3-to-Kafka transfer as chunk size varies from 1 MB to 96 MB
  • ...and 1 more figure