Benchmarking Different Application Types across Heterogeneous Cloud Compute Services

Nivedhitha Duggi; Masoud Rafiei; Mohsen Amini Salehi

TL;DR

This paper benchmarks heterogeneous cloud compute services on two cloud providers (AWS and Chameleon) for three application domains: DNN inference for the industrial Oil & Gas sector, ML inference for assistive technology, and video transcoding. It runs dataset-driven workloads on multiple VM types to quantify performance variability and inform resource allocation decisions. The authors apply statistical analyses (Shapiro-Wilk and Kolmogorov-Smirnov tests) and report means and standard deviations to characterize inference times, alongside FFmpeg-based video transcoding benchmarks covering single-parameter and multi-parameter scenarios, including merging tasks. The work provides a public, reproducible benchmark suite and datasets that researchers can use to assess latency, throughput, and energy-conscious deployment strategies on heterogeneous cloud infrastructures.
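As a rough illustration of the kind of summary described above, the following minimal sketch computes the mean, the sample standard deviation, and a Kolmogorov-Smirnov statistic against a normal distribution fitted to the data. It is not the authors' code: the function name `summarize_inference_times` and the sample values are hypothetical, and only the Python standard library is used. Because the normal parameters are estimated from the same sample, the statistic is descriptive only; a p-value for either test would normally come from a statistics package.

```python
from statistics import NormalDist, mean, stdev


def summarize_inference_times(times_ms):
    """Return mean, sample std, and a KS statistic against a fitted normal."""
    n = len(times_ms)
    if n < 2:
        raise ValueError("need at least two samples")
    mu = mean(times_ms)
    sigma = stdev(times_ms)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("all samples are identical; normal fit is undefined")
    fitted = NormalDist(mu, sigma)
    # KS statistic: largest gap between the empirical CDF and the fitted CDF,
    # checked just before and just after each step of the empirical CDF.
    d = 0.0
    for i, x in enumerate(sorted(times_ms), start=1):
        cdf = fitted.cdf(x)
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return {"mean": mu, "std": sigma, "ks_statistic": d}


# Hypothetical inference times (ms) for one application on one VM type.
sample = [101.2, 98.7, 103.5, 99.9, 100.4, 102.1, 97.8, 100.9]
print(summarize_inference_times(sample))
```

A small KS statistic means the empirical distribution stays close to the fitted normal curve; running the same summary per application and per VM type gives comparable rows for a results table.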

Abstract

Infrastructure as a Service (IaaS) clouds have become the predominant infrastructure underlying modern smart technologies. IaaS clouds are useful for several reasons, including reduced cost, increased speed and efficiency, and better reliability and scalability. The compute services offered by such clouds are heterogeneous: they offer a set of architecturally diverse machines, each suited to executing different workloads efficiently. However, few studies have examined the performance of popular application types on these heterogeneous compute servers across different clouds. Such a study can help organizations employ cloud compute services optimally in terms of cost, latency, throughput, energy consumption, carbon footprint, and so on. At the HPCC lab, we have developed such benchmarks in different research projects, and in this report we curate them in a single document to help other researchers in the community use them. Specifically, we introduce our benchmark datasets for three application types in three different domains: Deep Neural Network (DNN) inference for industrial applications, Machine Learning (ML) inference for assistive technology applications, and video transcoding for multimedia use cases.

Paper Structure (30 sections, 7 figures, 5 tables)

Overview
Benchmarking Structure
Application Types
Heterogeneous Resources in Cloud Platforms
Benchmark I: DNN Applications in Industrial (Oil & Gas) Use Case
Overview
DNN Inference Time for O&G Applications
Cloud Platforms Used for Benchmarking the Industrial Applications
Benchmarking Methodology for Industrial Applications on Heterogeneous Clouds
Analysis of Inference Time of the Applications
Shapiro-Wilk Test for Normality of the Data
Kolmogorov-Smirnov Goodness of Fit Test
Mean and Standard Deviation of Inference Execution Times
Benchmark II: Machine Learning Inference Benchmarking on Heterogeneous Cloud Resources for Assistive Technology
Overview
...and 15 more sections

Figures (7)

Figure 1: Benchmarking structure
Figure 2: Chart comparing the Image Classification Inference times across the Machines
Figure 3: The average inference times for Object Detection across the different AWS VM types.
Figure 4: Average inference times of different machines performing question answering tasks.
Figure 5: Average inference times of heterogeneous AWS machines performing speech recognition.
...and 2 more figures
