Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges

Georgios L. Stavrinides, Helen D. Karatza

TL;DR

The chapter surveys the scheduling of data-intensive workloads in large-scale distributed systems, emphasizing data locality, QoS, and energy efficiency. It classifies workloads as fine-grained parallel, coarse-grained parallel (workflow), or embarrassingly parallel (BoT), and reviews the corresponding scheduling strategies, such as gang scheduling, workflow heuristics, and BoT heuristics, alongside data-locality-aware frameworks such as MapReduce/Hadoop. Major challenges include exploiting data locality, meeting time constraints, ensuring fault tolerance, and reducing energy consumption, with service-level agreements guiding the trade-offs. The authors also discuss recent trends, such as VM live migration and approximate computations, together with techniques such as bin packing, checkpointing, and DVFS, and outline directions for future research and practical deployment in cloud and grid environments.

Abstract

With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity. Data-intensive applications may have different degrees of parallelism and must effectively exploit data locality. Furthermore, they may impose several Quality of Service requirements, such as time constraints and resilience against failures, as well as other objectives, like energy efficiency. These features of the workloads, as well as the inherent characteristics of the computing resources required to process them, present major challenges that require the employment of effective scheduling techniques. In this chapter, a classification of data-intensive workloads is proposed and an overview of the most commonly used approaches for their scheduling in large-scale distributed systems is given. We present novel strategies that have been proposed in the literature and shed light on open challenges and future directions.

Paper Structure

This paper contains 34 sections, 9 figures, 1 table.

Figures (9)

  • Figure 1: Typical parameters that characterize a task of an application submitted for execution in a large-scale distributed system.
  • Figure 2: An example of a fine-grained parallel application. The frequently communicating tasks of the application form a gang of $N$ parallel tasks. The communication between the tasks is depicted with arrows.
  • Figure 3: Example of gang scheduling in a system with three processors $p_{1}$, $p_{2}$ and $p_{3}$. The first gang consists of the tasks $n_{1}^{1}$ and $n_{2}^{1}$, scheduled on processors $p_{1}$ and $p_{2}$, respectively. The second gang consists of the tasks $n_{1}^{2}$, $n_{2}^{2}$ and $n_{3}^{2}$, scheduled on processors $p_{1}$, $p_{2}$ and $p_{3}$, respectively. The third gang consists of the tasks $n_{1}^{3}$ and $n_{2}^{3}$, scheduled on processors $p_{2}$ and $p_{3}$, respectively. It can be observed that the processor $p_{3}$ remains idle during the execution of the tasks $n_{1}^{1}$ and $n_{2}^{1}$ of the first gang. This is due to the fact that the task $n_{3}^{2}$ at the head of its queue cannot start execution, because according to the gang scheduling technique, it must start execution at the same time as the other tasks of its gang, $n_{1}^{2}$ and $n_{2}^{2}$, which are scheduled on the other processors that are currently busy.
  • Figure 4: An example of a coarse-grained parallel application (workflow application), represented as a Directed Acyclic Graph (DAG). The number in each node denotes the computational cost of the represented task. The number on each edge denotes the communication cost between the two tasks that it connects. The critical path of the DAG is depicted with thick arrows.
  • Figure 5: An embarrassingly parallel application, consisting of $N$ independent parallel tasks. Such applications are commonly referred to as Bag-of-Tasks (BoT) applications.
  • ...and 4 more figures
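
The gang scheduling behaviour described in the caption of Figure 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one FIFO queue per processor, gangs enqueued in arrival order, and a gang that starts only when each of its tasks is at the head of its queue and all of its processors are free. The function name `gang_schedule`, the gang labels, and the service times (4, 3 and 2 time units) are hypothetical; the caption gives no durations.

```python
from collections import deque


def gang_schedule(gangs, num_procs):
    """Simulate gang scheduling with one FIFO queue per processor.

    gangs: list of (name, processor_indices, duration) in arrival order.
    Returns a dict mapping each gang name to its (start, finish) times.
    """
    # Each processor queue holds the gangs that placed a task on it.
    queues = [deque() for _ in range(num_procs)]
    for name, procs, _ in gangs:
        for p in procs:
            queues[p].append(name)

    free_at = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}

    while len(schedule) < len(gangs):
        progressed = False
        for name, procs, duration in gangs:
            if name in schedule:
                continue
            # A gang may start only if all of its tasks head their queues.
            if all(queues[p] and queues[p][0] == name for p in procs):
                # All tasks start together, once the last processor is free.
                start = max(free_at[p] for p in procs)
                finish = start + duration
                for p in procs:
                    queues[p].popleft()
                    free_at[p] = finish
                schedule[name] = (start, finish)
                progressed = True
        if not progressed:
            raise RuntimeError("no gang can start; queue orders conflict")
    return schedule


# Gang-to-processor mapping from Figure 3 (p1, p2, p3 -> indices 0, 1, 2).
# The durations are assumed for illustration only.
figure3_gangs = [
    ("gang1", (0, 1), 4),
    ("gang2", (0, 1, 2), 3),
    ("gang3", (1, 2), 2),
]
example = gang_schedule(figure3_gangs, num_procs=3)
for gang_name, (start, finish) in example.items():
    print(gang_name, start, finish)
```

Under these assumed durations, the second gang cannot start before time 4, so processor $p_{3}$ stays idle while the first gang runs on $p_{1}$ and $p_{2}$, which is the effect the caption points out.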