D&A: Resource Optimisation in Personalised PageRank Computations Using Multi-Core Machines

Kai Siong Yow, Chunbo Li

TL;DR

The paper reframes resource optimisation as minimising the number of cores $\mathcal{C}$ needed to process $\mathcal{X}$ personalised PageRank queries within a time bound $\mathcal{T}$ on multi-core machines. It introduces the Divide and Allocate (D&A) framework, which uses a sample-based preprocessing step to estimate per-query times, then partitions the remaining workload into parallel slots and assigns queries to cores so that the deadline is met; a real-world variant, D&A_Real, accounts for a restricted number of cores. A theoretical lower bound obtained via Hoeffding’s inequality provides a baseline for comparison. Empirical evaluation on four large graphs with FORA-based PPR shows reductions in core usage of up to 73.68%, supporting the approach for scalable, resource-efficient graph analytics in cloud and data-centre environments.

Abstract

Resource optimisation is commonly used in workload management to ensure efficient and timely task completion with the available resources. Because it serves to minimise costs, numerous algorithms have been developed to this end. The majority of these techniques focus on scheduling and executing workloads effectively within the provided resource constraints. In this paper, we tackle the problem using another approach. We propose a novel framework, D&A, to determine the number of cores required to complete a workload under a time constraint. We first preprocess a small portion of the queries to derive the number of required slots, and then allocate the remaining workload across these slots. We introduce a scaling factor to handle fluctuations in running time caused by random functions. We further establish a lower bound on the number of cores required under this scenario, which serves as a baseline for comparison. We evaluate the framework on the computation of personalised PageRank values, which is computationally intensive. Our experimental results show that D&A surpasses the baseline, achieving reductions in the required number of cores ranging from 38.89% to 73.68% across benchmark datasets comprising millions of vertices and edges.
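
To make the divide-and-allocate idea in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It rests on several assumptions that the text above does not confirm: the sample queries are run sequentially and their total time is subtracted from the time bound, the largest sample time (multiplied by the scaling factor) is taken as the per-query cost, and each core processes one query per slot. The names `estimate_cores`, `allocate`, `sample_times`, `time_budget` and `scale` are hypothetical.

```python
import math


def estimate_cores(sample_times, total_queries, time_budget, scale=1.0):
    """Return (slots, cores) for the queries left after sampling.

    Illustrative assumption: every remaining query costs at most
    scale * max(sample_times), and a core handles one query per slot.
    """
    if not sample_times:
        raise ValueError("need at least one sample query time")
    t_max = max(sample_times)
    if t_max <= 0 or scale <= 0:
        raise ValueError("sample times and scale must be positive")

    # Assumption: samples run sequentially, so they consume this much time.
    remaining_time = time_budget - sum(sample_times)

    # Pad the worst observed time by the scaling factor to absorb fluctuation.
    per_query = scale * t_max

    # Number of queries one core can finish back to back before the deadline.
    slots = math.floor(remaining_time / per_query)
    if slots < 1:
        raise ValueError("time budget too small for even one slot")

    remaining_queries = total_queries - len(sample_times)
    cores = max(1, math.ceil(remaining_queries / slots))
    return slots, cores


def allocate(query_ids, slots):
    """Split the remaining queries into per-core batches of at most `slots`."""
    return [query_ids[i:i + slots] for i in range(0, len(query_ids), slots)]


if __name__ == "__main__":
    slots, cores = estimate_cores([1.0, 2.0, 1.5], total_queries=103,
                                  time_budget=24.5)
    batches = allocate(list(range(100)), slots)
    print(slots, cores, len(batches))  # prints "10 10 10"
```

In this sketch a larger `scale` yields fewer slots and therefore more cores, which is one way to picture the trade-off that a scaling factor introduces; the paper's actual formulas may differ.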

Paper Structure

This paper contains 13 sections, 2 theorems, 9 equations, 3 figures, 1 table, 2 algorithms.

Key Result

Lemma 1

For a multi-core machine without any constraint on the number of cores, suppose $\mathcal{X}$ is the number of queries, $s$ is the number of sample queries and $t_{i}$ is the running time to complete the $i^{th}$ sample query. If $t_{max} = \max\{t_{i} \vert i=1,\ldots,s\}$, then the minimum number

Figures (3)

  • Figure 1: An illustration of D&A
  • Figure 2: Results for the minimum number of required cores on four benchmark datasets, varying the number $\mathcal{X}$ of queries. The bar chart indicates the processing time, whereas the line graphs indicate the number of required cores for D&A_Real (in red) and the theoretical bound derived from Hoeffding's inequality (in blue)
  • Figure 3: A comparison of different values of the scaling factor $d$ for Web-Stanford

Theorems & Definitions (4)

  • Lemma 1
  • proof
  • Lemma 2
  • proof