Coffee: Cost-Effective Edge Caching for 360 Degree Live Video Streaming

Chen Li, Tingwei Ye, Tongyu Zong, Liyang Sun, Houwei Cao, Yong Liu

TL;DR

The paper tackles the bandwidth and latency challenges of live 360° video by introducing Coffee, an edge caching framework that uses collaborative FoV prediction and predictive tile prefetching to achieve high tile-hit gains with small edge storage. It extends Coffee to TransCoffee, a transcoding-aware variant that quantifies the joint benefits of caching and edge transcoding using a Shapley-value-based gain model. In experiments driven by real traces, Coffee reduces backhaul traffic by up to 76%, while TransCoffee lowers streaming cost by up to 63% compared to state-of-the-art baselines. The work shows that edge caching for live 360° streaming is practical and viable in real time, enabling scalable QoE improvements under constrained edge resources.

Abstract

While live 360 degree video streaming delivers an immersive viewing experience, it poses significant bandwidth and latency challenges for content delivery networks. Edge servers are expected to play an important role in facilitating live streaming of 360 degree videos. In this paper, we propose a novel predictive edge caching algorithm (Coffee) for live 360 degree video that employs collaborative FoV prediction and predictive tile prefetching to reduce bandwidth consumption and streaming cost and to improve streaming quality and robustness. Our lightweight caching algorithms exploit the unique tile consumption patterns of live 360 degree video streaming to achieve high tile caching gains. Through extensive experiments driven by real 360 degree video streaming traces, we demonstrate that edge caching algorithms specifically designed for live 360 degree video streaming can achieve high streaming cost savings with small edge cache space consumption. Coffee, guided by viewer FoV predictions, reduces backhaul traffic by up to 76% compared to state-of-the-art edge caching algorithms. Furthermore, we develop a transcoding-aware variant (TransCoffee) and evaluate it using comprehensive experiments, which demonstrate that TransCoffee can achieve 63% lower cost compared to state-of-the-art transcoding-aware approaches.

Paper Structure (22 sections, 15 equations, 9 figures, 3 tables, 2 algorithms)

Figures (9)

  • Figure 1: System Workflow of Edge-assisted Live 360 Degree Video Streaming.
  • Figure 2: In a streaming system, if the current time is $\tau$, $u_3$ is the user with the shortest playback latency and will watch segment $t-2$. $u_2$ will watch the segment $t-3$, and $u_1$ will watch the segment $t-6$. If the buffer length is 2, $u_1$ needs to predict FoV for the segment $t-4$, which has been watched by $u_2$ and $u_3$. Therefore, we can use the FoVs of $u_2$ and $u_3$ on the segment $t-4$ for collaborative FoV prediction. The weight used to combine the FoV information from $u_2$ and $u_3$ is based on the similarity between $u_1$ and these two users.
  • Figure 3: The aggregation of scores from $n$ viewers for a tile $c$ yields a caching score $S_a(c,t)$ at the current time $t$. Each viewer has a single contribution $S(u,c,t)$ to the score.
  • Figure 4: Once an aggregated predictive caching score has been obtained, a penalty curve will be applied to penalize scores at extended prediction horizons, as accuracy may decrease at these points.
  • Figure 5: The y-axis in Fig. (a) is the ratio of overlap between the viewer's actual FoV and the predicted FoV at different prediction horizon lengths. In Fig. (b), we calculate the L2 loss between the predicted tile probability and the tile distribution in the viewer's actual FoV. If a tile partially overlaps the FoV, its value is the fraction of tile pixels in the FoV; if a tile is completely in the FoV, its value is 1. After normalization, we obtain a distribution over all tiles for the user's actual FoV, and Fig. (b) reports the L2 loss between these two distributions. Since the buffer length is 2 seconds, we only consider prediction accuracy beyond the buffer length.
  • ...and 4 more figures
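
The captions of Figures 2 to 4 describe a pipeline: combine the FoVs of leading viewers using similarity weights, aggregate per-viewer contributions into a caching score for a tile, then penalize scores at longer prediction horizons. The Python sketch below illustrates that flow under stated assumptions. The excerpt does not give the paper's similarity measure, score formula, or penalty curve, so the normalized weighted average, the plain sum, and the geometric `decay` used here are hypothetical stand-ins, as are all function names.

```python
from typing import Dict, List

TileProbs = Dict[int, float]  # tile id -> probability that the tile is in the FoV


def collaborative_fov(leader_fovs: List[TileProbs], similarities: List[float]) -> TileProbs:
    """Similarity-weighted average of FoVs already observed from leading viewers.

    Assumption: weights are normalized to sum to 1; the paper's exact
    combination rule is not stated in this excerpt.
    """
    total = sum(similarities)
    if total <= 0:
        raise ValueError("at least one positive similarity weight is required")
    combined: TileProbs = {}
    for fov, sim in zip(leader_fovs, similarities):
        for tile, p in fov.items():
            combined[tile] = combined.get(tile, 0.0) + (sim / total) * p
    return combined


def aggregate_score(viewer_scores: List[float]) -> float:
    """Hypothetical S_a(c, t): sum of the per-viewer contributions S(u, c, t)."""
    return sum(viewer_scores)


def penalized_score(score: float, horizon: int, decay: float = 0.8) -> float:
    """Assumed geometric penalty that discounts scores at longer horizons."""
    return score * (decay ** horizon)


# u2 and u3 have already watched the segment that u1 must predict;
# u1 is assumed to be more similar to u3 than to u2.
fov_u2 = {0: 1.0, 1: 0.5}
fov_u3 = {1: 1.0, 2: 0.5}
pred_u1 = collaborative_fov([fov_u2, fov_u3], similarities=[0.25, 0.75])

# Aggregate three viewers' contributions for one tile, then penalize
# because the prediction is two segments ahead.
tile_score = penalized_score(aggregate_score([0.9, 0.6, 0.5]), horizon=2)
```

With these inputs, tile 1 receives the highest predicted probability for `u1` because both leading viewers had it in view and the more similar viewer saw it fully.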
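
The accuracy metric described in the Figure 5 caption can be sketched as follows: each tile gets the fraction of its pixels inside the actual FoV, the values are normalized into a distribution, and that distribution is compared with the predicted tile probabilities. This is a minimal illustration, not the authors' evaluation code. The function names are hypothetical, and "L2 loss" is read here as the sum of squared per-tile differences, which is an assumption (a Euclidean norm would be another reasonable reading).

```python
from typing import List


def actual_tile_distribution(coverage: List[float]) -> List[float]:
    """Normalize per-tile FoV coverage (fraction of tile pixels in the FoV) to sum to 1."""
    total = sum(coverage)
    if total == 0:
        raise ValueError("the FoV covers no tiles")
    return [c / total for c in coverage]


def l2_loss(predicted: List[float], actual: List[float]) -> float:
    """Sum of squared per-tile differences (assumed reading of 'L2 loss')."""
    if len(predicted) != len(actual):
        raise ValueError("both distributions must cover the same tiles")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


# One tile fully inside the FoV, two half covered, one outside.
coverage = [1.0, 0.5, 0.5, 0.0]
actual = actual_tile_distribution(coverage)  # [0.5, 0.25, 0.25, 0.0]

# A uniform prediction over the four tiles, used only as an example input.
predicted = [0.25, 0.25, 0.25, 0.25]
loss = l2_loss(predicted, actual)  # 0.125
```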