Scheduling with Uncertain Holding Costs and its Application to Content Moderation

Caner Gocmen, Thodoris Lykouris, Deeksha Sinha, Wentao Weng

TL;DR

The paper studies scheduling with uncertain holding costs in content moderation, where view-driven costs evolve according to a Markovian tree. It develops the Opportunity-adjusted Remaining Cost (OaRC) index policy, grounded in a Markovian ski-rental analogy and a fluid relaxation. The regret of OaRC scales as $\tilde{O}(L^{1.5}\sqrt{N})$, where $L$ is the maximum length of a job's holding cost trajectory and $N$ is the system size, so the policy is asymptotically optimal as $N$ grows and the bound is independent of the state-space size. A data-driven variant, HOaRC, uses hindsight approximations to predict future costs, enabling practical deployment. Across synthetic and real data, HOaRC reduces policy-violating views and saves reviewer-hours relative to canonical baselines in large-scale human moderation pipelines.

Abstract

In content moderation for social media platforms, the cost of delaying the review of a piece of content is proportional to its view trajectory, which fluctuates and is a priori unknown. Motivated by such uncertain holding costs, we consider a queueing model where job states evolve based on a Markov chain with state-dependent instantaneous holding costs. We demonstrate that in the presence of such uncertain holding costs, the two canonical algorithmic principles, instantaneous-cost ($c\mu$-rule) and expected-remaining-cost ($c\mu/\theta$-rule), are suboptimal. By viewing each job as a Markovian ski-rental problem, we develop a new index-based algorithm, Opportunity-adjusted Remaining Cost (OaRC), that adjusts to the opportunity of serving jobs in the future when uncertainty partly resolves. We show that the regret of OaRC scales as $\tilde{O}(L^{1.5}\sqrt{N})$, where $L$ is the maximum length of a job's holding cost trajectory and $N$ is the system size. This regret bound shows that OaRC achieves asymptotic optimality when the system size $N$ scales to infinity. Moreover, its regret is independent of the state-space size, which is a desirable property when job states contain contextual information. We corroborate our results with an extensive simulation study based on two holding cost patterns (online ads and user-generated content) that arise in content moderation for social media platforms. Our simulations based on synthetic and real datasets demonstrate that OaRC consistently outperforms existing practice, which is based on the two canonical algorithmic principles.
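
To make the two canonical indices concrete, the following is a minimal sketch on a toy chain. Everything in it is an assumption for illustration: the state names, costs, and transition probabilities in `CHAIN` are hypothetical, and the expected remaining cost is computed under the simplifying assumption that an unserved job keeps accruing holding costs along its trajectory until the trajectory ends. It is not the paper's OaRC index.

```python
from functools import lru_cache

# Hypothetical toy chain (not from the paper):
# state -> (instantaneous holding cost, {next_state: transition probability}).
# A state with no outgoing transitions ends the job's trajectory.
CHAIN = {
    "video_new": (1.0, {"video_viral": 0.1, "video_quiet": 0.9}),
    "video_viral": (20.0, {}),
    "video_quiet": (0.0, {}),
    "photo": (2.0, {}),
}


def instantaneous_cost(state):
    """Index used by the instantaneous-cost principle: the current cost only."""
    return CHAIN[state][0]


@lru_cache(maxsize=None)
def expected_remaining_cost(state):
    """Index used by the expected-remaining-cost principle.

    Assumes the job is never served, so it pays the current cost plus the
    expected cost of every state it may visit afterwards.
    """
    cost, transitions = CHAIN[state]
    return cost + sum(
        prob * expected_remaining_cost(nxt) for nxt, prob in transitions.items()
    )


if __name__ == "__main__":
    for state in CHAIN:
        print(state, instantaneous_cost(state), expected_remaining_cost(state))
```

In this toy chain both indices for `video_new` are computed before the job's uncertainty resolves; one period later the job is in either `video_viral` or `video_quiet`, and the two cases carry very different costs. That gap is the kind of information the abstract refers to when it speaks of serving jobs "when uncertainty partly resolves".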

Paper Structure

This paper contains 64 sections, 22 theorems, 95 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Lemma 3.1

For the dual program $D^\star(\gamma)$, (i) it has an optimal solution $\boldsymbol{\beta}^\star(\gamma)$ with $\beta_i^\star(\gamma) = \max\left\{0, c(i) + V^f(\gamma, i) - \gamma\right\}$ and (ii) $D^\star(\gamma) = \mu \cdot \gamma + \lambda\left(c^f(\mathrm{r}) - V(\gamma,\mathrm{r})\right).$
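
The two closed forms in Lemma 3.1 can be evaluated directly once their ingredients are known. The sketch below is only an illustration: this summary does not define $V^f(\gamma, i)$, $V(\gamma, \mathrm{r})$, or $c^f(\mathrm{r})$, so they are treated as given inputs, and every numeric value except $\lambda = 0.8$ and $\mu = 0.7$ (taken from the Figure 2 caption) is hypothetical. The function names are likewise made up for this example.

```python
def dual_solution(c, v_future, gamma):
    """Part (i): beta_i = max{0, c(i) + V^f(gamma, i) - gamma} for each state i.

    `c` and `v_future` are per-state lists holding c(i) and V^f(gamma, i);
    both are assumed to be precomputed elsewhere.
    """
    return [max(0.0, c_i + v_i - gamma) for c_i, v_i in zip(c, v_future)]


def dual_value(mu, lam, gamma, c_f_root, v_root):
    """Part (ii): D(gamma) = mu * gamma + lam * (c^f(r) - V(gamma, r))."""
    return mu * gamma + lam * (c_f_root - v_root)


if __name__ == "__main__":
    # Hypothetical per-state inputs for three states.
    c = [1.0, 0.5, 3.0]
    v_future = [0.5, 0.0, 2.0]
    gamma = 2.0

    print(dual_solution(c, v_future, gamma))
    # lam and mu follow the Figure 2 caption; the last two arguments are made up.
    print(dual_value(mu=0.7, lam=0.8, gamma=gamma, c_f_root=3.0, v_root=1.5))
```

With these inputs only the third state has $c(i) + V^f(\gamma, i) > \gamma$, so it is the only state with a positive dual variable.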

Figures (7)

  • Figure 1: Existing algorithmic principles (instantaneous or expected remaining cost) always serve new Video jobs. Waiting for one period improves our decisions by distinguishing the Video jobs.
  • Figure 2: An example of the water-filling procedure for constructing $(\boldsymbol{q}^{\boldsymbol{o}}, \boldsymbol{\nu}^{\boldsymbol{o}})$. There are six states, $\mathcal{S} = \{1,2,3,4,5,6\}$, with transition probabilities as given in the leftmost panel. The arrival rate and service rate are $\lambda = 0.8$ and $\mu = 0.7$. The priority ordering is $(2,3,5,4,6,1)$. States with stripes are blocked states (State $2$ is fully blocked, State $5$ is partially blocked). State $4$ is the partially-served state and State $6$ is a partially-reduced state. State $3$ is an empty state and State $1$ is an un-reduced state.
  • Figure 3: View trajectories of five content pieces with highest cumulative views in the three datasets
  • Figure 4: Ads: reduced policy-violating views (%) and reviewer-hour savings by $\textsc{HOaRC}$
  • Figure 5: UGC: reduced policy-violating views (%) and reviewer-hour savings by $\textsc{HOaRC}$
  • ...and 2 more figures

Theorems & Definitions (48)

  • Lemma 3.1
  • Theorem 1
  • Remark 1
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5
  • Proof of Theorem \ref{thm:pafou}
  • Lemma 4.6
  • ...and 38 more