Profit Maximization for a Robotics-as-a-Service Model
Joo Seung Lee, Anil Aswani
TL;DR
This work addresses profit maximization for a Robotics-as-a-Service (RaaS) operator that manages a single robot under sequential customer demand. It combines survival-analysis-based degradation modeling with inverse-optimization-driven learning to inform an MDP-based policy for joint pricing and robot replacement. A three-phase online learning framework first learns customer utilities, then degradation dynamics, and finally the optimal control policy. Empirical results from a discrete-time simulator show that the approach achieves near-optimal profit, that utility estimates converge faster than degradation estimates, and that the resulting policy has an interpretable structure that balances revenue, holding costs, failures, and replacement costs. The framework offers a principled, data-driven way to manage pricing and lifecycle decisions in RaaS, with potential extensions to richer customer behaviors and stochastic decision rules.
Abstract
The growth of Robotics-as-a-Service (RaaS) presents new operational challenges, particularly in optimizing business decisions such as pricing and equipment management. While much research focuses on the technical aspects of RaaS, the strategic business problem of joint pricing and replacement has been less explored. This paper addresses profit maximization for an RaaS operator that runs a single robot at a time. We formulate a model in which jobs arrive sequentially and, for each job, the operator sets a price that the customer can accept or reject. Upon job completion, the robot undergoes stochastic degradation, which increases its probability of failure in future tasks. The operator must then decide whether to replace the robot, balancing replacement costs against future revenue potential and holding costs. To solve this sequential decision-making problem, we develop a framework that integrates data-driven estimation techniques inspired by survival analysis and inverse optimization to learn models of customer behavior and robot failure. These models are used within a Markov decision process (MDP) framework to compute an optimal policy for joint pricing and replacement. Numerical experiments demonstrate the efficacy of our approach in maximizing profit by adaptively managing pricing and robot lifecycle decisions.
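To make the MDP step above more concrete, the following is a minimal sketch of value iteration on a toy joint pricing and replacement problem. It is not the authors' formulation: the discrete degradation levels, the linear acceptance curve, the failure probabilities, the cost values, the timing of the replacement decision, and the function name `solve_raas_mdp` are all hypothetical choices made only for illustration. In the paper's framework, the acceptance and failure models would instead be learned from data.

```python
import numpy as np


def solve_raas_mdp(n_states=6, prices=None, gamma=0.95, hold_cost=0.5,
                   replace_cost=20.0, failure_cost=10.0, degrade_prob=0.4,
                   tol=1e-8, max_iter=10_000):
    """Value iteration on a toy pricing/replacement MDP (all numbers hypothetical)."""
    prices = (np.linspace(1.0, 10.0, 19) if prices is None
              else np.asarray(prices, dtype=float))
    # Assumed demand model: acceptance probability falls linearly with price.
    accept = np.clip(1.0 - prices / 12.0, 0.0, 1.0)
    # Assumed failure model: failure probability grows with degradation level.
    fail = np.linspace(0.02, 0.6, n_states)
    # Degradation moves the robot one level up, capped at the worst level.
    worse = np.minimum(np.arange(n_states) + 1, n_states - 1)

    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Discounted continuation value after a successful job, per state.
        after_success = gamma * (degrade_prob * V[worse]
                                 + (1.0 - degrade_prob) * V)
        # A failure incurs a penalty and forces a replacement (fresh robot).
        after_failure = -failure_cost - replace_cost + gamma * V[0]
        # Expected value of an accepted job for each (state, price) pair.
        job = ((1.0 - fail)[:, None] * (prices[None, :] + after_success[:, None])
               + fail[:, None] * after_failure)
        # Keep the robot: pay the holding cost, then the job is accepted or not.
        q_keep = (-hold_cost + accept[None, :] * job
                  + (1.0 - accept)[None, :] * gamma * V[:, None])
        keep_value = q_keep.max(axis=1)
        # Replace first: pay the replacement cost, then act as a fresh robot.
        replace_value = -replace_cost + keep_value[0]
        V_new = np.maximum(keep_value, replace_value)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    replace = replace_value > keep_value
    best_price = prices[q_keep.argmax(axis=1)]
    # After a replacement the operator prices as for a fresh robot.
    best_price = np.where(replace, best_price[0], best_price)
    return V, replace, best_price


if __name__ == "__main__":
    V, replace, best_price = solve_raas_mdp()
    for s in range(len(V)):
        print(f"level {s}: value={V[s]:.2f} replace={bool(replace[s])} "
              f"price={best_price[s]:.2f}")
```

In this toy setting the value function is non-increasing in the degradation level, so the replacement rule takes a threshold form: once replacing is preferred at some level, it is preferred at every worse level. Whether the paper's learned policy has the same structure depends on its actual model, which this sketch does not reproduce.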
