Value Iteration for Learning Concurrently Executable Robotic Control Tasks
Sheikh A. Tahmid, Gennaro Notomista
TL;DR
This work addresses concurrent task execution in redundant robotic systems by introducing a notion of task independence for learned cost-to-go functions. It develops a cost functional and a continuous fitted value iteration (CFVI) approach that trains tasks so that their gradients are independent or orthogonal, which allows the tasks to be executed simultaneously through a min-norm controller. Theoretical results (including Propositions 2–3) connect independence and orthogonality to feasible multi-task control and give an analytic form of the optimal input for the learned costs, $u^* = -\tfrac{1}{2} R(x)^{-1} (L_g J(x))^T$. Experiments across several mobile-robot scenarios, both simulated and physical, demonstrate improved concurrency and online adaptation of task stacks, supporting the method as a practical route to multi-task robotic control.
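As a minimal sketch of how the analytic input $u^* = -\tfrac{1}{2} R(x)^{-1} (L_g J(x))^T$ could be evaluated numerically, the snippet below assumes a control-affine system $\dot{x} = f(x) + g(x)u$ and reads $L_g J(x)$ as the row vector $\nabla J(x)\, g(x)$. The function name `optimal_input` and the numerical values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def optimal_input(grad_J, g, R):
    """Evaluate u* = -1/2 R^{-1} (L_g J)^T at a single state.

    Assumptions (not from the source): grad_J is the gradient of the learned
    cost-to-go J at x with shape (n,), g is the input matrix g(x) with shape
    (n, m), and R is the positive-definite input weight R(x) with shape (m, m).
    """
    grad_J = np.asarray(grad_J, dtype=float)
    g = np.asarray(g, dtype=float)
    R = np.asarray(R, dtype=float)
    LgJ = grad_J @ g  # L_g J(x) as a length-m vector
    # Solve R u = L_g J^T instead of forming the inverse explicitly.
    return -0.5 * np.linalg.solve(R, LgJ)

# Illustrative values only: a 2-state, 2-input single-integrator-like system.
grad_J = np.array([2.0, -4.0])
g = np.eye(2)
R = 2.0 * np.eye(2)
u_star = optimal_input(grad_J, g, R)
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; with a learned $J$, `grad_J` would typically come from automatic differentiation of the value-function approximator.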
Abstract
Many modern robotic systems, such as multi-robot systems and manipulators, exhibit redundancy, a property that allows them to execute multiple tasks. This work proposes a novel method, based on the Reinforcement Learning (RL) paradigm, to train redundant robots to execute multiple tasks concurrently. Our approach differs from typical multi-objective RL methods in that the learned tasks can be combined and executed in prioritized stacks, which may vary over time. We do so by first defining a notion of task independence between learned value functions. We then use this definition to propose a cost functional that encourages a policy, based on an approximated value function, to accomplish its control objective while minimally interfering with the execution of higher-priority tasks. This allows us to train a set of control policies that can be executed simultaneously. We also introduce a version of fitted value iteration that efficiently learns an approximation of our proposed cost functional. We demonstrate our approach on several scenarios and robotic systems.
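To make the idea of independent or orthogonal task gradients concrete, the following sketch checks both properties at a single state. It assumes one plausible reading in which each task $i$ contributes a row vector $L_g J_i(x)$, tasks count as independent when these rows are linearly independent, and as orthogonal when they are pairwise orthogonal. The paper's formal definition may differ, and the helper names `tasks_independent` and `tasks_orthogonal` are hypothetical.

```python
import numpy as np

def tasks_independent(LgJ_rows, tol=1e-8):
    """True if the stacked rows L_g J_i(x) are linearly independent (assumed reading)."""
    A = np.atleast_2d(np.asarray(LgJ_rows, dtype=float))
    return bool(np.linalg.matrix_rank(A, tol=tol) == A.shape[0])

def tasks_orthogonal(LgJ_rows, tol=1e-8):
    """True if the stacked rows L_g J_i(x) are pairwise orthogonal (assumed reading)."""
    A = np.atleast_2d(np.asarray(LgJ_rows, dtype=float))
    gram = A @ A.T                            # entry (i, j) is <L_g J_i, L_g J_j>
    off_diag = gram - np.diag(np.diag(gram))  # keep only cross-task inner products
    return bool(np.all(np.abs(off_diag) <= tol))

# Illustrative rows only: two tasks on a system with two inputs.
orthogonal_pair = [[1.0, 0.0], [0.0, 1.0]]
independent_pair = [[1.0, 0.0], [1.0, 1.0]]
dependent_pair = [[1.0, 2.0], [2.0, 4.0]]
```

Under this reading, orthogonal rows with nonzero norm are always independent, while independent rows need not be orthogonal, as `independent_pair` illustrates.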
