Tianshou: a Highly Modularized Deep Reinforcement Learning Library
Jiayi Weng, Huayu Chen, Dong Yan, Kaichao You, Alexis Duburcq, Minghao Zhang, Yi Su, Hang Su, Jun Zhu
TL;DR
Tianshou addresses the rigidity and fragmentation of deep reinforcement learning (DRL) tooling with a highly modular, PyTorch-based library assembled from reusable building blocks. It standardizes training into three paradigms (on-policy, off-policy, and offline), organizes the library into a four-layer architecture, and ships a comprehensive MuJoCo benchmark that shows ~15% higher median performance than reference implementations. The work emphasizes reliability, cross-platform usability, and practical tooling (replay buffers, collectors, EnvPool, logging), which supports fast prototyping and robust comparisons in small- to mid-scale DRL research. Overall, Tianshou aims to accelerate research workflows by providing a flexible, well-tested infrastructure together with a transparent, open benchmark suite.
Abstract
In this paper, we present Tianshou, a highly modularized Python library for deep reinforcement learning (DRL) that uses PyTorch as its backend. Tianshou aims to be research-friendly by providing a flexible and reliable infrastructure for DRL algorithms. It supports online and offline training with more than 20 classic algorithms through a unified interface. To facilitate related research and demonstrate Tianshou's reliability, we have released a benchmark on MuJoCo environments that covers eight classic algorithms with state-of-the-art performance. Tianshou is open-sourced at https://github.com/thu-ml/tianshou/.
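To make the idea of a unified interface over online (on-policy and off-policy) and offline training more concrete, the following is a minimal standard-library sketch. It is not Tianshou's API: the toy `ReplayBuffer`, `collect`, and `train` below, the random-walk "environment", and all parameter names are hypothetical and chosen only for illustration. The point is that one loop can serve all three paradigms, with the paradigm deciding only whether new data is collected and how the buffer is consumed.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO store of transitions (toy class, not Tianshou's)."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def add(self, transition):
        self.data.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement, capped at the buffer size.
        items = list(self.data)
        return rng.sample(items, min(batch_size, len(items)))

    def clear(self):
        self.data.clear()

    def __len__(self):
        return len(self.data)


def collect(policy, buffer, n_steps, rng):
    """Toy collector: roll a policy through a 1-D random-walk environment."""
    obs = 0
    for _ in range(n_steps):
        act = policy(obs, rng)
        obs_next = obs + act
        rew = -abs(obs_next)  # assumed reward: stay close to the origin
        buffer.add((obs, act, rew, obs_next))
        obs = obs_next


def train(paradigm, policy, update, buffer, epochs, steps_per_epoch,
          batch_size, rng):
    """One training loop; the paradigm only changes the data handling."""
    n_updates = 0
    for _ in range(epochs):
        if paradigm != "offline":
            # Online paradigms gather fresh experience every epoch.
            collect(policy, buffer, steps_per_epoch, rng)
        if paradigm == "onpolicy":
            # On-policy: use everything just collected, then discard it.
            update(list(buffer.data))
            buffer.clear()
        else:
            # Off-policy and offline: sample a minibatch of stored data.
            update(buffer.sample(batch_size, rng))
        n_updates += 1
    return n_updates
```

In this sketch, calling `train("offline", ...)` requires a buffer that was filled beforehand, since no collection happens; `update` is any callable that accepts a list of transitions, standing in for a policy's learning step.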
