Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part I

Yi Tian; Kaiqing Zhang; Russ Tedrake; Suvrit Sra

TL;DR

We establish finite-sample guarantees for finding a near-optimal state representation function and a near-optimal controller, using the directly learned latent model, in finite-horizon time-varying LQG control problems.

Abstract

We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a cost-driven approach, where a dynamic model in some latent state space is learned by predicting the costs without predicting the observations or actions. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees for finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model, for finite-horizon time-varying LQG control problems. To the best of our knowledge, despite various empirical successes, finite-sample guarantees for such a cost-driven approach have remained elusive. Our result underscores the value of predicting multi-step costs, an idea that is key to our theory and is also known to be empirically valuable for learning state representations. Part II of this work, which is to appear, addresses the infinite-horizon linear time-invariant setting; it also extends the results to an approach that implicitly learns the latent dynamics, inspired by the recent empirical breakthrough of MuZero in model-based reinforcement learning.

Paper Structure (17 sections, 19 theorems, 248 equations, 3 algorithms)

Introduction
Problem setup
Latent model of finite-horizon time-varying LQG
Methodology: Cost-driven state representation learning
Learning the state representation function
Theoretical guarantees and proofs
Proposition on multi-step cumulative costs
Quadratic regression bound
Matrix factorization bound
Perturbed linear regression bound
Proof of Claim \ref{clm:a-norm}
Certainty equivalent linear quadratic control
Proof of Claim \ref{clm:coverage}
Proof of Claim \ref{clm:xi-exi}
Proof of Claim \ref{clm:exi-0}
...and 2 more sections

Key Result

Proposition 1

Let $(z^{\ast}_t)_{t=0}^{T}$ be state estimates given by the Kalman filter. Then, $z^{\ast}_{t+1} = A^{\ast}_t z^{\ast}_t + B^{\ast}_t u_t + L^{\ast}_{t+1} i_{t+1}$, where $L^{\ast}_{t+1} i_{t+1}$ is independent of $z^{\ast}_t$ and $u_t$, i.e., the state estimates follow the same linear dynamics as the underlying state, with noises $L^{\ast}_{t+1} i_{t+1}$. The cost at step $t$ can then be reformulated as a function of the state estimates by ... where $b_t > 0$ is a problem-dependent ...
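The statement that the Kalman state estimates evolve under the system's own linear dynamics, driven by a noise term $L^{\ast}_{t+1} i_{t+1}$ that is independent of $z^{\ast}_t$ and $u_t$, can be checked numerically. The sketch below is illustrative only and is not taken from the paper. It assumes time-invariant matrices `A`, `B`, `C` and noise covariances `W`, `V` (the paper's setting is time-varying), a zero-mean Gaussian initial state with covariance `Sigma0`, and standard normal exploratory actions; the function names are hypothetical.

```python
import numpy as np


def kalman_gains(A, C, W, V, Sigma0, T):
    """Kalman gains L_0..L_T for x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + v_t."""
    n = A.shape[0]
    gains = []
    P_pred = Sigma0  # covariance of x_0 before observing y_0
    for _ in range(T + 1):
        S = C @ P_pred @ C.T + V              # innovation covariance
        L = P_pred @ C.T @ np.linalg.inv(S)   # Kalman gain
        gains.append(L)
        P_filt = (np.eye(n) - L @ C) @ P_pred
        P_pred = A @ P_filt @ A.T + W         # one-step-ahead covariance
    return gains


def simulate(A, B, C, W, V, Sigma0, T, n_traj, rng):
    """Run n_traj independent trajectories; rows index trajectories.

    Returns the state estimates z_0..z_T, the actions u_0..u_{T-1}, and the
    noise terms L_{t+1} i_{t+1} for t = 0..T-1.
    """
    n, m = B.shape
    p = C.shape[0]
    gains = kalman_gains(A, C, W, V, Sigma0, T)
    x = rng.multivariate_normal(np.zeros(n), Sigma0, size=n_traj)
    y = x @ C.T + rng.multivariate_normal(np.zeros(p), V, size=n_traj)
    z = y @ gains[0].T  # z_0 = L_0 y_0, since the prior mean is zero
    zs, us, noises = [z], [], []
    for t in range(T):
        u = rng.standard_normal((n_traj, m))  # assumed exploratory actions
        w = rng.multivariate_normal(np.zeros(n), W, size=n_traj)
        v = rng.multivariate_normal(np.zeros(p), V, size=n_traj)
        x = x @ A.T + u @ B.T + w
        y = x @ C.T + v
        z_pred = z @ A.T + u @ B.T            # same linear dynamics as the state
        noise = (y - z_pred @ C.T) @ gains[t + 1].T  # L_{t+1} i_{t+1}
        z = z_pred + noise
        zs.append(z)
        us.append(u)
        noises.append(noise)
    return zs, us, noises


def max_cross_moment(zs, us, noises):
    """Largest empirical cross-moment of L_{t+1} i_{t+1} with z_t or u_t."""
    worst = 0.0
    for z, u, noise in zip(zs, us, noises):
        n_traj = z.shape[0]
        worst = max(worst,
                    np.abs(noise.T @ z).max() / n_traj,
                    np.abs(noise.T @ u).max() / n_traj)
    return worst
```

With a correctly computed gain sequence, `max_cross_moment` should shrink toward zero as `n_traj` grows, which is the empirical counterpart of the independence claim in the jointly Gaussian case. Replacing the gains with arbitrary matrices generally breaks this property, which is one way to see that the independence depends on the filter being optimal.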

Theorems & Definitions (40)

Proposition 1
proof
Proposition 2
proof
Theorem 1
Proposition 3
proof
Lemma 1
proof
Lemma 2: Quadratic regression
...and 30 more