Learning to Navigate in Cities Without a Map

Piotr Mirowski; Matthew Koichi Grimes; Mateusz Malinowski; Karl Moritz Hermann; Keith Anderson; Denis Teplyashin; Karen Simonyan; Koray Kavukcuoglu; Andrew Zisserman; Raia Hadsell

TL;DR

The paper tackles city-scale visual navigation without maps by introducing StreetLearn, a Street View–derived RL environment. It presents a dual-pathway, goal-conditioned architecture (locale-specific LSTMs plus a general policy) and landmark-based goal representations to enable transfer across cities. Through curriculum learning and transfer experiments, the authors show robust navigation in multiple cities and demonstrate that pre-training on several regions improves adaptation to new ones. The work provides a realistic benchmark and a scalable, modular framework for end-to-end navigation in real-world environments, with resources released for wider use.

Abstract

Navigating through unstructured environments is a basic capability of intelligent creatures, and thus is of fundamental interest in the study and development of artificial intelligence. Long-range navigation is a complex cognitive task that relies on developing an internal representation of space, grounded by recognisable landmarks and robust visual processing, that can simultaneously support continuous self-localisation ("I am here") and a representation of the goal ("I am going there"). Building upon recent research that applies deep reinforcement learning to maze navigation problems, we present an end-to-end deep reinforcement learning approach that can be applied on a city scale. Recognising that successful navigation relies on integration of general policies with locale-specific knowledge, we propose a dual pathway architecture that allows locale-specific features to be encapsulated, while still enabling transfer to multiple cities. We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away. The project webpage http://streetlearn.cc contains a video summarising our research and showing the trained agent in diverse city environments and on the transfer task, the form to request the StreetLearn dataset and links to further resources. The StreetLearn environment code is available at https://github.com/deepmind/streetlearn
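
To make the idea of a landmark-based goal representation concrete, here is a minimal sketch of one way a goal location could be encoded relative to a fixed set of landmarks: a softmax over negative scaled distances, so that nearby landmarks receive most of the weight. This is an illustration under stated assumptions, not the authors' implementation. The function names `haversine_m` and `goal_code`, the scale `alpha`, and the example coordinates are all hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    # Clamp to guard against floating-point values marginally above 1.
    return 2 * r * math.asin(math.sqrt(min(1.0, a)))


def goal_code(goal, landmarks, alpha=0.002):
    """Encode a goal as a softmax over negative scaled distances to landmarks.

    `alpha` (per metre) is an assumed, illustrative sharpness parameter.
    Returns one non-negative weight per landmark; the weights sum to 1.
    """
    dists = [haversine_m(goal[0], goal[1], lat, lon) for lat, lon in landmarks]
    logits = [-alpha * d for d in dists]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical landmark coordinates (lat, lon), for illustration only.
landmarks = [(40.7580, -73.9855), (40.7484, -73.9857), (40.7061, -73.9969)]
code = goal_code(landmarks[0], landmarks)
```

Because the code depends only on distances to landmarks rather than on absolute coordinates, an encoding of this kind can in principle be reused in a new city by swapping in that city's landmark set.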
