How did you learn to navigate the neighborhood of your childhood, to go to a friend’s house, to your school or to the grocery store? Probably without a map and simply by remembering the visual appearance of streets and turns along the way. As you gradually explored your neighborhood, you grew more confident, got your bearings and learned new and increasingly complex paths. You may have gotten briefly lost, but found your way again thanks to landmarks, or perhaps even by looking to the sun for an impromptu compass.

Navigation is an important cognitive task that enables humans and animals to traverse long distances in a complex world without maps. Such long-range navigation can simultaneously support self-localisation (“I am here”) and a representation of the goal (“I am going there”).

In “Learning to Navigate in Cities Without a Map”, we present an interactive navigation environment that uses first-person perspective photographs from Google Street View, approved for use by the StreetLearn project and academic research, and gamify that environment to train an AI agent. As is standard with Street View images, faces and license plates have been blurred and are unrecognisable. We build a neural network-based artificial agent that learns to navigate multiple cities using visual information (pixels from a Street View image). Note that this research is about navigation in general rather than driving; we did not use traffic information or try to model vehicle control.