Participants

Twenty-four right-handed, healthy participants (11 female, mean age: 26.25 years, s.d.: 3.52, range: 20–35 years) with no history of neurological disease and normal or corrected-to-normal vision took part in this experiment. Participants gave written informed consent, and the study was approved by the University College London (UCL) ethics committee.

Eligibility for the experiment was assessed using two screening criteria: existing knowledge of the Soho testing environment and navigational ability. Only participants who reported minimal or no experience with the environment were invited to take part in the study. Participants were required to score above 3.6 on the Santa Barbara Sense of Direction Scale (one s.d. below the mean score provided by Hegarty et al.41). The mean Santa Barbara Sense of Direction score across participants was 4.89 (s.d. 0.69).

Stimuli and apparatus

An area of London (Soho) including 26 streets was selected as the testing area. This specific area was selected because of its high density of streets and large number of distinct locations such as pubs and shops. Twenty-three goal locations were specified. For details of the area, map and goal locations refer to Howard et al.18. Ten testing and three training routes were defined and filmed (using an HD Sony Z1 and a B Hague camera stabilizer). Videos were edited to create two sets of movie stimuli (Navigation and Control) used in the experiment. During the Navigation videos, onscreen instructions asked participants to actively think about routes, while during the Control videos, onscreen instructions asked participants to press corresponding keys.

Stimuli were presented using MATLAB (v7.5, MathWorks) and the Cogent2000 toolbox (v1.28, www.vislab.ucl.ac.uk/cogent_2000.php). Responses were recorded using a response box positioned under the participant’s right hand.

Procedure

The experiment consisted of three stages: study of a training pack, a training session and a testing session. One week before the training session, participants were given a training pack to familiarize themselves with the layout of the test area, each street’s name, and the names and positions of the goal locations.

The training session took place 1 day before the testing session. During this session, participants were taken on a 2-h tour of the test area in Soho. During this tour their spatial knowledge was rigorously tested and feedback was given to maximize participants’ knowledge of the area. The training route was designed so that (1) it was different from all the filmed routes, (2) each start location (the locations at the beginning of the video footage) was visited once and (3) each goal location was passed at least twice and from different directions. When each of these locations was reached the experimenter showed participants the coloured photograph of the start or goal as well as their current position on a map. These coloured photographs were used in the video footage to indicate different locations. Immediately after the tour, participants were tested to assess their knowledge. Immediate feedback was provided to guide participants towards any aspects they should ‘revise’ on the final evening before scanning (for further details refer to Howard et al.18).

The testing session began with brief training to ensure that participants understood the task requirements. A total of three training routes were viewed (two Navigation and one Control). Each video began with 12 s of fixation cross, followed by 5 s of a cue word (‘NAVIGATION’ or ‘CONTROL’) indicating the type of the following route. The words ‘NAV’ or ‘CON’ were presented at the top of the screen throughout the presentation of the video. The video continued with presentation of a start image with a temporal jitter of 5–13 s. This start image indicated the current location and heading direction. The paths taken in the routes did not match the paths walked during training. Thus, to solve the navigation task participants could not simply recall a previously walked route or sequence of actions, but rather had to construct novel sequences through the space.

Two types of event, during which the video was paused, were included in the videos: New Goal Events (NGE) and Decision Points (DP), lasting 9 and 5 s, respectively. A colour photograph of the new goal was presented during NGE. This presentation began with 4 s of text describing the goal’s location. This was followed by 5 s in which participants were asked, in the Navigation condition, to indicate the location of the goal (‘goal L/R?’) with regard to the current heading direction and, in the Control condition, to indicate whether one can buy a drink at that location. DP occurred a few seconds before each junction. In the Navigation condition participants were presented with the option to turn at the junction ahead or go straight (for example, turn L/R?), while in the Control condition they were asked to press the button corresponding to the optimal path (for example, press left button). The time between DP and the onset of the following turn or junction crossing (Street Entry Events) was temporally jittered to last between 3 and 9 s to allow separate measures of the BOLD signal at these two events. After each turn, at the beginning of each new street section, text appeared onscreen for 3 s describing the current location and general cardinal heading direction (for example, Broadwick St facing east). For some of the Street Entry Events (46.15 or 51.85% depending on the combination of routes), the route was suboptimal for reaching the current goal and participants were thus forced to take a detour to the goal (Detours). The mean duration of the routes was 266.60 s (s.d.=43.63, range=198–325). Routes were presented at walking speed (mean=1.6 m s⁻¹, s.d.=0.41). The ten routes and the task (Navigation/Control) were counterbalanced across participants. For further details refer to ref. 18.

Immediately after the scan, participants took part in a debriefing session outside the scanner in a testing room. Participants were not warned in advance that this would occur. In this debriefing session participants re-watched the five Navigation routes they had experienced on a laptop (12-inch screen) in a similar manner to ref. 40. At each of the events (NGE, DP, Street Entry) the film was paused and participants were asked to describe whether they remembered planning or thinking about their future route.

Graph theory analysis

A set of formal analytic measures of the environmental layout, based on graph-theoretic measures used in the field of space syntax, was used. These measures examine different properties of centrality in the street network. Space syntax methods relate human behaviour to the layout of the environment15,16. These methods provide a formal way of analysing the spatial properties of an environment, and can be applied to both indoor and outdoor spaces. A number of different methods fall under the term ‘space syntax’, and can be applied at different scales (for example, local/global). For space syntax analyses relating to the street network, the street network is represented as a graph. Graph-theoretic approaches have been adopted by a number of built environment disciplines as a way of analysing the relationship between spaces42. There are two ways of translating information in the built environment into a graph, resulting in primal or dual graphs. The appropriate type of graph must be matched with the type of analysis. Primal graphs are concerned with information at street intersections: street junctions are the nodes in the graph, and streets are the links between the nodes. This results in a graph that closely matches the geographic urban layout. Dual graphs focus on the streets themselves (as opposed to street junctions). This type of graph is relevant for street network analysis: street segments are the nodes in the graph, and the connections between street segments are the links between the nodes. Dual graphs highlight the topological properties of the network and tend not to resemble the map of the physical location. Space syntax analysis is based on a dual graph representation of the street network, also known as a dual network. A number of different graph-theoretic measures can be applied to such a graph to examine properties of centrality.
Typically, three graph-theoretic measures of centrality are used: degree centrality, closeness centrality and betweenness centrality. Figure 1 provides an illustration of how these three measures capture different properties of an example street network. In what follows, ‘segments’ refers to the units of street sections that form the dual graph.

Degree centrality measures the total number of edges connected to any node. Applied to the urban network, degree centrality is the number of connecting street segments to any street segment.
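The dual-graph representation and degree centrality described above can be sketched in a few lines of Python. This is a minimal illustration and not the space syntax software used in the study. The junction labels, segment labels and function names are hypothetical, and the toy network does not correspond to the Soho street layout. Two segments are assumed to be linked whenever they meet at the same junction.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy network: each street segment joins two junctions
# (primal view: junctions are nodes, segments are links).
segments = {
    "s1": ("A", "B"),
    "s2": ("B", "C"),
    "s3": ("B", "D"),
    "s4": ("D", "E"),
}

def dual_graph(segments):
    """Build the dual graph: segments become nodes, shared junctions become links."""
    at_junction = defaultdict(set)
    for seg, (j1, j2) in segments.items():
        at_junction[j1].add(seg)
        at_junction[j2].add(seg)
    adjacency = {seg: set() for seg in segments}
    for segs in at_junction.values():
        # Every pair of segments meeting at a junction is connected.
        for a, b in combinations(sorted(segs), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def degree_centrality(adjacency):
    """Number of street segments directly connected to each segment."""
    return {seg: len(neighbours) for seg, neighbours in adjacency.items()}

dual = dual_graph(segments)
degree = degree_centrality(dual)
```

In this toy example segment `s3` meets `s1` and `s2` at junction B and `s4` at junction D, so it has the highest degree centrality.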

Closeness centrality is defined in ref. 17 as:

$$C_C(p_i) = \left[ \sum_{k} d_{ik} \right]^{-1}$$

where $d_{ik}$ is the length of a geodesic (shortest path) between nodes $p_i$ and $p_k$. Applied to an urban grid, closeness centrality is the reciprocal of the sum of the topological distances from that segment to all other segments. It reflects how likely it is that a segment is an origin or destination segment.
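As a hedged sketch of the closeness definition above, the following Python computes the reciprocal of the summed topological distances on a small dual graph. The graph and function names are hypothetical, and the code assumes a connected network. It is not the software used in the study.

```python
from collections import deque

# Hypothetical dual graph: street segments as nodes, shared junctions as links.
dual = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s3"},
    "s3": {"s1", "s2", "s4"},
    "s4": {"s3"},
}

def topological_distances(adjacency, source):
    """Breadth-first search: geodesic length from source to every reachable segment."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness_centrality(adjacency):
    """Reciprocal of the summed distance to all other segments (connected graph assumed)."""
    result = {}
    for node in adjacency:
        total = sum(topological_distances(adjacency, node).values())
        result[node] = 1.0 / total
    return result

closeness = closeness_centrality(dual)
```

Here `s3` is one step from every other segment, so its summed distance is smallest and its closeness is highest.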

Betweenness centrality is the number of shortest paths from all segments to all other segments that pass through that segment. It is based on the measure defined in ref. 43:

$$C_B(p_i) = \sum_{j<k} \frac{g_{jk}(p_i)}{g_{jk}}$$

where $g_{jk}(p_i)$ is the number of geodesics between nodes $p_j$ and $p_k$ that contain node $p_i$, and $g_{jk}$ is the number of all geodesics between $p_j$ and $p_k$. It reflects the likelihood that a segment is an intervening space in between an origin and a destination.
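The betweenness definition above can be illustrated with a short Python sketch. It assumes the standard pairwise formulation, in which each pair of other segments contributes the fraction of its geodesics that pass through the segment in question. The toy dual graph and function names are hypothetical, and this brute-force approach is only suitable for small networks.

```python
from collections import deque
from itertools import combinations

# Hypothetical dual graph: street segments as nodes, shared junctions as links.
dual = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s3"},
    "s3": {"s1", "s2", "s4"},
    "s4": {"s3"},
}

def bfs_counts(adjacency, source):
    """Return geodesic lengths and the number of geodesics from source to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                sigma[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                # Every geodesic reaching node extends to neighbour.
                sigma[neighbour] += sigma[node]
    return dist, sigma

def betweenness_centrality(adjacency):
    """Sum, over pairs of other segments, of the share of geodesics through each segment."""
    info = {node: bfs_counts(adjacency, node) for node in adjacency}
    result = {node: 0.0 for node in adjacency}
    for j, k in combinations(sorted(adjacency), 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are not connected
        for i in adjacency:
            if i == j or i == k:
                continue
            dist_i, sigma_i = info[i]
            # i lies on a j-k geodesic only if the two legs add up to the geodesic length.
            if i in dist_j and k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                result[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return result

betweenness = betweenness_centrality(dual)
```

In this toy network only `s3` lies between other segments, on the geodesics from `s1` to `s4` and from `s2` to `s4`.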

Space syntax analyses, based on graph-theoretic measures of the street network, have linked pedestrian movement to the topological properties of the street network. This has been shown both for aggregate pedestrian movement44,45 and for the navigational decisions made by individuals19. The methodology is robust when compared to observed pedestrian flows across locations, scales, cities and cultures16. The approach is based solely on an analysis of the topological properties of the street network; no other information is included in the model. It has been suggested that part of the reason why space syntax analyses are so successful is that these types of analyses pick up on elements that are naturally processed during cognition46. It would seem that people intuit how connected a particular street is within the street network as a whole46,47.

On the basis of past research we considered that properties of the environment14,15,16,17,20 or the distance to the goal2 might correlate with our centrality measures or the change in centrality. Such factors might in themselves drive hippocampal activity at Street Entry Events. Thus, we measured a number of properties of the streets and the distance to the goal, and examined them in relation to our fMRI analysis. These measures are outlined below.

The following measures are recorded directly from the first-person videos. They reflect what can actually be seen from a certain point in the videos, as opposed to what could theoretically be the case. Obstacles and obstructions present in the videos are taken into account, so that the parameters reflect the information available to participants at a given point in the video.

Number of visible connecting streets. This is the actual number of visible path options from a given location. In contrast to the degree centrality measure, which records the number of connecting streets irrespective of whether they can be seen or not, this measure records what can actually be seen. It is similar to the visible connectivity measure used in Emo48.

Number of visible junctions. This is the number of junctions visible from a given location. In contrast to the number of visible connecting streets, this measure records the number of junctions in sight, regardless of junction type or of how many streets meet at each junction.

Line of sight. This is the longest line of sight measured in real-world meters from a given location. The line of sight, measured at eye height, is translated into a line on an Ordnance Survey map of Soho. It is irrespective of the choice of route (if available). Many studies in the spatial cognition literature suggest that depth of view is critical for navigation34,49,50.

Street width. This is the actual street width of the given location, measured in real-world meters. The location is translated onto the Ordnance Survey map of Soho.

Presence of shops/people/vehicles. This records the presence or absence of shops/people/vehicles from a given location. Shops, people and vehicles are cues that convey how busy a street is. They are attractors, in that a busy street is likely to have more of each. Research suggests that these elements are related to centrality measures of streets44,51,52, and that people detect such cues during navigation19,53.

Step depth to Goal. This is the optimal number of street segments required to reach the goal, starting from the current street and irrespective of the route taken in the video. For example, a destination on an adjacent street has a step depth of 1, as it is one street away. A topological step is counted at each junction, so that a destination lying exactly on the other side of a junction, but straight ahead, still has a step depth of 1.

Step depth to Boundary. Similar to the ‘step depth to goal’ parameter, this is the optimal number of street segments required to reach the boundary of the study, starting from the current street and irrespective of the route taken in the video.
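The two step depth measures above amount to a shortest-path count on the dual graph. The Python below is a minimal sketch under that assumption. The toy graph, the choice of boundary segments and the function name are hypothetical. An adjacent street returns a depth of 1, matching the example given for step depth to goal.

```python
from collections import deque

# Hypothetical dual graph: street segments as nodes, shared junctions as links.
dual = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s3"},
    "s3": {"s1", "s2", "s4"},
    "s4": {"s3"},
}

def step_depth(adjacency, start, targets):
    """Fewest topological steps from start to the nearest target segment, or None."""
    targets = set(targets)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in targets:
            return depth[node]  # breadth-first order guarantees this is minimal
        for neighbour in adjacency[node]:
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return None

# Step depth to a goal on segment s4, starting from s1.
goal_depth = step_depth(dual, "s1", {"s4"})

# Step depth to the boundary, assuming s1 and s4 are the boundary segments.
boundary_depth = step_depth(dual, "s2", {"s1", "s4"})
```

Passing a set of targets lets the same function serve both measures: a single goal segment for step depth to goal, and all boundary segments for step depth to boundary.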

For analyses examining the relation between these parameters see Supplementary Tables 5–8.

fMRI acquisition and analysis

Participants were scanned at the Birkbeck-UCL Centre for Neuroimaging (BUCNI) using a 1.5 Tesla Siemens Avanto MRI scanner (Siemens Medical Systems, Erlangen, Germany), with a 32-channel head coil. Functional scans were acquired using a gradient-echo echoplanar imaging sequence (repetition time (TR)=2.897 s, echo time (TE)=50 ms, flip angle=90°, field of view (FoV)=192 mm²). In each volume 34 oblique axial slices, approximately perpendicular to the hippocampus (64 × 64 × 34 matrix size) and 3 mm thick, were acquired (3 × 3 × 3 mm voxel size). Following this a high-resolution T1 structural scan was acquired (MPRAGE, 176 slices, 1 × 1 × 1 mm resolution). The first six functional volumes of each session (dummy scans) were discarded to permit T1 equilibrium. Statistical parametric mapping (SPM12, Wellcome Trust Centre for Neuroimaging, London, UK) was used for spatial preprocessing and subsequent analyses. Images were spatially realigned to the first volume of the first session to correct for motion artefacts, co-registered with the structural scan, normalized to a standard EPI template in Montreal Neurological Institute space and spatially smoothed with an isotropic 8 mm full-width at half-maximum Gaussian kernel filter. After preprocessing, the smoothed, normalized functional imaging data were entered into a voxel-wise subject-specific GLM (that is, the first-level design matrix). The regressors of interest and six subject-specific movement parameters (included as regressors of no interest) derived from the realignment phase of preprocessing were included in all the models. The effects of interest are shown in Table 1. The periods of fixation between blocks were not modelled and were treated as the implicit baseline. Each of the regressors of interest was then convolved with the canonical haemodynamic response function, and a high pass filter with a cutoff of 128 s was used to remove low-frequency drifts. Temporal autocorrelation was modelled using an AR(1) process.
For effects with duration zero we took the standard SPM12 approach of modelling each event as a stick function, which is then convolved with the haemodynamic response function54.
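To illustrate how a zero-duration event becomes a regressor, the Python below places a unit stick at each event onset and convolves it with a double-gamma response. It is a sketch only and does not reproduce SPM12's implementation. The gamma shape parameters (6 and 16), the undershoot ratio (1/6), the 1 s time grid and the onsets are all assumptions chosen for illustration.

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, taken as zero for t <= 0."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)

def double_gamma_hrf(t):
    """Illustrative response: a positive lobe minus a smaller, later undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

def convolve_sticks(onsets, n_samples, dt=1.0, hrf_length=32.0):
    """Place a unit stick at each onset (in seconds) and convolve with the response."""
    sticks = [0.0] * n_samples
    for onset in onsets:
        index = int(round(onset / dt))
        if 0 <= index < n_samples:
            sticks[index] += 1.0
    kernel = [double_gamma_hrf(i * dt) for i in range(int(hrf_length / dt) + 1)]
    regressor = [0.0] * n_samples
    for i, amplitude in enumerate(sticks):
        if amplitude == 0.0:
            continue
        for j, weight in enumerate(kernel):
            if i + j < n_samples:
                regressor[i + j] += amplitude * weight
    return regressor

# Hypothetical events at 10 s and 40 s on an 80 s time grid.
regressor = convolve_sticks(onsets=[10.0, 40.0], n_samples=80)
```

The resulting regressor is zero before the first event and rises to a peak a few seconds after each onset, which is the shape the GLM fits to the BOLD signal.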

Table 1 Events/epochs of interest and their duration.

At the first level, linear-weighted contrasts were used to identify effects of interest, providing contrast images for group effects analysed at the second (random-effects) level. In a series of GLM analyses we probed the fMRI data with the parameters of interest and covariates (Table 2). Initially we examined degree centrality because it is the simplest topological measure and has been highlighted as important for path planning37. We examined the parametric modulation of Street Entry Events by degree centrality and by the categorical change in this parameter (Δdegree centrality) (models 1 and 2). We examined the categorical change (1, 0 or −1) in degree centrality because the range of variation in the change was highly limited. To establish whether the observed response was unique to degree centrality, we probed the fMRI data with our measures of closeness centrality and betweenness centrality (models 3 and 4). To further separate the effects of these three parameters, we examined a model that included the categorical change in all of these parameters (model 7). Finally, to determine the specificity of the response, in separate models we examined [Δdegree centrality] together with one of the following covariate parameters of no interest: [ΔVisible Junction], [ΔVisible Connecting Street], [Δpath distance], [ΔEuclidean distance to goal], [Δstep depth to goal], [Δstep depth to boundary], [Δline of sight], [Δstreet width], [Δstreet length], [Δpresence of visible people], [Δpresence of visible vehicles] and [Δpresence of visible shops]. We also tested whether a single model including all of these parameters could be constructed; this was not possible, as the model could not be estimated in SPM.
To investigate the specificity of the correlation between [Δdegree centrality] and activity in the right posterior hippocampus at Street Entry Events, we conducted an analysis in which this parameter was also modelled at Travel Period Events and Decision Points. For Decision Points we examined both the change in centrality from the prior segment to the current segment in which the Decision Point was located, and the change that would occur after the outcome of the Decision Point (future segment minus current segment).
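The categorical change used as a parametric modulator can be sketched as the sign of the difference between successive values. The Python below is a minimal illustration. The route values and the function name are hypothetical and are not taken from the study's data.

```python
def categorical_change(values):
    """Sign (+1, 0 or -1) of the change between consecutive values along a route."""
    changes = []
    for previous, current in zip(values, values[1:]):
        diff = current - previous
        # (diff > 0) - (diff < 0) evaluates to 1, 0 or -1.
        changes.append((diff > 0) - (diff < 0))
    return changes

# Hypothetical degree centrality of the segments entered along one route.
route_degree = [3, 5, 5, 2, 4]
delta_degree = categorical_change(route_degree)
```

Each entry of `delta_degree` corresponds to one Street Entry Event: the first segment of a route has no preceding segment, so a route of five segments yields four changes.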

Table 2 General linear models reported in this article.

Parametric regressors were not serially orthogonalized, thus allowing each regressor to account independently for the response at each voxel55. Each GLM explored the first-order parametric modulation of the events of that type, for both Navigation and Control routes. All models contained all the key events (see Table 1), plus Navigation and Control task blocks.

We focused our analysis on the right hemisphere because the right medial temporal lobe has been more consistently associated with spatial memory in humans (see, for example, refs 25, 27, 28, 56, 57, 58). Thus, we created a ROI in the right hippocampus using the SPM Anatomy toolbox (Forschungszentrum Jülich GmbH). Statistical analyses of the mean responses in the ROI were conducted in SPSS. For SPM analysis we used the ROI for a small volume correction applying a threshold of P<0.05 FWE. For follow-up analyses that involved a reduced number of events, such as when examining the subset of Street Entry Events that were Detours, we used a threshold of P<0.01 uncorrected within the ROI. For completeness, we report all brain regions at a threshold of P<0.001 uncorrected (or P<0.005 for medial temporal lobe regions) and a minimum of five contiguous voxels for the planned contrasts, as we have done in prior work18,59.

To further characterize the response post hoc we sectioned the hippocampus into anterior and posterior ROIs. We used the MarsBaR SPM toolbox (v0.43, marsbar.sourceforge.net) to extract mean BOLD responses in the posterior and anterior hippocampal sections60.

To analyse a measure of how search might occur, as opposed to just detecting the future possibilities, we calculated the demands of a breadth-first search (BFS), a graph-theoretic method for searching a graph13. We calculated two levels of search demand: (1) the sum of the degree centrality measures of all street segments connecting to the next immediate junction (see Fig. 5a) and (2) the combined sum of the degree centrality measures of all street segments connecting to the next immediate junction and the degree centrality measures of all street segments connecting to the subsequent junctions on the optimal path to the goal. BFS assumes calculation based on degree centrality; however, we considered whether the search demands might change if calculated with closeness or betweenness centrality. We found that BFS demand measures using degree centrality, closeness centrality or betweenness centrality were highly correlated (r>0.8), and that the closeness and betweenness versions gave SPM results nearly identical to those from BFS using degree centrality. To test whether lateral PFC was involved in this search we created a lateral frontal ROI that encompassed the regions predicted in our recent review10 using the bilateral inferior and mid-lateral frontal ROIs from the WFU_PickAtlas61.
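The two levels of search demand described above can be sketched as sums of degree centrality over the segments meeting at each junction. The Python below is one possible reading of that description, not the authors' implementation. The degree values, junction labels and function names are hypothetical. A segment shared by two junctions on the path is counted once per junction, which is itself an assumption.

```python
# Hypothetical degree centrality for each street segment.
degree = {"s1": 2, "s2": 2, "s3": 3, "s4": 1, "s5": 2}

# Hypothetical list of the segments connecting to each junction.
segments_at = {
    "J1": ["s1", "s2", "s3"],
    "J2": ["s3", "s4"],
    "J3": ["s4", "s5"],
}

def junction_demand(junction, degree, segments_at):
    """Sum of degree centrality over all segments connecting to one junction."""
    return sum(degree[seg] for seg in segments_at[junction])

def search_demand(next_junction, later_junctions, degree, segments_at):
    """Return (level 1, level 2) demand for the next junction and the path beyond it."""
    level_one = junction_demand(next_junction, degree, segments_at)
    level_two = level_one + sum(
        junction_demand(j, degree, segments_at) for j in later_junctions
    )
    return level_one, level_two

# J1 is the next junction; J2 and J3 follow on the assumed optimal path to the goal.
level_one, level_two = search_demand("J1", ["J2", "J3"], degree, segments_at)
```

Swapping the `degree` dictionary for closeness or betweenness values would give the alternative demand measures that the text reports as highly correlated.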

Behavioural study

Because the hippocampal response to degree centrality was absent in Control routes, we reasoned that the hippocampal response observed in Navigation routes might be related to retrieving information about the environment in order to aid navigation of future paths. If this is true then people who have knowledge of the streets should be able to determine whether degree centrality increases or decreases at Street Entry Events, and likewise those with no prior knowledge should be unable to determine whether it has increased or decreased. To test this, two experts with extensive knowledge of the environment from the training protocol and a group of 11 naive participants (six male participants, age range 20–28 years) who reported minimal or no prior experience were tested on their ability to judge whether degree centrality increased, decreased or did not change at each Street Entry Event in the fMRI study. Participants viewed the 10 routes tested in our fMRI task, and also two of the training routes to train them on our behavioural task. At each Street Entry Event participants were asked to press one of three buttons to make the judgement. Participants were told that ‘a street segment is the part of a street between any junctions; for example, Oxford street in London is one long street, but is made up of many segments’. The two training routes were used to familiarize them with the idea and confirm that they understood the task. Participants’ responses were recorded and marked against the correct topological values to create a performance value for each participant. We reasoned that, because participants who were uncertain would be likely to opt for a ‘no change’ response, and because ‘no change’ was more often correct in the street network (59% of events), such participants could perform above chance despite having no knowledge.
Thus, we examined the responses of participants for only those events during which degree centrality increased or decreased. Finally, to determine whether the right posterior hippocampal response was still correlated with the change in degree centrality in this subset of events we examined a GLM in which only events in which degree centrality increased or decreased were included.
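The reason for restricting scoring to events where degree centrality changed can be shown with a small worked example in Python. The responses and correct values below are invented for illustration and are not data from the study. The function name is hypothetical.

```python
def proportion_correct(responses, truth, include_no_change=True):
    """Share of correct judgements, optionally dropping events with no true change."""
    pairs = [
        (response, actual)
        for response, actual in zip(responses, truth)
        if include_no_change or actual != 0
    ]
    if not pairs:
        return None
    return sum(response == actual for response, actual in pairs) / len(pairs)

# Hypothetical correct changes in degree centrality (+1, 0, -1) at six events.
truth = [0, 1, 0, -1, 0, 1]

# Hypothetical participant who answers 'no change' whenever uncertain.
responses = [0, 0, 0, -1, 0, 1]

overall = proportion_correct(responses, truth)
change_only = proportion_correct(responses, truth, include_no_change=False)
```

In this invented case the overall score (5 of 6) is inflated by the frequent 'no change' events, while the change-only score (2 of 3) reflects only the events that require knowledge of the network.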

Data availability

All the material will be available on request from the corresponding author.