A semi-controlled study with a mixed design was conducted in real traffic in the city of Linköping, Sweden. Each participant rode the same route twice, once while listening to music and once without music. The route was divided into three segments, giving a total of six segments over the two laps, and on one segment per lap, the participant was asked to think aloud about his or her distribution of attention. Each participant also received three text messages along the route, which they were instructed to handle as they normally would when receiving text messages while cycling.

Participants

Participants were recruited via an online questionnaire, which was advertised on social media such as Facebook, and through local bike clubs, flyers and e-mails to companies in the area. In total, 533 persons stated that they were willing to participate in the study, and 342 persons fulfilled the inclusion criteria: > 18 years old, experienced with cycling in the city centre of Linköping, willing and able to cycle for 6 km, able to provide their own bike and smartphone, used to using the phone in traffic, and having normal eyesight or eyesight that could be corrected with contact lenses or with extra dioptric lenses within ± 4 dioptres, which was a requirement for using the eye tracking system. Both normal bikes and e-bikes were allowed. Recruitment primarily aimed to achieve an equal number of cyclists self-evaluating their cycling speed as slower than others, equal to others and faster than others. The first cyclists fulfilling the criteria were contacted for participation in the study. When asked about familiarity with cycling in the city centre of Linköping, roughly one-third of the participants (32%) cycled there daily, 27% cycled there 2–4 times a week and 32% at least twice a month, while 10% cycled in the city centre less often. Results regarding cyclists’ speed and delays at specific traffic sites for different self-evaluated speeds are reported by Kircher et al. (2018). The study was approved by the regional ethical review board in Linköping (Dnr 2016/174-31).

Design and procedure

The route chosen for cycling was situated in the city centre of Linköping and was 3 km long. The traffic environment consisted of cycle tracks, mixed traffic as well as pedestrian streets without motorized traffic. Segment 1 (A–B in Fig. 1) consisted of a separated cycle lane ending with a round-about with mixed traffic. The speed limit was 30 km/h turning to 40 km/h at the intersection half-way through the segment. Segment 2 (B–C) was to a large extent comprised of mixed traffic, including a round-about and a stop sign, turning into a cycle path and ending with a road stretch within a pedestrian zone. Except for the round-about, which had a prescribed speed limit of 40 km/h, the speed limit was 30 km/h throughout this segment. Segment 3 (C–A) started with a cycle path that turned into a cycle and pedestrian path, followed by a road stretch with mixed traffic, a short cycle and pedestrian path, and ending with a road stretch with mixed traffic. The speed limit in segment 3 was 30 km/h, but on the first mixed traffic part a lower speed of 20 km/h was recommended by signage.

Fig. 1 Map source: OpenStreetMap contributors (2017). Segment 1 was cycled from A to B, segment 2 from B to C, and segment 3 from C to A. “Candy day” is a well-known concept in Sweden, with the aim that children should only eat candy once a week, on Saturday, for dental health reasons.

Each participant brought his or her bike and telephone to the starting point. While one researcher equipped the handlebars of the participant’s bike with two cameras (GoPro Hero 3, one forward facing and one facing the cyclist), another researcher informed the participant about the study, asked the participant to sign an informed consent form, and fitted and calibrated the eye tracker (SMI 2.0, SensoMotoric Instruments, Teltow, Germany). A third researcher, “the follower”, followed the participant during the experiment with a forward-facing GoPro Hero4 camera on his bike, recording the participant and the road environment. The follower greeted the participant and asked the participant to ignore being followed. The follower’s task was to ride behind the participant at a distance of 10–15 m, filming an overview of the situation, and to notify the participant if he or she took a wrong turn.

Before setting off, the participant was informed whether or not to think aloud on the upcoming segment and whether to listen to music on the first or on the second lap. It was stressed that the participant would be moving in real traffic and should behave as on a typical ride, as the aim of the study was to capture the cyclists’ natural behaviour.

Think-aloud verbal protocols consist of participants articulating their thoughts while performing a task, and they give access to information currently in the participants’ short-term memory (Ericsson and Fox 2011; Ericsson and Simon 1980). On the think-aloud segments, the cyclists were asked to verbalize what they were thinking of, primarily in terms of where they directed their attention, what caught their attention and how they scanned the traffic and the surroundings. Since thinking aloud is known to affect visual behaviour, with longer glances to objects that are currently being described (Hertzum et al. 2009), the participants were only required to think aloud on two out of six segments (balanced between participants).

In the music condition, the participants used headphones connected to their smartphone and could determine volume, tempo, how many earbuds they would use and what they would listen to, e.g. music, radio or podcast. Throughout this article, this will be referred to as listening to music, or as the music condition, even though other self-chosen media are included.

The participant then cycled along the route at his or her preferred speed, followed by the follower. At the end point of each segment, a researcher met the participant and asked whether anything special had occurred during the segment. The participant then filled in a NASA-RTLX for the segment just cycled (a subjective, multidimensional assessment tool that rates perceived workload in terms of mental demand, physical demand, time pressure, performance, effort and frustration; Hart and Staveland 1988). Finally, the researcher explained the route for the next segment and informed the participant whether or not to think aloud on it. When the participant returned to the starting point after two laps, the logging equipment was removed, and the follower asked the participant about two or three occurrences along the route. These were selected freely by the follower, based on what had happened while cycling (for example, cycling on the pavement). The answers to these questions are not investigated here (see instead Kircher et al. 2017b).
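As an illustration of how a per-segment NASA-RTLX score can be derived, the sketch below takes the unweighted mean of the six subscale ratings, which is the usual convention for the raw (unweighted) version of the instrument. The source does not state how the ratings were aggregated, so the function name `raw_tlx`, the subscale keys and the example rating scale are assumptions made for this example.

```python
from statistics import mean

# Subscale keys mirror the six dimensions named in the text (names are hypothetical).
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "time_pressure",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six subscale ratings.

    Assumption: every subscale is rated on the same numeric scale.
    """
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return mean(ratings[name] for name in SUBSCALES)


# Example: one participant's ratings for one segment (made-up values).
segment_ratings = {
    "mental_demand": 10,
    "physical_demand": 20,
    "time_pressure": 30,
    "performance": 40,
    "effort": 50,
    "frustration": 60,
}
print(raw_tlx(segment_ratings))
```

With six segments per participant, the function would be called once per segment to obtain six workload scores.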

Each participant received three text messages while cycling. The text messages were sent by an experiment leader tracking the position of the cyclist via GPS. For practical reasons, the order and content of the messages were the same for all participants (see Fig. 1). The first message was sent on lap 1, segment 2, on a straight road stretch in mixed traffic along parked cars, just before arriving at an intersection with a stop sign. The second message was sent on lap 2, segment 1, on a separated cycle lane, before arriving at a larger intersection. The third message was sent on lap 2, segment 3, on a cycle and pedestrian path where no motorized traffic was allowed. The questions in the messages were chosen so that answering them would not require much effort, either mentally or manually. The participants were not informed about where, when or how many text messages they would receive, only that they would receive text messages and that they were supposed to handle them as they normally would when receiving text messages while cycling.

Data reduction

For each participant, a GPS track, gaze direction, and videos of the forward view, of the cyclist’s face, and of the cyclist from behind (recorded from the follower’s bike) were logged. In addition, workload as measured by the NASA-RTLX was collected for each of the six segments, and answers to questions about incidents/events in traffic were noted. The data were annotated using the Observer XT 13.0 (Noldus Information Technology, Wageningen, the Netherlands), by manually marking gaze directions, complexity level and attentional demands. Each cyclist’s natural speed was calculated for road stretches without intersections, hills or slopes and where there was no other traffic. Because GPS tracking accuracy was poor in the city, this was done by using cues in the traffic environment to mark the start and end of a stretch, and relating the length of the stretch to the total time the cyclist took to cover it.
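The natural-speed calculation described above amounts to dividing a known distance by the time between two landmark passages. A minimal sketch, assuming the stretch length is known in metres and the passage times are read from the video in seconds (the function name and parameters are hypothetical):

```python
def natural_speed_kmh(distance_m, t_start_s, t_stop_s):
    """Average speed in km/h over a stretch delimited by two landmarks.

    distance_m: length of the stretch between the start and stop cue, in metres
    t_start_s:  video time at which the cyclist passes the start cue, in seconds
    t_stop_s:   video time at which the cyclist passes the stop cue, in seconds
    """
    duration_s = t_stop_s - t_start_s
    if distance_m <= 0 or duration_s <= 0:
        raise ValueError("distance and duration must be positive")
    # m/s to km/h: multiply by 3.6
    return distance_m / duration_s * 3.6


# Example with made-up values: 100 m covered in 20 s.
print(natural_speed_kmh(100.0, 10.0, 30.0))
```

Reading the timestamps from video rather than from GPS avoids the position error mentioned in the text, at the cost of manual annotation.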

The eye tracking data were manually encoded as glances towards the forward area, towards the phone and towards other gaze targets. Given the importance of peripheral vision in traffic, a breakdown into smaller target regions did not appear meaningful. Complexity levels, estimating roughly for how long it would be possible to close one’s eyes or to look away from the road without missing important information, were determined based on the videos, according to Table 1. For each instance where the cyclists interacted with their phone when cycling, the complexity level was coded in retrospect from the videos, continuously and subjectively by the authors. Note that the complexity rating scale has not been validated and that the threshold values are based on the authors’ previous experience.

Table 1 Categorization of complexity levels

The attentional requirements along the route were defined according to the minimum required attention (MiRA) theory (Kircher and Ahlstrom 2017). This was done by defining all static objects the participant had to attend to in order to navigate the route safely. Examples include looking left and right in intersections, and looking at stop signs and traffic lights. In addition to such necessary requirements, we also encoded requirements that were useful for the cyclist’s own safety, such as looking over the shoulder before crossing a street even though the cyclist had priority. Figure 2 gives an example of static MiRA requirements in an intersection. For all SMS events where the bicycle was moving, including their matched baselines (identical location but on the other lap), and for three intersections (one lap with and one without music), it was determined whether the MiRA requirements were fulfilled. All three intersections were four-legged intersections where the cyclist’s route continued straight through, but where different rules applied.

Fig. 2 Map source: OpenStreetMap contributors (2017) (color figure online) Example of static MiRA requirements in one intersection located on the experimental route. The intersection consists of two four-lane roads (white) flanked by cycle paths (grey) and pavements. The dark grey structures in the corners are buildings blocking the view from the cyclist’s perspective. The green area above the map corresponds to the road stretch within which the cyclist should attend to the respective requirements.

Intersection 1 The cyclist travelled on a separated cycle path on the approach, passed an intersecting cycle path and then reached the intersection where there was a traffic light for cars and a separate traffic light for bicycles (see also the sketch in Fig. 2).

Intersection 2 The cyclist travelled in mixed traffic and encountered a stop sign on the route, whereas the intersecting road was the main road.

Intersection 3 On approach, the cyclist travelled on a cycle lane in the roadway and had the right of way.

Three categories were used for deciding whether a MiRA requirement had been fulfilled: attended (based on the verbal protocol, eye tracking, or as assumed from behaviour), probably not attended, or impossible to decide. Impossible to decide could, for example, occur in cases where eye tracking was missing. In this study, only requirements related to static objects in the traffic scene were used. Ideally, dynamically changing objects, such as surrounding road users, should also be included amongst the requirements. However, the spatial resolution of the eye tracker, along with the difficulty of accounting for peripheral vision, prevented such analyses in the present study.
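The three-way categorization can be pictured as a small decision rule over the available evidence sources. The coding in the study was done manually, so the sketch below is only one possible formalization: the names `Fulfilment` and `classify_requirement`, the use of `None` for an unavailable source, and the rule that missing eye tracking without other positive evidence leads to "impossible to decide" are all assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class Fulfilment(Enum):
    ATTENDED = "attended"
    PROBABLY_NOT_ATTENDED = "probably not attended"
    IMPOSSIBLE_TO_DECIDE = "impossible to decide"


def classify_requirement(
    verbal: Optional[bool],
    gaze: Optional[bool],
    behaviour: Optional[bool],
) -> Fulfilment:
    """Classify one static requirement from three evidence sources.

    Each argument is True if that source indicates attention, False if the
    source is available but shows no indication, and None if it is missing.
    """
    # Any positive indication counts as attended.
    if any(source is True for source in (verbal, gaze, behaviour)):
        return Fulfilment.ATTENDED
    # Assumption: without eye tracking, absence of attention cannot be judged.
    if gaze is None:
        return Fulfilment.IMPOSSIBLE_TO_DECIDE
    return Fulfilment.PROBABLY_NOT_ATTENDED


# Example: eye tracking missing, no verbal protocol, behaviour inconclusive.
print(classify_requirement(verbal=None, gaze=None, behaviour=False).value)
```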

Analyses

Analyses of variance with participant as a random factor were conducted. The significance level was set to .05.
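As a reading aid, an analysis of variance with participant as a random factor can be written in the following generic form. The symbols and the single within-participant factor are illustrative assumptions; the source does not state the exact model terms used for each outcome.

```latex
y_{ij} = \mu + \alpha_j + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ denotes the outcome for participant $i$ under condition $j$ (for example, with or without music), $\mu$ the grand mean, $\alpha_j$ the fixed effect of the condition, $u_i$ the random effect of participant $i$, and $\varepsilon_{ij}$ the residual.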

For text messages, the data from picking up the phone until putting it back again were analysed. In cases where a text message was read and answered with a delay in between, the data from the intervening period were excluded from the analysis.
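The exclusion rule above can be sketched as filtering time-stamped samples by one or more phone-handling windows. The sketch assumes that a delayed answer produces two separate windows (pick-up to put-back for reading, and again for answering) and that samples are `(timestamp, gaze target)` pairs; these representations and all names are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, str]    # (time in seconds, coded gaze target)
Window = Tuple[float, float]  # (phone picked up, phone put back), in seconds


def within_any_window(t: float, windows: List[Window]) -> bool:
    """True if time t falls inside at least one handling window."""
    return any(start <= t <= end for start, end in windows)


def analysed_samples(samples: List[Sample], windows: List[Window]) -> List[Sample]:
    """Keep only samples recorded while the phone was being handled."""
    return [s for s in samples if within_any_window(s[0], windows)]


def handling_time(windows: List[Window]) -> float:
    """Total analysed duration, excluding any gap between windows."""
    return sum(end - start for start, end in windows)


# Example with made-up values: message read at 1-3 s, answered at 6-8 s.
samples = [(0.0, "forward"), (1.5, "phone"), (4.0, "forward"),
           (6.5, "phone"), (9.0, "other")]
windows = [(1.0, 3.0), (6.0, 8.0)]
print(analysed_samples(samples, windows))
print(handling_time(windows))
```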