Clapping Music is a minimalist work by Steve Reich based on twelve phased variations of a rhythmic pattern. It has been reimagined as a game-based mobile application designed with a dual purpose: first, to introduce new audiences to the Minimalist genre through interaction with the piece presented as an engaging game; second, to use large-scale data collection within the app to address research questions about the factors determining rhythm production performance. The twelve patterns can be differentiated using existing theories of rhythmic complexity. Using performance indicators from the game, such as tap accuracy, we can determine which patterns players found most challenging and so assess hypotheses from theoretical models against empirical evidence. The app has been downloaded over 140,000 times since its launch in July 2015, and over 46 million rows of gameplay data have been collected, requiring a big data approach to analysis. The results shed light on the rhythmic factors contributing to performance difficulty and show that the effect of making a transition from one pattern to the next is as significant, in terms of pattern difficulty, as the inherent complexity of the pattern itself. Challenges that arose in applying this novel approach are discussed.

Funding: This research was supported by The Digital R&D Fund for the Arts ( https://www.nesta.org.uk/archive-pages/steve-reichs-clapping-music/ ), a partnership between Nesta, the Arts Council England and the Arts and Humanities Research Council (AHRC) awarded to MP in partnership with The London Sinfonietta and Touchpress. Further data collection, outreach work in schools and analysis was funded by an award from the Engineering and Physical Sciences Research Council (EPSRC) Platform Grant EP/K009559/1 ( http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K009559/1 ; PI: Mark Sandler; MP is a co-investigator), held at Queen Mary University of London (QMUL). This research utilised QMUL’s MidPlus computational facilities, supported by QMUL Research-IT and funded by EPSRC grant EP/K000128/1 ( http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K000128/1 ).

Introduction

Rhythm can be perceived and reproduced, both in music and in our wider environment. We are adept at recognising regular rhythmic patterns, for example the sound of train carriage wheels passing over track joins, a heartbeat, a tap dripping or a clock ticking. We synchronise with rhythms (a phenomenon known as entrainment) consciously and unconsciously, for example tapping our feet as we listen to music. We respond to the regularities that are present in our auditory environment, inferring hierarchical patterns of regularly accented pulses, known as metre [1]. Some pulses are perceived as more accented than others, and perception of such accents can depend on the structure of the rhythm (the arrangement of the notes and rests) or be inferred from the volume, timing, articulation, intonation and timbre of a note. Accents can be produced expressively, the performers communicating their feel for the metre through the way that they articulate each note. They can also arise in mechanically reproduced rhythms, a simple example being a metronome in which a different sound is used to distinguish the pulse at the beginning of each repeated bar. Some rhythms may be perceived as simple, whilst others are perceived as more complex and are harder to entrain to or reproduce. For example, metrical rhythmic patterns, with integer ratio relationships between pulse intervals and regular perceptual accents, are easier to reproduce than non-metrical patterns [2, 3].

Rhythmic coordination of perception and action has been studied in many contexts, and has particular relevance for musical performance, but research tends to focus on finger tapping in response to auditory stimuli rather than materials that approach the complexity of real music [4, 5]. Laboratory-based experiments are valuable, but also limited by the number and range of participants that can take part. What if, instead of bringing participants to the laboratory, we made the study available to them in their environment? What if we made the test part of a compelling activity, embedded in real music, which encouraged repeated engagement? Whilst we may lose some of the experimental control available in the psychology laboratory, we gain in terms of the number of people who can take part, and the variety of participants in terms of musical training and sophistication, cultural background, geographic location, age and education, which are important sources of bias in much of the behavioural sciences [6].

This approach follows a newly developed methodology in the psychological and cognitive sciences that takes advantage of the ubiquity of mobile computers (smartphones and tablets) and online collection of large amounts of user data. Brown et al. [7] presented four classic experimental paradigms from the psychological literature as short games in a free app and found that the large sample size (20,800 users in one month) vastly outweighed the noise inherent in collecting data outside of a controlled laboratory setting. Griffiths [8] laments the predominant use of large databases of online behaviour for purely behavioural analysis, an example being recommendation of music based on shared listening patterns with other users (e.g., “people who listened to this artist also listened to this other artist”, also known as collaborative filtering [9]). He calls instead for a revolution that uses such databases to evaluate models of human cognition, which requires the use of theoretical cognitive principles as a bridge between:

lab studies that are small in scale and narrow in scope but rigorously controlled; and online studies that are large in scale and broad in scope, but noisy and uncontrolled, which can make it difficult to establish causality.

The present research is motivated to do exactly this, combining several cognitive theories of rhythmic complexity derived from previous lab-based studies, in order to investigate performance accuracy of a real piece of music, as performed by tens of thousands of people worldwide on their own mobile devices.

Specifically, we use a game-based application for iPhone and iPad based on the piece Clapping Music by Steve Reich to investigate the influence of rhythmic complexity on performance. Downloaded over 100,000 times in over ninety countries during a one-year period, the app provides an opportunity to understand how people assimilate and reproduce twelve different rhythmic patterns in the context of a real piece of music, performed ‘in the wild’, so to speak. Having collected over 30GB of gameplay data at a detailed level, the analysis of the data becomes a big data challenge. Whilst there are problems associated with collecting and analysing such a large dataset, it presents a unique opportunity to understand which aspects of performing the piece Clapping Music present the most difficulty.

The paper is organised as follows. In the remainder of the Introduction, we introduce Clapping Music as a piece of music and the game included within the app, review previous relevant literature on rhythm and metre perception and consider in detail the implications of existing theories of rhythmic and metrical complexity for performance of Clapping Music. We then provide the Methods used to collect the data before presenting the results in five sections: first, descriptive properties of the dataset; second, examining evidence that users were motivated to complete the game; third, using tap accuracy as a measure of difficulty for the different rhythmic patterns in Clapping Music; fourth, examining the difficulty of transitions between patterns; and fifth, examining pattern difficulty with the transition effect removed. Finally, the results are discussed and limitations and future directions are presented.

Clapping Music: The game

Steve Reich’s Clapping Music App (‘the App’) is a digital application including a game, videos and other content related to the music of Steve Reich and the music genre Minimalism, which is free to download from the iTunes Store for Apple devices running iOS 8 and above (from https://itunes.apple.com/app/id946487211). The App was developed through an interdisciplinary collaboration between The London Sinfonietta, a world-leading orchestra in the field of contemporary classical music, Touchpress, developers of apps for Apple iOS devices, and the Music Cognition Lab at Queen Mary University of London (see [11]). There have been over 140,000 downloads worldwide since the launch on 9th July 2015. In the App, the device takes the part of the performer playing the static pattern and the player of the game takes the part of the performer making the pattern transitions. The game was designed so that no musical training is needed to play, for example by representing quaver beats and rests through filled or empty circles rather than musical notation. Rather than clapping, players tap in a performance area in the lower part of the screen. This was due to considerations of latency related to device microphones, and the need to isolate the clap from other sounds if the game was played in a noisy environment. Tapping also enables the game to be played using headphones without adding noise to the environment, maximising playing opportunities. Tapping, like clapping, is a discrete movement with tactile feedback, but the absence of auditory feedback could make it more difficult to synchronise with the static pattern. A study investigating the role of movement in synchronisation to music found that participants were less able to synchronise with musical stimuli through bouncing than clapping, perhaps due to the absence of auditory and tactile feedback [12].
As a result, in the App each tap is represented audibly by a sampled clap sound so that the player can hear their tapping against the sound of the static pattern and get a feel for the ensemble performance. Clap sounds were sampled from the performance recording of David Hockings and Toby Kearney that appeared in the App (see Fig 2). McAdams et al. [13] describe a streaming effect whereby, under certain conditions, two different patterns played simultaneously cannot be perceived separately (see also [14]). The performance directions in the Clapping Music score indicate that this is a desirable effect: “Whichever timbre is chosen, both performers should try and get the same one so that their two parts will blend to produce one overall resulting pattern.” [15] However, player feedback from prototype testing of the App suggested that the game would be easier if there was some perceptual difference between player taps and the static pattern. The static pattern is therefore created from one of David Hockings’ claps, while the player’s tap is represented audibly by one of Toby Kearney’s claps, so a player may distinguish between the sound of their taps and the static pattern as the timbres are slightly different.

The accuracy of tapping is determined algorithmically. The target rhythm is divided into time bins corresponding to intervals of a demisemiquaver (32nd note). Incoming performed taps are quantised to the nearest bin and then scored depending on temporal proximity to the nearest target bin. A tolerance can be set so that taps in bins either side of the target bin receive non-zero scores. The tolerance is greatest at the easy level of difficulty in the game, is reduced for the medium level, and reduced again for the hard level. Scores for individual taps are summed and normalised to yield a score for each completed pattern row (or loop) ranging from 0.00 (the lowest possible accuracy) to 1.00 (perfectly corresponds to the target pattern).
If the player does not tap the correct number of claps in any given pattern, or does not clap for more than two beats, maximum error is recorded for that row. The player’s accuracy is represented in the performance area—a large green tap area is good (Fig 4c), a small red tap area represents an inaccurate tap (Fig 4d). To add a further gaming element, if tap accuracy meets a target threshold then the pattern moves down the screen (Fig 4a) and the player is able to transition to the next pattern (Fig 4b). If accuracy is too low or inconsistent, then the pattern moves up the screen, a red background indicating that the player is in trouble (Fig 4d). The dots representing note events turn white as they are due to be played, to help players get back on track. If the player’s accuracy improves the pattern will start to move back down the screen again. The game is over when the pattern reaches the top of the screen or the player has managed to play all 12 patterns, finishing with a final unison repetition of the original pattern.
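The scoring scheme described above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the App’s actual code: in particular, the partial credit of 0.5 for a tap falling within the tolerance window is an assumption for illustration (the game specifies only that such taps receive non-zero scores, with the tolerance narrowing from the easy to the hard level).

```python
def score_loop(tap_times, target_bins, bin_duration, tolerance_bins=1):
    """Score one loop of a pattern, from 0.00 (worst) to 1.00 (perfect).

    tap_times     -- performed tap onsets in seconds
    target_bins   -- indices of the demisemiquaver (32nd-note) bins that
                     contain a clap in the target pattern
    bin_duration  -- duration of one demisemiquaver in seconds at the
                     chosen tempo
    """
    # Wrong number of taps: maximum error is recorded for the row.
    if len(tap_times) != len(target_bins):
        return 0.0
    total = 0.0
    for t in tap_times:
        bin_index = round(t / bin_duration)  # quantise tap to nearest bin
        distance = min(abs(bin_index - b) for b in target_bins)
        if distance == 0:
            total += 1.0                     # tap landed in a target bin
        elif distance <= tolerance_bins:
            total += 0.5                     # near miss within tolerance
    return total / len(target_bins)          # normalise to the 0.00-1.00 range
```

Raising the difficulty level corresponds to shrinking `tolerance_bins` (and increasing the tempo, i.e. shrinking `bin_duration`), so that the same timing error costs more at the hard level than at the easy level.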


Fig 4. Playing the game in Steve Reich’s Clapping Music App. https://doi.org/10.1371/journal.pone.0205847.g004

The first pattern needs to be played accurately for a minimum of 8 repetitions. The first transition will not be offered until the pattern dots reach the bottom of the screen, as a result of consistently accurate tapping. As a result, players could tap more than 8 loops before making their first transition. Once a successful transition has been made, a minimum of 6 repetitions is required for all of the following patterns. Again, the number of loops played per pattern will be more than this if accuracy is not maintained. The player receives a score determined by their accuracy for each row of the pattern completed, and the number of transitions they have made. There are three levels of difficulty in the game, which can be chosen by the player—easy, medium and hard—distinguished by tempo and the accuracy threshold. The accuracy thresholds for each level of difficulty were set using feedback from prototype testing. The easy level also has a metronome, which we will discuss later. There is a training area where any of the patterns can be practised, and additional content including a full performance and an exclusive interview with Steve Reich, who has endorsed the app.

Rhythm and beat perception

The representation of rhythmic structure, and how it is performed, influences how the listener might perceive it. It is useful to define some terms to describe the temporal organisation of music. While the word rhythm is used in a variety of ways, in its most specific form a rhythm denotes a pattern of inter-onset intervals (IOIs) created between the temporal onsets of a sequence of events, which differ in the number of events and the temporal spacing between them. Each of the 12 patterns of the phasing part of Clapping Music constitutes a different rhythm. Under particular conditions, listeners may infer metrical structure (a metre): a hierarchy of stronger pulses, which are accented in some way when compared to weaker pulses, in a regular, repeating structure. If a pulse appears at a particular level, it also appears at the next, larger level, making metrical structure hierarchical [16]. So whilst pulse establishes a single periodicity, it is insufficient in itself to determine the accented groupings that are perceived as metre [1]. The most salient level in the metrical hierarchy is called the tactus or beat, the rate at which you might clap your hands, or tap your feet, as you listen. A higher level in the metrical structure is referred to as a measure or bar [17]. Time signatures are notational conventions used in Western music to indicate the metre intended by the composer as a direction to aid performance. The time signature 4/4 indicates that the pulse is a crotchet (or quarter note) and each bar contains 4 crotchets, which would normally be perceived as the tactus level. The time signature 12/8 indicates that the pulse is a quaver (or eighth note) and each bar contains 12 quavers (the tactus here would usually be felt every three quavers—a compound time signature). Tempo refers to the speed with which a rhythm is performed and is usually defined in terms of the rate of the tactus.
Each level in the metrical hierarchy defines regularly recurring subdivisions of time, with stronger metrical accents coinciding with higher levels in the metrical hierarchy. An auditory stimulus (such as a piece of music) implies a metre to the extent that salient events appear on strong metrical accents in the metre. Events may be salient by virtue of their loudness, timbre, pitch or temporal relationship with other events. Once inferred, listeners are relatively resistant to changing their metrical interpretation [18]. Perception of metre depends on performance characteristics, and the musical training and cultural background of the listener. Not all listeners infer metre in the same way. For example, in studies where participants were asked to tap along to a rhythm, some tapped at the beat level, others at the measure level or the two-measure level [19]. Furthermore, listeners from different musical cultures may infer metrical structure in different ways due to incidental exposure to the presence or absence of particular metres in the music of those cultures [20–22].

Rhythmic skill can be assessed in a number of ways, for example the ability of an individual to reproduce a rhythm from memory, to distinguish rhythms with subtle variations or to synchronise with a pulse by reproducing a perceived beat. Some research suggests that listeners have an internal clock, and can perceive accents in a pattern purely from temporal (time-based) information. The more temporally regular a music excerpt is, the easier it should be for the listener to extract the underlying beat [23]. According to this approach, beat extraction should be easier for mechanical performances, for example audio stimuli synthesised by a computer, than for real performances with expressive variations. The ability to reproduce a pattern is strengthened by providing an external clock, such as a metronome.
An alternative hypothesis, which is more applicable to expressive performances, is that the performer constructs a mental representation of the metrical structure, which they convey through the performance microstructure and expressive variation, making perception of an underlying beat easier. The perceived beat in this case is influenced by the performer’s interpretation of the rhythmic structure [4, 5]. This implies that listeners, depending on the expressive content, may perceive the rhythmic structure of different performances of the same notated music differently. A listener’s degree of musical training also influences the extent to which the expressive elements of performance of a rhythm are useful for beat perception. Drake et al. [19] compared musicians’ and non-musicians’ ability to tap in time with three variations of six excerpts of music: mechanically produced; mechanically produced with the first beat of the bar accented; and expressively performed. It was found that participants synchronised most successfully with the accented mechanical rhythm, followed by the mechanical version with no accent, and then the expressive version. In all three cases, musicians were better at synchronisation than non-musicians, but musical training did not change the extent of responsiveness to expressive compared to mechanical excerpts, which supports the existence of an internal clock. A study examining perception of rhythmic similarity between the patterns in Clapping Music reinforces this finding. It was found that expressive performance helped the non-musical participants more than the musically trained participants, perhaps due to the fact that musicians have experience in listening to, processing, and distinguishing rhythms as conceptual objects, rather than as purely auditory objects [24].
We will now examine how existing research on beat perception and rhythmic complexity can be applied to the design of the Clapping Music App and the analysis of the performance data that it generates.

Beat perception in Clapping Music

The mechanical nature of the static pattern reproduced in the game means that it lacks expressive accents, but players may still perceive a metre. The perceptual effects of a sound depend on the musical context in which that sound is embedded. A given sound’s perceived pitch, timbre, and loudness are influenced by the sounds that precede it, coincide with it, and even follow it in time [13]. These perceptual effects are not limited to expressive aspects such as relative beat volume or timbre, but can also be influenced by temporal effects such as the relative timing of note events in a measure. Cameron et al. [24] assessed the ability of individuals to perceive similarity between the 12 patterns in Clapping Music and found that when a rhythm with a large number of rests in common with the static pattern was heard after a rhythm with fewer rests in common, the pair was rated as less similar than when heard in the opposite order. This temporal effect could influence whether one particular transition to a new pattern is more difficult than another. Whilst the static rhythm is presented mechanically in the App, the player may reproduce their pattern with an element of expressivity, based on their temporal perception of the beat. However, this will reduce their accuracy as assessed by the scoring algorithm. In the App, accuracy is prized over expressivity of performance. At the medium and hard levels, the pulse must be determined from the static pattern alone, as in a performance. However, feedback from early game prototypes suggested that some players found this too difficult. As a result, the easy level and the training area include a metronome—representing the accented mechanical scenario described by Drake et al. [19]. However, no time signature is indicated in either Reich’s original handwritten score or the formal published version (9), and the patterns display metrical ambiguity [25].
So where would an accent be helpful for a musically inexperienced new player? For example, it could be every two note events (i.e. 6 beats per bar) denoting a 6/4 time signature, every four note events denoting 3/2, or every three note events denoting 12/8 (a triplet feel). Only one metronome rule could be implemented in the game. If the metronome reinforced individual perception of where accents might be, it may make some patterns easier to reproduce, whilst having an adverse effect on patterns where the metronome fits internal beat perception less well [23]. Using a variation of box notation, developed by Philip Harland [26], it can be seen that implementing the metronome differently could influence beat perception (Fig 5). With 6 beats per pattern (6/4), each of the 12 patterns has 4 note events accented by the metronome. With 3 beats per pattern (3/2), the number of note events accented varies between 1 and 3. The 6/4 metronome was implemented as it is more likely to provide consistent influence across the different patterns.


Fig 5. Comparing a 6/4 (a) and 3/2 (b) pulse against each of the 12 Clapping Music patterns. https://doi.org/10.1371/journal.pone.0205847.g005
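The metronome comparison can be reproduced computationally. The sketch below assumes the standard box-notation encoding of the Clapping Music base pattern and models the 12 game patterns as successive one-quaver rotations of it; counting the claps that coincide with metronome accents reproduces the figures quoted above (4 per pattern under 6/4, between 1 and 3 under 3/2).

```python
# Clapping Music base pattern in box notation (1 = clap, 0 = rest);
# this encoding of the published score is an assumption for illustration.
BASE = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]

def rotations(pattern):
    """Model the 12 game patterns as successive one-quaver rotations."""
    return [pattern[k:] + pattern[:k] for k in range(len(pattern))]

def accented_claps(pattern, accent_every):
    """Count claps coinciding with a metronome accent placed every
    `accent_every` quaver positions (2 -> 6/4, 4 -> 3/2)."""
    return sum(pattern[i] for i in range(0, len(pattern), accent_every))

for k, pat in enumerate(rotations(BASE), start=1):
    print(k, accented_claps(pat, 2), accented_claps(pat, 4))
```

Because the base pattern places four of its eight claps on even quaver positions and four on odd positions, every rotation keeps exactly four claps on the 6/4 accents, whereas the 3/2 accents sample only three positions per pattern and so the count fluctuates from rotation to rotation.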