In recent work (Killin, 2017; see also Killin, 2016a) I have developed a theory of the evolution of early hominin musicality couched in a socio-cognitive niche construction framework: a picture that connects dynamic developments in hominin musicality, conceived as a mosaic of traits, to what I take to be the most persuasive interpretations of the evidence at hand within the context of understanding hominin evolution generally. I have hypothesised additional factors that are consistent and independently plausible. And I have defended aspects of my methodology and several explicit assumptions. My argument therein took a diachronic narrative form and this article picks up the narrative where Killin (2017) leaves off, at roughly 800,000 years ago (800 Kya) – the phase in human evolution I designate as the “Late Acheulean” (800–250 Kya).1 I argue that at least by 400 Kya (some) ancient hominins engaged in group activities worthy of the admittedly vague description, social “proto-music” (by which I mean not necessarily the direct progenitor of all current-day musics; rather, activities that exemplify some but not all of the distinctive features of music-making in ethnographically known forager societies). 
And I argue that from the social and cognitive capacities enabled, rehearsed and developed in proto-music, musical activities and traditions would incrementally evolve throughout modernity (typically considered to be from 250 Kya onwards, though one recent analysis places the earliest known modern humans as far back as 315 Kya; see Hublin et al., 2017), global dispersal from Africa (currently thought to be from 60–100 Kya onwards; see Fu et al., 2013; Rieux et al., 2014; Scally & Durbin, 2012), and throughout the Holocene (i.e., from approximately 12 Kya; Walker et al., 2009), enabling the emergence and subsequent cultural evolution of many musics of the world today.2 Although the “chronology” presents an order of events and sense of timing, it does not attempt a precise or exhaustive chronological causal explanation: there are many details still missing, many gaps in the material record, many aspects of human cognitive evolution up for debate. Nonetheless it presents a synthesis of research in progress, considers implications for theories of music origins, and sketches a tentative model of the evolution of music. The presentation of events by way of a narrative through time can sometimes give an impression of teleology: that “proto-music”, for example, was evolving towards music. Teleological thinking must be resisted in an evolutionary context, of course; the chronological structure of the article is simply a convenience (and a familiar format in which to couch a narrative-style account).

There are of course methodological challenges for any such research agenda. One is to overcome the well-known hurdle entailed by the fact that cognition and sociality do not fossilise – only indirect traces exist. Thus it is difficult to reconstruct the socio-cognitive lives of ancient hominins with any certainty. After all, traces erode with time, and there are serious challenges for the project of understanding the mechanisms underlying the cognition and sociality of living humans (and great apes in general), let alone our long-dead ancestors. Further compounding the issue – for this topic in particular – is the fact that a large portion of research on the evolution of music is couched within the adaptation/by-product debate (see Cross & Morley, 2009; Davies, 2012; van der Schyff & Schiavio, 2017) and this is an unhelpful framework for making progress on reconstructing music’s co-evolutionary trajectory (Killin, 2013, 2016a, 2016b, 2018a; see also Davies, in press; Tomlinson, 2015) even though (proto-)musical behaviours may well have been adaptive over the course of human evolution (Cross, 2003).

One critique of this literature is that it relies too much on armchair speculation. However, theorists can move beyond mere “just so” conjecture to “how probably” scenario building (Sterelny, 2018) by proposing and evaluating accounts that develop phylogenetically plausible evolutionary scenarios that are consistent and compatible with known lines of evidence, are cast in a general co-evolutionary/niche construction framework, and make constrained inferences from the archaeological, palaeoanthropological and ethnographic records. The result is still partially speculative of course: it is a defeasible evolutionary scenario. My goal, to be sure, is not to attempt to prove all aspects of the account outlined herein, but to make it at least plausible and attractive.

In the next section I dovetail the present article with Killin (2017) by recapitulating and expanding upon my discussion of “Late Acheulean” hominins, by which I mean modern sapiens’ hominin ancestors during the period between roughly 800 and 250 Kya – the period in which I envision the evolution of social proto-music taking place. In the third section I consider the long passage of behavioural modernity, discussing the archaeological record and the musics of ethnographically known foragers. I side with the view that our ancestors were musically active and had developed musical activities and technologies well before traces appear in the material record from around 40 Kya. In the fourth section I discuss music since the Holocene transition until the Common Era. The fifth section offers some concluding remarks.

Linking hypotheses of social expressive performance and hominin evolution with emotions, advances in technology, and a proto-aesthetic sensibility makes hypothesising about the emergence of proto-music intelligible. New research on the evolution of the emotions may well provide direct or indirect means for testing these ideas (see, for example, Peretz, 2011, for a review of the neurobiology of musical emotions: evolutionary models are one direction for future research).

The social brain hypothesis and related research provide theorists with a framework for taking these ideas seriously. This is to entertain a perspective that contrasts with the influential and widespread view of Richard Klein and others that “the dawn of human culture” occurred around 50 Kya, if we are to include music among the suite of cultural activities that were supposedly invented very recently by sapiens, such as representational cave art and figurines, symbolic mortuary practices, and so on (Klein & Edgar, 2002; Mellars, 1989). Adler’s (2009) discussion in Nature of a 40,000-year-old bird-bone flute has the provocative title, “The earliest musical tradition”. But the search for the origins and expansion of music begins not at merely 40 Kya with the onset of European flutes (pipes) in the Upper Palaeolithic, discussed in the next section. That’s what we would say if we thought “what you see is what there was”. Rather, it is more likely that behavioural modernity evolved gradually and incrementally (d’Errico & Stringer, 2011; McBrearty, 2007; McBrearty & Brooks, 2000; Sterelny, 2011). In particular I have suggested that “proto-musical” behaviours are to be found in the socio-cultural and cognitive developments occurring, incrementally, within the “Late Acheulean” – around 400 Kya, and perhaps even earlier. This is based on my argument from hominin socio-cognitive co-evolution (Killin, 2017), the upgrades in technological production, the plausibility of a proto-aesthetic sensitivity, and the use, as a proxy, of the date associated with more common and continuous hearths as social magnets. Avenues for further empirical research include focusing on whether evidence of persistent fire control corresponds with the presence of anything potentially of musical usage and looking to use-wear or experimental analysis to constrain the range of plausible inferences.

It is plausible that social proto-music, building upon the earlier developments in individual capacities for musicality (Killin, 2017; lithic sound play, entrainment, motherese, call mimicry, vocal grooming, and so on), is a response to such a selection pressure, emerging and stabilising through cultural transmission and niche construction. Group life does not come without its stresses: coping with the close proximity of many individuals and the aggression (and other dramas) that will sometimes ensue is frustrating. And local resources are exhausted more quickly by bigger groups, so foragers’ ranges must increase, imposing extra time and energy demands. Yet a more socially complex life opens up further avenues for cooperation and coordinated activity allowing for greater returns from individual costs, if only the familiar problems of cooperation and coordination (e.g., freeriding) can be solved. The social brain hypothesis predicts that this occurred at least in part via some mechanism for strengthening social bonds and selecting for increased emotional complexity (Gowlett et al., 2012). More tightly bonded communities are likely to be more cooperative. And despite increased group sizes, these were still small social worlds by modern Western standards; these were social worlds in which everyone (more or less) was acquainted with everyone else. At 400–500 Kya, our ancestors were big-brained; they almost certainly possessed a relatively advanced theory of mind, and an increasingly complex emotional suite. Social proto-music, presumably utilising the voice and the body, is a means of enhancing the emotional/affective expression of individuals and dynamics between individuals. Indeed, evolutionary accounts have rarely considered the role of the emotions.
Following Gamble and collaborators, it is worth emphasising that voice/body proto-music (and perhaps dance) would have rehearsed emotional expression, socialisation, and cultural innovation – and need not have left material traces.

With predicted community sizes of up to 120, we should expect selection for mechanisms to amplify the emotional basis by which lasting social bonds were forged. One selection pressure for this is clear. With larger community sizes less time was spent together as dictated by fission and fusion to balance population to resources. (Gamble et al., 2011, p. 124)

As I will discuss shortly, archaeological evidence reveals music’s presence in the Upper Palaeolithic, but there is no direct material evidence of musicking during the “Late Acheulean” (although the over-large handaxes discussed above are suggestive of a general proto-aesthetic sensitivity and the ability to abstract). So we must lean on inference, suggestive circumstantial evidence, and theoretical frameworks. The social brain hypothesis 7 – of which the core idea is that social complexity was a key driver of hominin encephalisation (Dunbar, 1998; Gamble, Gowlett, & Dunbar, 2011; Gowlett et al., 2012) – offers a framework through which some progress might be made:

According to Gowlett et al. (2012) hearths were common enough from around 400 Kya to suppose that a set of novel behaviours would take hold, associated with firelit socialising. By this time, ancient hominins were central place foragers, more organised/centralised around cooking hearths. And as big-brained hominins, it is very likely that they would become easily bored and restless, yet would have been intuitively creative, innovative and, importantly, social. So it is unsurprising that cultural activities would eventually arise that would have the effects of strengthening group identity, rehearsing coordinated action and theory of mind, and, importantly, channelling and shaping emotions. So here is where social proto-music comes in, building upon earlier foundations of hominin musicality (Killin, 2017).

Importantly, then, hearths were “social magnets” (Barham, 2013). Wiessner (2014) provides ethnographic examples. She points out that although flickering firelight extends the day, it does not extend the time in which foragers engage in utilitarian activities such as hunting, foraging, or tool-making. Rather, it extends the time available for social pursuits at a time that otherwise would not conflict with subsistence activities. For the Ju/'hoansi hunter-gatherers, firelit night talk and activities “steer away from tensions of the day to singing, dancing, religious ceremonies, and enthralling stories…Night talk plays an important role in evoking higher orders of theory of mind via the imagination” (Wiessner, 2014, p. 14027). Stories were frequently accompanied by background music (often performed on musical bows). Economic and functional concerns, as well as the personal gripes of individuals, were put aside as everyone gathered to make music, dance, or tell stories. These activities often closed social rifts and facilitated bonding.

We feel warm and friendly towards those with whom we eat. This might explain why we find social feeding so important…Social eating of this kind seems to be universally important across all cultures, yet no one has ever stopped to ask why we do this…The obvious answer is social bonding. (Dunbar, 2014, p. 195)

Gowlett, Gamble, and Dunbar (2012) point out that keeping a large hearth’s fire alive requires a lot of firewood: 50–100 kg per day. Presumably, gathering that timber would have been a coordinated, cooperative enterprise. All members of the band benefit from a campfire and all would have been drawn to it upon nightfall. And as Dunbar notes, social eating (such as sharing a satisfying meal around a campfire) triggers the release of endorphins:

Recent research by Wrangham (e.g., 2009) emphasises the importance of fire and cooking to hominin evolution. Fire enabled the cooking of meat and underground storage organs (e.g., bulbs, storage roots, tubers), as well as food preservation techniques such as smoking and drying of meat and fish, some combination of which provided ancient hominins with the energy required for larger, more expensive brains. Fire granted our ancestors more leisure time, by extending the period with usable light, and by lessening the time spent eating. (Chimpanzees spend hours chewing their food; cooking makes foods easily consumable and digestible.) Fire enabled the reduction of gut size, since guts did not have to work so hard to extract nutrients from food digested, allowing reallocation of energy into increasing encephalisation. Fire provided heat, protection, and light. It extended the time that could be spent communicating, socialising, planning hunts, and so on. It kept vermin at bay, it provided a means for charring the ends of wooden lances into useful, hardened pointed tips, and may have assisted plant-growth management.

From 790 Kya there is strong evidence of fire control (Goren-Inbar et al., 2004). Wrangham (2009) and Wrangham and Carmody (2010) suggest that cooking/fire control is even earlier (see Attwell, Kovarovic, & Kendal, 2015 for review; Roebroeks & Villa, 2011); however, the archaeological signature of early fire use is patchy and in earlier stages may represent only partial (opportunistic or sporadic) fire control. We cannot assume that once harnessed, fire became an enduring feature of ancient life. As Roebroeks and Villa note, hearths and other evidence of fire control became much more archaeologically visible from around 400 Kya, from which point there is widespread, continual evidence of skilled control of fire. The received view is that fire was harnessed opportunistically at first and over time became habitual at least by 400 Kya.

Until 500 Kya all known tools were made from a single source material and were hand held. 6 But from this point on, handles/shafts were increasingly added to stone tools; these hafted tools were produced not merely by reducing and shaping raw material, but by adding distinct components together. No other animals do this, not even chimpanzees (termite wands, for instance, are simple single-source items). According to Barham (2013), hafted tools are further evidence of increases in forward planning, working memory, raw material manipulation, social learning and intentional teaching. These capacities are important for complex cumulative cultural evolution in general, and for the subsequent emergence and persistence of the full-fledged musics of today.

Increased encephalisation (which requires a high quality diet), social and cognitive complexity, and climatic stress 5 enabled hunting techniques and technologies to increase in complexity (Barham, 2013). This includes, for example, the use and production of complex, elongated javelin-style spears, appearing alongside butchered horse bones at Schöningen, Germany around 400 Kya (Thieme, 1997). Several lines of evidence indicate that the use of spears by H. heidelbergensis in South Africa at roughly 500 Kya is very likely (Wilkins, Schoville, Brown, & Chazan, 2012). The use of projectile weapons such as spears, to my mind, implies that these ancient Homo had at least a basic understanding of ballistic principles (see also Zilhão, 2007). If an animal target is on the move, the future-projecting mind of an experienced hunter can predict the animal’s future position and compensate for its movement when aiming to throw or be ready to throw. Although this could possibly be honed associatively through long practice, I suspect the complexity of the task hints at advances in episodic cognition and the ability to be consciously aware of the past and of (probable) future states. Recalling past successes – the feeling of the grip, the angle and pressure of the throw, the balance of one’s stance, and so on – lends a higher probability to a successful throw in the present. These spears were well-crafted projectiles – tapered towards the back and heavier towards the front with the centre of gravity in the forward third of the shaft. Replicas demonstrated particularly effective penetrative power when thrown as far as even 15–20 m (Churchill & Rhodes, 2009; Rieder, 2003). This is suggestive of some degree of division of labour and specialisation of skills at least in the production, if not also use, of these javelins. They certainly evince advances in craftsmanship and raw material manipulation.

As far as researchers can tell from the fossil evidence, at least by Homo heidelbergensis (roughly 600–800 Kya) – currently thought to be the predecessor of the Neanderthals, Denisovans and modern humans (see, e.g., Manzi, 2011, 2012) – ancient hominins were capable of producing, more or less, the kinds of vocal sounds modern humans are capable of producing, and they had executive (top-down) control over many of their vocalisations (Morley, 2013). Whether Neanderthals, also descendants of H. heidelbergensis, had anything like early H. sapiens’ linguistic-vocal capacities is hotly debated (Johansson, 2015) but it is likely that both species would have been anatomically capable of near-modern vocal musicality (Mithen, 2005). 3 Although Neanderthals made personal ornaments and bone tools (d’Errico et al., 2003), no uncontested evidence of Neanderthal musical technology (e.g., flutes) has been discovered. 4 Nonetheless, Mithen argues that Neanderthals may have been more musical than other researchers have supposed.

The Late Pleistocene: Mid/Upper Palaeolithic musicality

Here the narrative reaches the long stretch of human modernity (from 250 Kya onwards). It is within this phase that evidence for fully-fledged (“symbolic”) languages, long-distance trade networks of over 300 km,8 musicians and musical instruments, sculptures and cave painters, body painting and ornamentation, burials, grave goods, shamans/priests and religion all eventually appear in the material record, though not simultaneously, and not permanently from first appearance. Indeed, it is towards the end of this time period (i.e., from 40 Kya) that representational cave paintings and figurines/sculptures appear in the archaeological record (Lawson, 2012; Pike et al., 2012), including depictions of large animals and water birds, as well as part-animal, part-human creatures. The lion-headed man of Hohlenstein Stadel, the oldest known figurine, dates to around 40 Kya (Kind, Ebinger-Rist, Wolf, Beutelspacher, & Wehrberger, 2014). Venus figurines, the oldest known fully human representations, appear in the archaeological record from around 35 Kya (Conard, 2009). Nonetheless earlier traces of an aesthetic sensibility and of symbolism appear in Africa (McBrearty & Brooks, 2000). Early sapiens utilised ochre and other pigments, presumably for personal decorative effect and to colour artefacts, possibly as early as 230–280 Kya (McBrearty & Brooks, 2000) and almost certainly by 165 Kya (McBrearty & Stringer, 2007). Decorative adornments such as beads and other ornaments (including shells and animal teeth and bones) were not far behind, appearing from around 90 Kya – also disappearing, and reappearing in the archaeological record, becoming more common and continuous over time (Kuhn, 2014; Stiner, 2014; Zilhão, 2007). Blombos Cave in South Africa revealed engraved ochre artefacts dated at around 75 Kya (Henshilwood, d’Errico, & Watts, 2009). These very early markers of the artistic/aesthetic and the symbolic are quite unstable in the archaeological record. 
Thus it appears that behavioural modernity was an incrementally evolving, continuous process.9

In this section I paint a picture (with broad brush strokes) of the musical behaviours, capabilities and technologies of ancient sapiens. Both (2009) notes that researchers engaged in such a project have typically privileged just one of the archaeological and ethnographic records, understanding the other as “subordinate”. Yet both lines of evidence are valuable and not mutually exclusive. Indeed, since neither one alone can offer more than a partial picture of ancient music, prospects for integration are a priority for ongoing and future research. I turn first to prehistoric music archaeology, then to (historic and contemporary) hunter-gatherer ethnomusicology. Finally I reflect on some consequences of taking seriously signalling theory in theorising about music archaeology, an avenue for future research.