We study Western classical piano music from the so-called common practice period (circa 1700–1900 CE), chosen for the following advantages: its high scientific and cultural significance, being widely credited with having produced many fundamental musical styles that remain influential today; a rich body of musicological understanding from traditional research that can be compared with new, alternative approaches such as ours; and the abundance of high-quality data. The availability of large-scale musical databases and advances in scientific, analytical methods continue to enable novel and interesting findings on their properties [10]. Recent examples include research on the topology and dynamics of the networks of musicians for the discovery of human and stylistic factors in the creation of music [15,16,17,18,19,20] and stylometric analyses of music that lead to corroborations of, or fresh challenges to, established musicological understanding [21,22,23,24,25].

Using our framework we start by computing the level of novelty in musical compositions and composers, and study how they relate to the known characteristics of music at a given point in history. We then compute their influence on later times and how it can be used to characterize the evolution of compositional styles throughout history. The first step in using the formalism of Eqs. (3) and (4) is representing music as a set of elements, in other words modeling. Since modeling a system is an abstraction process that necessarily leaves out some real features of the system, it is ideal to retain the most sensible, relevant ones that also suit the modeler’s interests. For instance, for written works such as literature and scientific publications, the elements could be words or groups of words such as n-grams [7, 8, 26], and for paintings they could be colours and shapes [27, 28]. Here we model a musical composition as a temporally ordered sequence of codewords, each codeword being a set of simultaneously played notes. For the actual element we take the codeword transition, the bigram (2-gram) of codewords. These are shown in Fig. 2(A) with the beginning of one of Chopin’s preludes as an example. While our methodology can be applied in a clear and straightforward manner to analysing musical compositions, we note that other aspects such as structure, tempo, and instrumentation are also important in music. Our primary focus on codeword transitions here is based on the importance of harmony and melody in the Western classical music tradition [29] and on the fact that this paper studies piano music; for a more complete and useful modeling of music those other elements will need to be incorporated in the future, and later we discuss some recent developments therein. We also note that our definition of a codeword retains all the original information on octaves and the keys in which the works were composed, resulting in a more complete and truthful representation than the one used in Ref. [22], where only the pitch class was considered (i.e. discarding the octave information; for instance, F4 and F5 were both considered F) and the keys were unified to the C scale.
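The modeling step described above can be illustrated with a minimal sketch. The toy sequence below is made up for illustration (it is not taken from the data set), and the names `toy_sequence` and `codeword_transitions` are hypothetical. Each codeword is stored as a set of note names that keeps the octave, and the elements are the bigrams of consecutive codewords.

```python
from collections import Counter

# A codeword is the set of notes sounding together; octave information is
# kept, so F4 and F5 would remain distinct. Toy data, not from the corpus.
toy_sequence = [
    frozenset({"C4", "E4", "G4"}),
    frozenset({"D4", "F4", "A4"}),
    frozenset({"C4", "E4", "G4"}),
    frozenset({"D4", "F4", "A4"}),
    frozenset({"G3", "B3", "D4"}),
]


def codeword_transitions(sequence):
    """Return the consecutive (codeword, next codeword) bigrams, in order."""
    return list(zip(sequence, sequence[1:]))


transitions = codeword_transitions(toy_sequence)
# Count how often each directed transition occurs in the toy sequence.
transition_counts = Counter(transitions)
```

A sequence of m codewords yields m−1 transitions; here the first transition occurs twice, so the five codewords give four transitions but only three unique ones.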

Figure 2 (A) A musical score can be converted to a sequence of codeword (simultaneously-played notes) transitions (blue box). (B) The backbone of the network of codeword transitions in our data. Only 2267 out of 144,183 codewords (1.5%) are shown. The node radius indicates the number of transitions into and out of the corresponding codeword, while the edge width indicates the number of the corresponding transition. The node colour indicates the period when the corresponding codeword first appeared (blue-Baroque, green-Classical, yellow-Transition, red-Romantic). (C) The cumulative distribution of the occurrences of the codewords and the cumulative number of unique codeword transitions ever used (inset). The distribution exhibits a highly-skewed, power law-like behavior with power exponent \(\rho=2.13 \pm0.02\) established early in history (Fig. S1).

Our data set consists of MIDI (Musical Instrument Digital Interface) files of 900 classical piano works by 19 prominent composers from the common practice period, collected from the Kunst der Fuge (www.kunstderfuge.com) and Classical Piano MIDI (www.piano-midi.de) archives. The works span the Baroque (c. 1700–1750), Classical (c. 1750–1820), Classical-to-Romantic Transition (c. 1800–1820), and Romantic (c. 1820–1910) periods, from Johann S. Bach and Georg F. Handel of the Baroque era to Maurice Ravel of the late Romantic era. The composers and their works are listed in Additional file 1 (SI Dataset 2). The MIDI files were converted into MusicXML format using the MuseScore 2 software and chordified using Music21, a Python toolkit for computer-aided musicology [30]. The chordify method in Music21 converts a complex multi-part musical score into a series of sets of simultaneous notes, as visualised in Fig. 2(A). Since each codeword transition is a directed dyad, the transitions can be collectively visualised as a network whose backbone is shown in Fig. 2(B). The cumulative distribution of the number of occurrences of the codewords is shown in Fig. 2(C), and approximates a power law with exponent \(\rho=2.13 \pm0.02\), indicating significant disparities in popularity between codeword transitions. Although such a pattern is established early in history (Fig. S1), the number of unique codeword transitions ever used also increases steadily in time (inset of Fig. 2(C)), with the highest rate of increase observed during the Romantic period.
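To give a feel for what chordification does, the following is a simplified sketch and not Music21's actual implementation. It assumes a hypothetical note-event representation, a list of `(onset, offset, pitch)` tuples, slices time at every onset and offset, and collects the pitches sounding in each slice. The function name `chordify_events` and the toy notes are illustrative only.

```python
def chordify_events(notes):
    """Slice (onset, offset, pitch) notes at every onset/offset and return
    the set of pitches sounding in each non-silent slice, in time order."""
    # Every onset and offset is a potential boundary between codewords.
    boundaries = sorted({t for onset, offset, _ in notes for t in (onset, offset)})
    codewords = []
    for start, end in zip(boundaries, boundaries[1:]):
        # A note belongs to the slice if it sounds throughout [start, end].
        sounding = frozenset(
            pitch for onset, offset, pitch in notes if onset <= start and end <= offset
        )
        if sounding:  # skip silent slices (rests)
            codewords.append(sounding)
    return codewords


# Toy example: a held bass note under two successive upper notes.
toy_notes = [(0.0, 2.0, "C3"), (0.0, 1.0, "E4"), (1.0, 2.0, "G4")]
toy_codewords = chordify_events(toy_notes)
```

In this toy case the held C3 appears in both resulting codewords, {C3, E4} followed by {C3, G4}, which is the kind of vertical slicing depicted in Fig. 2(A).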

We now compute the novelty and influence of musical compositions. Writing a composition ζ as a sequence of codewords \(\zeta=\{\gamma _{1},\gamma _{2},\ldots, \gamma _{m}\}\), the generation probability of ζ is given by the first-order Markov chain

$$\begin{aligned} \varPi _{ \varOmega }(\zeta) &= \pi _{ \varOmega }(\gamma _{1}) \pi _{ \varOmega }(\gamma _{1} \to \gamma _{2})\cdots \pi _{ \varOmega }(\gamma _{m-1}\to \gamma _{m}). \end{aligned}$$ (5)

For \(\pi _{ \varOmega }\) we employ the Maximum A Posteriori (MAP) estimator [31] commonly used in Markov chains, given as

$$\begin{aligned} \pi _{ \varOmega }(\gamma _{i}\to \gamma _{j}) = \frac{z(\gamma _{i}\to \gamma _{j})+\alpha_{0}(\gamma _{i}\to \gamma _{j})}{\sum_{\gamma \in\varGamma} (z(\gamma _{i}\to \gamma )+\alpha_{0}(\gamma _{i}\to \gamma ) )}, \end{aligned}$$ (6)

where \(z(\gamma _{i}\to \gamma _{j})\) is the number of occurrences of the \(\gamma _{i} \to \gamma _{j}\) transition in the conventional pool Ω, \(\alpha_{0}(\gamma _{i}\to \gamma _{j})\) is the prior representing the novel pool in our scheme, and Γ is the codeword space. The form can also be viewed as a type of additive (Laplace) smoothing. When \(\alpha_{0}\) is a constant it is also called the uninformed prior, and interpreting the prior as the novel pool allows us to make a graphical representation in Fig. 1(C), with \(\alpha_{0}=1\) meaning that the novel pool contains exactly one copy of each possible transition. The probability of the first codeword is similarly \(\pi _{ \varOmega }(\gamma _{1}) = (z(\gamma _{1})+1)/\sum_{\gamma \in\varGamma}(z(\gamma )+1)\), where \(z(\gamma _{1})\) is the number of occurrences of \(\gamma _{1}\) as the first codeword in Ω. Plugging these into Eq. (3), we obtain the novelty

$$\begin{aligned} \nu(\zeta) &\equiv\frac{1}{m}\log\frac{1}{\varPi _{ \varOmega }( \zeta)} = \frac{1}{m} \Biggl[\log\frac{1}{\pi _{ \varOmega }(\gamma _{1})}+\sum _{k=1}^{m-1}\log\frac{1}{\pi _{ \varOmega }(\gamma _{k} \to \gamma _{k+1})} \Biggr]. \end{aligned}$$ (7)
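As an illustration of Eqs. (5)–(7), the sketch below computes the novelty of a toy work against a toy conventional pool. It rests on several assumptions that are not stated in the text: the codeword space Γ is passed explicitly as `vocabulary`, the same constant prior `alpha0` is used for the first codeword and for the transitions, and the natural logarithm is used. The function name `novelty` and all data are hypothetical.

```python
import math
from collections import Counter


def novelty(work, pool, vocabulary, alpha0=1.0):
    """Per-codeword novelty of `work` (Eq. 7) against the sequences in `pool`.

    Assumptions: constant prior `alpha0`, natural log, explicit `vocabulary`.
    """
    size = len(vocabulary)
    # z(g): how often g opens a work in the pool.
    first_counts = Counter(seq[0] for seq in pool if seq)
    # z(gi -> gj): how often each transition occurs in the pool.
    trans_counts = Counter(t for seq in pool for t in zip(seq, seq[1:]))
    # sum_g z(gi -> g): how often gi is the source of any transition.
    out_counts = Counter(g for seq in pool for g in seq[:-1])

    # Smoothed probability of the first codeword.
    p_first = (first_counts[work[0]] + alpha0) / (
        sum(first_counts.values()) + alpha0 * size
    )
    total = -math.log(p_first)
    # Smoothed transition probabilities of Eq. (6), accumulated as in Eq. (7).
    for a, b in zip(work, work[1:]):
        p = (trans_counts[(a, b)] + alpha0) / (out_counts[a] + alpha0 * size)
        total += -math.log(p)
    return total / len(work)


toy_pool = [["a", "b", "a", "b"], ["a", "c"]]
toy_vocabulary = {"a", "b", "c"}
conventional = novelty(["a", "b", "a"], toy_pool, toy_vocabulary)
unusual = novelty(["a", "b", "c"], toy_pool, toy_vocabulary)
```

In the toy pool the transition b→a has been seen before while b→c has not, so the second work receives the higher novelty, as expected from the smoothing in Eq. (6).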

Historical and psychological novelty

When computing the novelty of Eq. (7), we are free to choose Ω, the reference set of previous works that determines the conventional pool. A straightforward choice of Ω would be all known works that preceded ζ in history. This was aptly given the name historical novelty (H-novelty) in Artificial Intelligence (AI) research circles [1], and represents a given work’s novelty within the entire history of the field up to its creation. Another interesting choice of Ω contains only the previous works by the creator of ζ. The resulting novelty is named psychological novelty (P-novelty) [1], and represents, for instance, the degree of improvement in a new version of an algorithm or a machine over its previous versions. Applied to our data, it shows how a composer’s compositional style evolves relative to his own past works [32].
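The two choices of Ω differ only in which earlier works enter the conventional pool. A minimal sketch of that selection follows, assuming each work carries a composer label and a single composition year, and that "preceding" means a strictly earlier year; these are simplifying assumptions, and the `Work` type, the function `conventional_pool`, and the toy corpus are hypothetical.

```python
from typing import NamedTuple


class Work(NamedTuple):
    composer: str
    year: int
    codewords: tuple


def conventional_pool(target, corpus, kind="H"):
    """Return the works preceding `target`: all of them for H-novelty,
    only those by the same composer for P-novelty."""
    earlier = [w for w in corpus if w.year < target.year]
    if kind == "P":
        earlier = [w for w in earlier if w.composer == target.composer]
    return earlier


# Toy corpus with placeholder composers X and Y (not real data).
toy_corpus = [
    Work("X", 1710, ("a", "b")),
    Work("Y", 1720, ("a", "c")),
    Work("X", 1730, ("b", "c")),
    Work("Y", 1740, ("c", "a")),
]
h_pool = conventional_pool(toy_corpus[2], toy_corpus, kind="H")
p_pool = conventional_pool(toy_corpus[2], toy_corpus, kind="P")
```

For the third toy work the H-pool contains both earlier works while the P-pool contains only the one by the same composer, so the P-pool is always a subset of the H-pool.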

We show in Figs. 3(A) and (B) the cumulative distributions of the H- and P-novelties of the piano works in our data for each period. Of the four periods, the Classical compositions tend to score low in both novelties, showing that many past conventions were reused both historically and psychologically (see Fig. S2 for the H- and P-novelty scores of the pieces over time). The novelties of the composers (given by the averages over their works), denoted \(\mathrm {N}_{H}\) and \(\mathrm {N}_{P}\), are shown in Figs. 3(C) and (D). We note that the confidence in the high H-novelty of the Baroque composers should be low, since their conventional pool is much smaller than that of the other periods. The raised H-novelty of the Romantic composers, on the other hand, should be considered more impressive since it is achieved against the largest conventional pool. Their high level of P-novelty shows that Romantic composers also actively introduced diverse and new codeword transitions throughout their careers. This is in clear agreement with the widely accepted thesis that credits Romantic composers with having broken many accepted musical conventions and diligently conducted personal experimentation with new combinations of pitches [33]. The H- and P-novelties are generally positively correlated throughout, with Spearman correlations of \(0.820 \pm0.013\) for the compositions and \(0.827 \pm0.113\) for the composers, meaning that pursuing novelty involved deviating from both others and oneself (Fig. 4). The most notable outlier from this trend is Muzio Clementi (1752–1832), whose H-novelty is significantly lower than what his P-novelty would suggest, as shown in Fig. 4(B). This means that while he produced works distinct from his own earlier works (even more so than Handel, Mozart, and Haydn, and on par with Beethoven), they would as a whole sound conventional when compared with other composers’ works. This may be a quantitative corroboration of the common assessment of Clementi that in his time his reputation rivaled Haydn’s, but languished for much of the 19th century and beyond [34]: the diversity of codeword transitions that he employed in his works (reflected in the high P-novelty) could have been the source of his high reputation during his lifetime, but as time passed his works failed to distinguish themselves from others’ (reflected in the low H-novelty), causing his loss in stature.
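For readers unfamiliar with the Spearman correlation quoted above, the following is a minimal pure-Python sketch of the statistic: the Pearson correlation of the ranks, with tied values given their average rank. It does not reproduce how the quoted uncertainties were obtained, which is not described here, and the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter, the statistic measures whether works with higher H-novelty also tend to have higher P-novelty, regardless of the exact shape of the relationship.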

Figure 3 The H-novelty (left) and P-novelty (right) of the piano compositions (top) and the composers (bottom). (A) The cumulative distribution of the H-novelty scores \(\nu _{H}\) of the works. The median and mean values are \((4.80, 4.78)\) for the Baroque, \((4.38, 4.40)\) for the Classical, \((4.73, 4.729)\) for the Transition, and \((4.82, 4.78)\) for the Romantic periods. (B) The cumulative distribution of the P-novelty scores \(\nu _{P}\) of the works. The median and mean values are \((4.90, 4.86)\) for the Baroque, \((4.69, 4.66)\) for the Classical, \((4.88, 4.87)\) for the Transition, and \((4.97, 4.94)\) for the Romantic periods. (C) & (D) The novelties \(\mathrm {N}_{H}\) and \(\mathrm {N}_{P}\) of the composers (defined as the means of \(\nu _{H}\) and \(\nu _{P}\) of their works). A composer’s position on the x-axis (year) is the midpoint between his birth and death years. One should note that the conventional pool of elements is smaller for Baroque composers, which could skew the H-novelty to look higher. However, the P-novelty does not suffer from this limitation, and Bach and Scarlatti still have high P-novelty values.

Figure 4 (A) Scatter plot of the H- and P-novelty scores of the piano works, with a high level of correlation (Spearman correlation \(0.82 \pm0.01\)). (B) The H- and P-novelty scores of composers (Spearman correlation \(0.83 \pm0.11\)). A notable outlier is Muzio Clementi, who shows a markedly low H-novelty given his P-novelty.

Influence and shifts in dominant styles

While novel achievements are indispensable for the progress and growth of a creative enterprise, our results above suggest that novelty alone would not cause one to be considered ‘the greatest’: Beethoven, for instance, stands among the lower half in computed novelty. This is in line with many recent research findings that a creative work’s impact on its posterity does not depend solely on the degree of its novelty, and that how it builds on tradition is also important [4, 7, 8]. Musical composition would be no exception: Past works exert influence on the future not only by serving as training material for new composers, but also by inspiring new works or lending themselves to be tweaked and transformed into new original works [1, 10]. Even mimicry or imitation, normally associated with subpar works lacking in originality and artistic value, can sometimes occur in renowned masters’ works and gain recognition: Franz Liszt, a leading Romantic-era composer, admired Beethoven so much that in a famous act of homage he transcribed Beethoven’s complete symphony cycle for the piano [35], a work now considered a significant and influential achievement in its own right. These observations suggest that a sensible definition of the ‘influence’ of a work would be the degree to which it has been referenced by later works, as in Eq. (4).

To compute \(\eta _{\omega }(\zeta)\) of Eq. (4), the influence of composer ω on ζ, we start by rewriting \(z(\gamma _{i}\to \gamma _{j})\), the number of \(\gamma _{i}\to \gamma _{j}\) transitions in Ω, in Eq. (6) as

$$\begin{aligned} z(\gamma _{i}\to \gamma _{j}) = z_{\omega }( \gamma _{i}\to \gamma _{j})+z_{ \overline{\omega }}( \gamma _{i}\to \gamma _{j}), \end{aligned}$$ (8)

where \(z_{ \omega }\) is the number of instances of the transition used by ω, and \(z_{\overline{\omega }}\) is that by all the other composers before ζ. Then \(\varPi _{ \varOmega }(\zeta)\) becomes

$$\begin{aligned} \varPi _{ \varOmega }(\zeta) &= \frac{ (z_{\omega }(\gamma _{1})+z_{ \overline{\omega }}(\gamma _{1})+1 )}{\sum_{\gamma \in\varGamma} (z(\gamma )+1 )} \times\frac{ (z_{\omega }(\gamma _{1}\to \gamma _{2})+z_{ \overline{\omega }}(\gamma _{1}\to \gamma _{2})+1 )}{\sum_{\gamma \in \varGamma} (z(\gamma _{1}\to \gamma )+1 )} \times\cdots. \end{aligned}$$ (9)

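Eliminating all \(z_{\omega }\) terms in the numerators, i.e. removing the contribution of ω to the transition counts while keeping the normalisation unchanged, we obtain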

$$\begin{aligned} \varPi_{\overline{\omega }}(\zeta) &= \frac{ (z_{ \overline{\omega }}(\gamma _{1})+1 )}{\sum_{\gamma \in\varGamma} (z(\gamma )+1 )} \times \frac{ (z_{\overline{\omega }}(\gamma _{1}\to \gamma _{2})+1 )}{\sum_{\gamma \in\varGamma} (z(\gamma _{1}\to \gamma )+1 )}\times\cdots . \end{aligned}$$ (10)
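The two generation probabilities of Eqs. (9) and (10) can be computed side by side, as in the sketch below. It assumes \(\alpha_{0}=1\), an explicitly supplied codeword space `vocabulary`, natural logarithms, and a pool given as `(composer, sequence)` pairs of non-empty sequences; the function name `log_generation_probs` and the toy data are hypothetical. How the two quantities are combined into the influence η is defined by Eq. (4), which is not reproduced in this section, so the sketch stops at the two log-probabilities.

```python
import math
from collections import Counter


def log_generation_probs(work, pool, omega, vocabulary):
    """Return (log Pi_Omega, log Pi_without_omega) for `work`.

    `pool` is a list of (composer, codeword sequence) pairs preceding the
    work. Only the numerators drop omega's counts, as in Eq. (10).
    """
    size = len(vocabulary)
    first_all, first_rest = Counter(), Counter()
    trans_all, trans_rest, out_all = Counter(), Counter(), Counter()
    for composer, seq in pool:
        first_all[seq[0]] += 1
        out_all.update(seq[:-1])
        trans_all.update(zip(seq, seq[1:]))
        if composer != omega:
            first_rest[seq[0]] += 1
            trans_rest.update(zip(seq, seq[1:]))

    # Denominators use the full counts z in both Eq. (9) and Eq. (10).
    first_den = sum(first_all.values()) + size
    log_full = math.log((first_all[work[0]] + 1) / first_den)
    log_rest = math.log((first_rest[work[0]] + 1) / first_den)
    for a, b in zip(work, work[1:]):
        den = out_all[a] + size
        log_full += math.log((trans_all[(a, b)] + 1) / den)
        log_rest += math.log((trans_rest[(a, b)] + 1) / den)
    return log_full, log_rest


toy_pool = [("X", ["a", "b", "a"]), ("Y", ["a", "c"])]
log_full, log_rest = log_generation_probs(["a", "b"], toy_pool, "X", {"a", "b", "c"})
```

Since removing ω's counts can only shrink the numerators, the second value never exceeds the first, and the gap between them grows with how much of the work's material was previously used by ω.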

After computing the influences \(\{\eta\}\) between all 7298 eligible composer–composition pairs (self-influences were excluded), we plot in Fig. 5 each composer’s mean influence on the works created at any given time t (±10 years for smoother curves). During the Baroque period (Fig. 5(B)) Handel is the most influential, indicating that his codeword transitions were often also used at a later time by his contemporaries Bach and Scarlatti, whereas the opposite did not occur as frequently. More interesting patterns can be found when we observe the rise and fall of the composers’ influences over time. Since a high influence means that later works share common elements, we can interpret such rise and fall of composers’ influence as indicating shifts in compositional style, and as providing a quantitative justification for the distinct period labels. Let us examine, as a start, the Baroque and the Classical periods in Figs. 5(B) and (C). While Handel maintains his dominant influence until around the mid-Classical period, we identify two notable patterns: First, Scarlatti overtakes Bach in influence shortly before the Classical period, in agreement with the well-acknowledged significance of Scarlatti for the Classical period [36]; second, Haydn and Mozart emerge during the Classical period with a high influence, soon rivaling Handel’s. Similar dynamics, namely the clear rise and emergence of a new leading influential figure and therefore a dominant ‘style’, reminiscent of Kuhn’s so-called paradigm shift [2], are observed in subsequent periods. The Classical-to-Romantic transitional period (Fig. 5(D)) is characterised by the emergence of Beethoven, whose historical significance [37] is clearly shown. Beethoven’s high influence in this period shows his younger contemporaries adopting his codewords more willingly than any other predecessor’s (Figs. S6(C) and (D)), a tendency that continues well into the Romantic period. Also, from Eq. (4), we see that being referenced by a highly novel composer leads to high influence, as high novelty means referencing uncommon elements, and so the one referenced is credited with more influence. This is likely why Beethoven, referenced by highly novel Romantic composers, has a high influence score. Then, through a similar mechanism, during the Romantic period new composers such as Schubert, Chopin, and Liszt rise in influence to rival or overtake Mozart and Beethoven (Fig. 5(E)), befitting their reputation for finally eclipsing those “classical sounds” and establishing much of the essential repertoire now permanently associated with the piano [37].
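The smoothed influence curves described above amount to a windowed average. A minimal sketch follows, assuming that a composer's influences are available as `(composition_year, influence)` pairs and that the ±10-year window is inclusive at both ends; both are assumptions, and the function name and the toy numbers are hypothetical.

```python
def mean_influence_at(records, year, half_window=10):
    """Mean influence of one composer on works composed within
    `year` +/- `half_window`; None if no work falls in the window."""
    values = [eta for y, eta in records if abs(y - year) <= half_window]
    return sum(values) / len(values) if values else None


# Toy influences of one composer on three later works (made-up values).
toy_records = [(1800, 0.2), (1805, 0.4), (1830, 1.0)]
curve = {t: mean_influence_at(toy_records, t) for t in (1802, 1825, 1900)}
```

Evaluating the function on a grid of years gives one curve per composer; years with no works in the window are returned as `None` here so that they can be skipped when plotting.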