As is the case with frequency, the ranks of words change from year to year, reflecting their relative prevalence in usage (Cocho et al., 2015). Fig. 1a and c show the evolution of the rank as a function of time for two groups of semantically related words. The selected words occupy a wide range of rank positions. While, for instance, the word king fluctuates over relatively low ranks (corresponding to high frequencies), duchess always occupies a rank higher than 1,000. A similar situation is found for the food-related terms, where the ranks of food and chicken differ from each other by some thousands. As for their time variations, although there is some correspondence in the positions of local maxima and minima within each group, correlations over longer time spans are generally weak.

Figure 1 Time evolution of ranks. (a, c) Variations of the logarithm of the rank over three centuries for two groups of related words. (b, d) Yearly differences in the logarithm of the rank of the same words, reflecting oscillatory variations in word prevalence.

However, as shown in Fig. 1b and d, a coherent oscillatory pattern is revealed when we look at the logarithmic rank variation ρ_i(t), with different curves in each group following closely similar behaviour. All the words in each group show a remarkably consistent common pattern, in which they systematically increase their popularity over certain intervals and decline in the intervening years. In the following, we quantify this observation at levels ranging from individual words to semantically related groups.
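As an illustration of the quantity plotted in Fig. 1b and d, the following minimal sketch computes yearly differences of the logarithm of the rank from a table of yearly frequencies. The function name, the array layout (years in rows, words in columns), the natural logarithm and the sign convention (later year minus earlier year) are assumptions made for the example; the precise definition used in the analysis is the one given in Methods. In particular, the sketch ranks words only among the columns supplied, whereas in practice ranks would be taken over the full vocabulary.

```python
import numpy as np

def log_rank_variation(freq):
    """Yearly differences of log rank (hypothetical helper).

    freq: array of shape (n_years, n_words) holding yearly frequencies.
    Returns an array of shape (n_years - 1, n_words).
    """
    freq = np.asarray(freq, dtype=float)
    # Rank 1 is the most frequent word of the year; ties follow column order.
    order = np.argsort(-freq, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(freq.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, freq.shape[1] + 1)
    # Assumed convention: log rank of a year minus log rank of the previous year.
    return np.diff(np.log(ranks), axis=0)

# Toy data: the first two words swap places between the two years.
toy_freq = [[5.0, 3.0, 1.0],
            [3.0, 5.0, 1.0]]
toy_rho = log_rank_variation(toy_freq)
```

With this convention a positive value means that the word moved to a higher rank number, that is, it became relatively less prevalent.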

Periods and phases of oscillations

To characterize the periodicity of the oscillations we first estimated the periods of the individual words by means of a wavelet analysis of their respective ρ_i time series. Figure 2 shows two examples of the procedure we used to obtain the periods for every word in the core vocabulary. Briefly, we computed a scalogram and then obtained the set of local extrema. Each of these extrema can be associated with a pseudo-period, from which the histogram of periods is then computed (see Methods for details).
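The procedure just outlined (scalogram, local extrema, pseudo-periods, histogram) can be sketched as follows. This is only an illustration under stated assumptions: the mother wavelet is taken to be a Ricker (Mexican-hat) wavelet, a scale a is converted to a pseudo-period with the factor 2π/√2.5 that is standard for that wavelet, and extrema are detected by comparison with the eight neighbours in the scale-time plane. The wavelet, the scale grid and the extremum criterion actually used are those described in Methods, and all function names below are hypothetical.

```python
import numpy as np

# Assumed scale-to-period conversion for a Ricker wavelet of width a.
RICKER_PERIOD_FACTOR = 2.0 * np.pi / np.sqrt(2.5)

def ricker(n_points, a):
    """Ricker wavelet of width a, sampled on n_points centred samples."""
    t = np.arange(n_points) - (n_points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def scalogram(x, scales):
    """Wavelet coefficients, one row per scale and one column per time step."""
    x = np.asarray(x, dtype=float)
    max_odd = len(x) if len(x) % 2 else len(x) - 1
    W = np.empty((len(scales), len(x)))
    for k, a in enumerate(scales):
        # Odd kernel length of about 8 widths on each side, capped by the data.
        n = min(2 * int(8 * a) + 1, max_odd)
        W[k] = np.convolve(x, ricker(n, a), mode="same")
    return W

def local_extrema(W):
    """Indices of interior points strictly above or below all 8 neighbours."""
    S, T = W.shape
    c = W[1:-1, 1:-1]
    is_max = np.ones(c.shape, dtype=bool)
    is_min = np.ones(c.shape, dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            nb = W[1 + di:S - 1 + di, 1 + dj:T - 1 + dj]
            is_max &= c > nb
            is_min &= c < nb
    rows, cols = np.nonzero(is_max | is_min)
    return rows + 1, cols + 1

def pseudo_periods(x, scales):
    """Pseudo-period associated with every local extremum of the scalogram."""
    rows, _ = local_extrema(scalogram(x, scales))
    return RICKER_PERIOD_FACTOR * np.asarray(scales, dtype=float)[rows]

# Synthetic check: a pure oscillation with a 14-year period.
years = np.arange(300)
signal = np.cos(2.0 * np.pi * years / 14.0)
periods = pseudo_periods(signal, np.arange(1.0, 15.25, 0.25))
counts, edges = np.histogram(periods, bins=np.arange(0, 65, 5))
```

Because the series is zero-padded at both ends, a few extrema near the boundaries may appear at unrelated scales; for real data one would normally pool the extrema of many words, as in the histograms of periods, so that such edge effects carry little weight.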

Figure 2 Example scalograms showing local extrema of wavelet coefficients. Yellow and blue shades indicate positive and negative values, respectively. Dots indicate the positions of the local maxima and minima from which the period is estimated as explained in Methods. (a) Scalogram for king; (b) scalogram for food.

Figure 3 shows the resulting distribution of periods, for the whole time range (Fig. 3a) and broken down by century (Fig. 3b to d). Fig. 3a shows a narrow peak at around 14 years with a small kink close to 50 years. In the panels corresponding to the individual centuries, it is apparent that oscillatory modes with longer periods increase in importance from century to century. In particular, the kink around 50 years is clearly a contribution of the 20th century. The effect can also be noticed in the individual time series of Fig. 1b and d as a tendency of the oscillations to slow down towards the present.

Figure 3 Period of oscillations. Histograms showing the distribution of oscillation periods obtained from the wavelet analysis explained in the text, (a) for all years since 1700; (b) for the 18th century; (c) for the 19th century; (d) for the 20th century.

In addition to the period of the signals, another aspect of the timing structure of the oscillations is given by their phase. While the data in Fig. 1b and d show that the two groups of words exhibit similar oscillation periods, their phases differ. For instance, towards the year 1900 the first group is changing downwards while the second group follows the opposite trend.

The study of phase relationships across the whole core vocabulary reveals two independent modulations affecting the phase of oscillations. Figure 4a shows the time evolution of all 5,360 nouns arranged in a matrix-like structure following a random order. Yellow and blue shades respectively indicate high (positive) and low (negative) values of ρ_i(t). The vertical strips of predominantly yellow or blue tones make apparent a global modulation in the phase, affecting all words more or less uniformly. There are specific time ranges in which the words in the core vocabulary move preferentially towards higher ranks, while in other intervals they tend to move down. These events signal major shifts in the overall lexicon: the fact that all nouns in the core vocabulary move towards higher ranks means that other nouns, which are not part of the core, become temporarily more important. It is striking that these events occur repeatedly with effects all across the core nouns. The curve on top of the matrix is the average of ρ_i(t) over the core vocabulary, representing the mean modulation of variations in its usage.

Figure 4 Phase relationships. (a) Top: Average time evolution of ρ_i for the 5,360 nouns in the core vocabulary. Bottom: Individual evolution of ρ_i for the same words, arranged in random order. Yellow and blue shades indicate high (positive) and low (negative) values. (b) Left: Individual evolution of ρ_i after word reordering according to the results of a hierarchical clustering algorithm. Right: Tree structure corresponding to the first few levels of clustering.

Clusters and networks of nouns

To analyse relationships in the time evolution that are more dependent on specific words, the mean modulation over the core vocabulary was subtracted from the time evolution of all words. The resulting time series were then grouped by means of a hierarchical clustering algorithm (see Methods for details) leading to a hierarchical tree structure, where closer topological distance across the dendrogram means greater similarity in the time series of the words. The first levels of the resulting tree are depicted in Fig. 4b together with the corresponding reordering of the time series. The most remarkable feature in the reordered dataset is the presence of numerous word groups that share specific phase relationship patterns over time. For many of the words, the time evolution is similar to the general trend observed in Fig. 4a, while others exhibit very different behaviours.
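The two steps described above, subtraction of the mean modulation followed by hierarchical clustering, can be sketched with SciPy as follows. The linkage method ("average") and the dissimilarity (one minus the Pearson correlation) are assumptions chosen for the example, since the actual choices are specified in Methods; the function name and the layout of the input (words in rows, years in columns) are likewise hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage

def cluster_words(rho, n_clusters=2):
    """Remove the mean modulation and cluster the residual time series.

    rho: array of shape (n_words, n_years).
    Returns the residual series, the dendrogram leaf order and flat labels.
    """
    rho = np.asarray(rho, dtype=float)
    # Mean modulation: average over words at each time step.
    residual = rho - rho.mean(axis=0, keepdims=True)
    # Assumed settings: average linkage on correlation distance.
    Z = linkage(residual, method="average", metric="correlation")
    order = leaves_list(Z)  # row order that places similar words side by side
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return residual, order, labels

# Toy data: two groups of three words oscillating out of phase, plus noise.
rng = np.random.default_rng(0)
years = np.arange(100)
group_a = np.sin(2.0 * np.pi * years / 14.0)
group_b = np.cos(2.0 * np.pi * years / 14.0)
toy = np.vstack([group_a] * 3 + [group_b] * 3)
toy = toy + 0.05 * rng.standard_normal(toy.shape)
residual, order, labels = cluster_words(toy, n_clusters=2)
```

Reordering the rows as `residual[order]` gives the kind of arrangement shown in the left panel of Fig. 4b, where words with similar time evolution appear as contiguous bands.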

By inspecting the sequence of words in the ordering given by the clustering, it is apparent that most of the structure in the dataset is given by groups of semantically related nouns. Table 1 shows a few examples of contiguous groups extracted from the ordering produced by clustering (Fig. 4b). The clear semantic relationship between the words in each group emphasizes the close connection between meaning and changes of relative prevalence in the vocabulary.

Table 1 Semantic affinity of clustered nouns

While clustering highlights the existence of well-defined groups of words with similar time behaviour (and, additionally, with close semantic relationships), capturing more complex structure within each group requires characterizing detailed relations between word pairs. In Fig. 5a we show the correlation matrix for the time evolution of all the words in the core vocabulary (see Methods). The ordering of indices in the matrix is the same as that resulting from clustering. As a consequence, most positive correlations (yellow shades) are distributed along the diagonal. However, the still very significant off-diagonal structure suggests that a description in terms of network topology would be more appropriate to represent the relationships between words. To extract a network from the correlation matrix we introduce a threshold θ. Two words i and j will be connected in the network if their correlation C(i, j) is larger than or equal to the threshold.
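The thresholding rule can be illustrated with a short sketch. It assumes that C(i, j) is the Pearson correlation between the time series of words i and j (the exact definition is given in Methods) and that self-links are excluded; the function name and the toy data are hypothetical.

```python
import numpy as np

def threshold_network(series, theta):
    """Correlation matrix and adjacency matrix of the thresholded network.

    series: array of shape (n_words, n_years).
    Words i and j are linked when C[i, j] >= theta.
    """
    C = np.corrcoef(np.asarray(series, dtype=float))
    A = C >= theta
    np.fill_diagonal(A, False)  # assumed: no self-links
    return C, A

# Toy data: the first two series rise together, the third one falls.
toy_series = np.array([[1, 2, 3, 4, 5],
                       [2, 4, 6, 8, 11],
                       [5, 4, 3, 2, 1]], dtype=float)
C, A = threshold_network(toy_series, theta=0.65)
degrees = A.sum(axis=1)  # number of links of each word
```

Here only the first two words end up connected, and the third remains isolated because its correlation with the others is negative.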

Figure 5 Noun network structure. (a) Correlation matrix for all words in the core vocabulary. The ordering of indices is the same as that resulting from clustering (see Fig. 4b). Yellow and blue shades indicate high (positive) and low (negative) values. (b) Fraction of nodes in the largest network component as a function of the correlation threshold θ used to build the network. At the transition, around the critical value θ*=0.65, the network breaks down into a large number of small components. (c) Network degree distribution at the critical value of the threshold. The approximate power-law profile, with slope s≈−2, is a signature of a scale-free network. (d) Diagram of the largest network component at the critical threshold.

For a given value of θ the resulting network is generally not connected, but instead consists of a number of mutually disconnected components. Figure 5b shows the fraction of nodes (that is, words) in the largest component as a function of the threshold. As expected, for sufficiently small values of θ the largest component is comparable in size to the total network, containing a fraction of nodes equal to or very close to one. As the threshold grows, however, there is a narrow range around θ≈0.65 where the fraction of nodes in the largest component drops rather abruptly, indicating that the network splits into a large number of small components. We verified that at the critical threshold θ*=0.65 the degree distribution of the network, shown in Fig. 5c, approximately follows a power law, suggesting scale-free structure (Barabási and Albert, 1999). Figure 5d shows a diagram of the largest connected component at θ*, comprising 2,670 nodes.
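The curve of Fig. 5b can be reproduced in outline by sweeping the threshold and measuring the size of the largest connected component at each value. The sketch below does this with a plain depth-first search on a given correlation matrix; the function name and the small matrix are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def largest_component_fraction(C, theta):
    """Fraction of nodes in the largest component of the network C >= theta."""
    C = np.asarray(C, dtype=float)
    n = C.shape[0]
    A = C >= theta
    np.fill_diagonal(A, False)
    seen = np.zeros(n, dtype=bool)
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over the component containing `start`.
        seen[start] = True
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in np.nonzero(A[u] & ~seen)[0]:
                seen[v] = True
                stack.append(v)
        best = max(best, size)
    return best / n

# Toy correlation matrix for four words.
toy_C = np.array([[1.0, 0.9, 0.7, 0.1],
                  [0.9, 1.0, 0.2, 0.1],
                  [0.7, 0.2, 1.0, 0.1],
                  [0.1, 0.1, 0.1, 1.0]])
thetas = [0.0, 0.65, 0.8, 0.95]
fractions = [largest_component_fraction(toy_C, th) for th in thetas]
```

For a matrix of several thousand words the same loop applies unchanged, and the threshold at which the fraction drops abruptly can be read off the resulting curve.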

The largest component of the noun network also presents small-world features (Watts and Strogatz, 1998). At the critical threshold, its mean topological distance (diameter) is 6.77, while its mean clustering coefficient is 0.29. These values have to be compared with those obtained for a random graph. We have found that, in the noun network, the largest component has a diameter only 50% larger than the corresponding random graph while, on the other hand, the clustering coefficient is 100 times that of the random counterpart. These values indicate a small-world structure, with both locally strong connectivity and long-range connections joining distant parts of the network. Networks that share these features may have a structure where different parts of the network are naturally segregated into communities. Within each community, nodes are preferentially connected to other members of the same community, with only a few links directed towards other communities. To test this possibility on the noun network, we extracted its communities using a modularity optimization algorithm (Clauset et al., 2004; Newman, 2006) (see Methods) as implemented in Mathematica (Wolfram Research, 2016).
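The two quantities compared above can be computed as in the following sketch, which evaluates the mean shortest-path length and the mean clustering coefficient of a graph stored as a dictionary of neighbour sets. The random reference graph is assumed here to be a uniformly random graph with the same numbers of nodes and links; the text does not state which null model was used, so this choice, like the function names, is only an assumption for illustration.

```python
import random
from collections import deque

def mean_distance(adj):
    """Mean shortest-path length over pairs of mutually reachable nodes."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")

def mean_clustering(adj):
    """Average local clustering coefficient (nodes of degree < 2 count as 0)."""
    coeffs = []
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Each link between two neighbours of u is seen twice in this sum.
        links = sum(1 for v in nbrs for w in adj[v] if w in nbrs) / 2
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

def random_graph(n, m, seed=0):
    """Assumed null model: n nodes and m distinct links placed at random."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    placed = 0
    while placed < m:  # requires m <= n * (n - 1) / 2
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            placed += 1
    return adj

# Toy example: a ring of six nodes and a random graph of the same size.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
reference = random_graph(6, 6, seed=1)
ring_stats = (mean_distance(ring), mean_clustering(ring))
reference_stats = (mean_distance(reference), mean_clustering(reference))
```

A small-world network is then one whose mean distance stays close to that of the reference graph while its clustering coefficient is much larger.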

Figures 6 and 7 show two examples of the communities obtained from the noun network. To reveal more structure within each community, we proceeded to further divide them into sub-communities, by simply iterating once more the same algorithm. The community shown in Fig. 6 exhibits a strong thematic link, consisting almost exclusively of nouns relating to Ancient Rome. Interestingly, the further subdivision into sub-communities shows another layer of structure implicit in the correlations between words. In particular, the sub-community in which nodes are represented by red disks has a clear pre-Imperial flavour, while the sub-network that consists of yellow disks is strongly Imperial. As the panels including the time series for the members of those two sub-communities show, oscillations have distinct temporal features, similar for words within the same sub-community and different across sub-communities. The community shown in Fig. 7, in turn, is thematically linked with astronomy. As noted above, further subdivision into sub-communities clearly shows that time evolution is strongly correlated for words with a strong semantic relationship.
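To make the notion of modularity optimization more concrete, the sketch below evaluates the standard modularity score Q of a given partition, which is the quantity such algorithms seek to maximize, and extracts the subgraph induced by one community so that the same procedure can be applied to it again. It does not implement the optimization itself, which in this work was carried out with the Mathematica implementation; the function names and the toy graph are hypothetical.

```python
def modularity(adj, communities):
    """Modularity Q of a partition of an undirected, unweighted graph.

    adj: dict mapping each node to the set of its neighbours.
    communities: iterable of disjoint node collections covering the graph.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of links
    q = 0.0
    for community in communities:
        members = set(community)
        # Links with both ends inside the community (each is seen twice).
        internal = sum(1 for u in members for v in adj[u] if v in members) / 2
        degree = sum(len(adj[u]) for u in members)
        q += internal / m - (degree / (2 * m)) ** 2
    return q

def induced_subgraph(adj, nodes):
    """Restriction of the graph to `nodes`, used to look for sub-communities."""
    nodes = set(nodes)
    return {u: adj[u] & nodes for u in nodes}

# Toy graph: two triangles joined by a single link between nodes 2 and 3.
toy_graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
q_split = modularity(toy_graph, [{0, 1, 2}, {3, 4, 5}])
q_whole = modularity(toy_graph, [{0, 1, 2, 3, 4, 5}])
first_triangle = induced_subgraph(toy_graph, {0, 1, 2})
```

Splitting the toy graph into its two triangles gives a higher score than keeping it whole, which is why an optimizer would prefer that partition.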

Figure 6 Example of a community in the noun network. In this case, it contains words related to Ancient Rome. The two largest sub-communities (sc1 and sc2) are readily associated with different Roman historical periods. The lower panels show the evolution of ρ_i for words in different sub-communities.

Figure 7 Example of a community in the noun network. As in Fig. 6, but for words related to astronomy.

The focus of our analysis has been on the noun class of words, since it represents the most semantically informative group of words. However, we have also checked that similar oscillations are found among verbs, albeit with typically smaller amplitudes compared to those found for nouns. Figure 8 shows histograms obtained from pooling all the values of ρ_i(t) for all nouns and verbs in each dataset.

Figure 8 Distribution of instantaneous amplitudes for the pooled word time-series. (a) Nouns; (b) verbs.

We verified that oscillatory patterns similar to those found for English are also observed for nouns in French, German, Italian, Russian and Spanish. For these languages, however, the analysis was limited to the last two centuries because of their scarce representation in the Google database during the 1700s. The respective distributions of oscillation periods, in particular, are strikingly coincident with each other (see Supplementary Information).