CORRELATIONS OF CONTINUOUS RANDOM DATA WITH MAJOR WORLD EVENTS

R. D. Nelson (a), D. I. Radin (b), R. Shoup (c), P. A. Bancel (d)

(a) Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey, 08544, USA. E-mail: rdnelson@princeton.edu
(b) Institute of Noetic Sciences, Petaluma, California, 94952
(c) Boundary Institute, Los Altos, California, 94024
(d) 108, rue St Maur, Paris, France F-75011

Received 18 July 2002; revised 4 October 2002

The interaction of consciousness and physical systems is most often discussed in theoretical terms, usually with reference to the epistemological and ontological challenges of quantum theory. Less well known is a growing literature reporting experiments that examine the mind-matter relationship empirically. Here we describe data from a global network of physical random number generators that shows unexpected structure apparently associated with major world events. Arbitrary samples from the continuous, four-year data archive meet rigorous criteria for randomness, but pre-specified samples corresponding to events of broad regional or global importance show significant departures of distribution parameters from expectation. These deviations also correlate with a quantitative index of daily news intensity. Focused analyses of data recorded on September 11, 2001, show departures from random expectation in several statistics. Contextual analyses indicate that these cannot be attributed to identifiable physical interactions and may be attributable to some unidentified interaction associated with human consciousness.

Key words: physical random systems, correlations in random data, quantum randomness, consciousness, global events, statistical anomalies.

Fig. 1. Locations of host sites for the RNG nodes. Distribution is opportunistic because the network depends on voluntary collaboration.

1. INTRODUCTION

Quantum indeterminate electronic random number generators (RNG) are designed to produce random sequences with near maximal entropy [1, 2]. Yet under certain circumstances such devices have shown surprising departures from theoretical expectations. There are controversial but substantial claims that measurable deviations of statistical distribution parameters may be correlated, for unknown reasons, with conditions of importance to humans [3, 4]. To address this putative correlation in a rigorous way, a long-term, international collaboration was instituted to collect standardized data continuously from a globally distributed array of RNGs [5]. First deployed in August 1998, the network uses independent physical random sources designed for serial computer interfacing [6] and employs secure data-collection and networking software. Data from the remote devices are collected through the Internet and stored in an archival database. The geographic locations of the 50 host sites comprising the network as of late 2002 are shown in Fig. 1. The data archive, continuously updated, is freely accessible through the Internet and can be investigated for correlations with data from many disciplines: earth sciences, meteorology, astronomy, economics, and other metrics of natural or human activity.

In addition to making the database available to the scientific community, the collaboration maintains ongoing experiments to test the conjecture that deviations in the random data may correlate in some way with human activity. The primary experiment measures deviations in the variance of the network output during brief, predesignated examination periods corresponding to collective human events of major importance. After nearly four years of operation, we find that, whereas the data overall meet standard criteria for randomness and device stability [6], the data corresponding to the specified periods tend to exhibit anomalous deviations from expectation. As of June 2002, the statistical significance of the cumulative experiment, comprising over 100 replications of a protocol testing the conjectural hypothesis, attains a level of five standard deviations.

This apparent correlation of the output of true random sources with a socially defined variable remains unexplained. However, in view of the increasing significance of this cumulative statistic, we feel it is appropriate at this point to present a description of methods and results and to invite comment and independent analysis. In this report we summarize the overall results and provide a detailed assessment of the data recorded on September 11, 2001, which constitute a particularly well-defined case study. The results show that substantial deviations from chance expectation in several statistical parameters are present in the archive for this day.

2. METHOD

The RNG devices employed in the network rely on quantum level processes as the source of random events. All but two of the units are based on quantum tunneling in solid-state junctions. Those not based on tunneling use classical Johnson noise in resistors. The devices are designed for professional research and pass an extensive array of standard tests for randomness, based on calibration samples of a million 200-bit trials. They are well shielded from electromagnetic (EM) fields, and all utilize stable components and sophisticated circuit designs to protect against environmental influences and component interaction or aging. In addition, the raw bit sequence is subjected to a logical exclusive-or (XOR) against a pattern of an equal number of 0 and 1 bits to guarantee an unbiased mean output. Data bits are collected by custom software on the local host into 200-bit trial sums at the rate of one trial per second.
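The effect of the XOR stage on the trial sums can be illustrated with a small simulation. This is only a sketch under stated assumptions: the text says the mask has an equal number of 0 and 1 bits but does not give its pattern, so an alternating 0/1 mask is assumed here, and the function names and the simulated biased source are hypothetical.

```python
import random

TRIAL_BITS = 200  # bits per trial, as stated in the text


def xor_debias(bits):
    """XOR each raw bit with an alternating 0/1 mask (assumed pattern)."""
    return [b ^ (i % 2) for i, b in enumerate(bits)]


def trial_sum(bits):
    """Sum of the masked bits for one trial."""
    return sum(xor_debias(bits))


def simulate_trials(n_trials, p_one=0.5, seed=0):
    """Simulate trial sums from a source that emits 1 with probability p_one."""
    rng = random.Random(seed)
    return [
        trial_sum([1 if rng.random() < p_one else 0 for _ in range(TRIAL_BITS)])
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    # Even a source biased toward 1s yields a mean trial sum near 100,
    # because the mask inverts exactly half of the bits.
    sums = simulate_trials(2000, p_one=0.6, seed=1)
    print(sum(sums) / len(sums))
```

Note that the mask removes a first-order bias in the mean only; it does not by itself restore the theoretical variance of a biased source.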
The trial sums theoretically distribute as a binomial (200, 1/2) distribution (mean = 100, variance = 50). The software records the trial sums into time-stamped files on the local disk, with computer clocks at most nodes synchronized to standard Internet timeservers. Data packets with identification and timing information and a checksum are assembled and transmitted over the Internet to a server in Princeton, NJ, for archiving in daily files that contain the data for each node for each second, registered in coordinated universal time (UTC).

A standardized analysis protocol is used to examine the random data during periods corresponding to global events such as the disaster on September 11, 2001 [7, 8]. Fully specified examination periods are entered into a prediction registry, accessible through the Internet [9]. These periods comprise collectively about 1% of the full database. Roughly two-thirds of the registry entries pertain to scheduled events and are registered before the event occurs; all are entered before the archive is examined.

Our analyses employ standard techniques of classical statistics, which can be found in introductory statistics texts [10]. The analysis proceeds by converting the raw RNG trial sums to a standard normal deviate, or z-score, as $z = (\mathrm{trial\ sum} - 100)/\sqrt{50}$. The unit of the z-score is the standard deviation of the normal distribution, σ = 1. A composite network Z-score representing the signed departure of the composite mean of all nodes for each second is computed as $Z = \sum z/\sqrt{N}$, where the sum runs over the N devices reporting for each second. From this, a network variance cumulative is computed as the sum of $(Z^2 - 1)$ over seconds. Similarly, a device variance cumulative is computed as the sum of $(z^2 - 1)$ over all devices per second. Note that these quantities are defined with respect to the theoretical trial sum variance of 50. Substituting an empirical trial sum variance yields essentially identical results.

The standard protocol is to compare the variance of the network output, $Z^2$, with its theoretical expectation for the designated periods. Large excursions of this measure reflect an excess of correlated deviation among the nodes if, for example, the independent RNG devices are subjected to a common influence that changes their output distributions. This analysis tests for accumulating positive excess of the network variance, and 93% of the entries in the prediction registry are of this sort. A smaller number of formal examinations address the device variance, which has become a more useful statistic as the number of online devices in the network has grown. It shows changes in the absolute magnitude of the deviations of individual devices from expectation.
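The statistics defined above can be sketched in a few lines. This is a minimal illustration, not the project's analysis software: it assumes the composite Z is the sum of the per-device z-scores divided by the square root of N, and the function names and the list-of-lists data layout (one inner list of trial sums per second) are hypothetical.

```python
import math

MEAN, VARIANCE = 100, 50  # theoretical binomial(200, 1/2) parameters


def trial_z(trial_sum):
    """z-score of one 200-bit trial sum."""
    return (trial_sum - MEAN) / math.sqrt(VARIANCE)


def network_z(trial_sums):
    """Composite Z for one second: sum of device z-scores over sqrt(N)."""
    zs = [trial_z(s) for s in trial_sums]
    return sum(zs) / math.sqrt(len(zs))


def network_variance_cumdev(seconds):
    """Running sum of (Z^2 - 1), one value per second."""
    total, trace = 0.0, []
    for sums in seconds:
        total += network_z(sums) ** 2 - 1
        trace.append(total)
    return trace


def device_variance_cumdev(seconds):
    """Running sum of (z^2 - 1) over all devices reporting each second."""
    total, trace = 0.0, []
    for sums in seconds:
        total += sum(trial_z(s) ** 2 - 1 for s in sums)
        trace.append(total)
    return trace


if __name__ == "__main__":
    # Two devices, two seconds: no deviation, then a shared +10 deviation.
    data = [[100, 100], [110, 110]]
    print(network_variance_cumdev(data))
    print(device_variance_cumdev(data))
```

In the toy data, a deviation shared by both devices raises the network measure more than the device measure, which is the sense in which the network variance responds to correlated deviation among nodes.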
The $Z^2$ measure and the device variance calculation are both additive chi-squared statistics [10], which permits the aggregation of results from all the individual cases into a single composite statistic.

3. RESULTS

3.1. Composite Distribution

The aggregated outcome of the formal analyses for 109 registry entries as of June 2002 yields a statistic with a probability value on the order of $10^{-7}$, and a corresponding equivalent z-score of five normal deviations (5 σ). Importantly, this aggregate result is not due to outliers or a few extraordinary replications, as is demonstrated by translating the individual experimental results to their equivalent z-scores. The z-scores for the 109 experiments distribute smoothly, and are well approximated by a Gaussian fit (mean = 0.53±0.1, standard deviation = 1.10). Truncating the distribution by successively removing the highest and lowest scores yields mean values within one standard error of the full distribution value. The smoothness of the distribution implies that the aggregate result is due to a small but consistent accumulation of statistical weight under application of the experimental protocol.

While the continuing experiment will allow further testing of the experimental hypothesis, the distributed character of the current result suggests ad hoc approaches to examining the database. In particular, if the conjectural hypothesis is valid, independent designations of global events other than the experimental prediction registry should also show correlations with the database. A preliminary analysis of this type is presented below. A further implication of a distributed effect is that data segments immediately neighboring the registered examination periods may be expected to show statistical deviations. A full analysis is beyond the scope of this paper. However, as a concrete example of this approach, we present several post hoc analyses of the data for September 11, 2001, which stands out as one of the most important entries in the registry in terms of global impact socially and psychologically. Our strategy is to compare the data surrounding registered examination periods against appropriate control samples from the full database. Note that, while we provide probability estimates for the analyses as a guide to the reader, these do not imply explicit hypothesis tests.

3.2. Network Variance

On September 11 the global network of 37 online RNGs displayed strong deviations in several statistics. The registered prediction for this event designated an examination period from 08:35 to 12:45 (EDT local time). The network variance cumulative for this period attains a modest p-value of 0.028 (equivalent z-score of 1.9 standard deviations).
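The "equivalent z-score" quoted with each probability can be reproduced by inverting the upper tail of the standard normal distribution. The sketch below assumes one-tailed p-values; the helper names are hypothetical, and the large-sample normal approximation for a chi-squared sum is a simplification introduced here for illustration, not necessarily the computation used for the reported values.

```python
from statistics import NormalDist


def p_to_z(p):
    """Equivalent z-score for a one-tailed (upper) probability p."""
    return -NormalDist().inv_cdf(p)


def chisq_sum_to_z(chisq_sum, dof):
    """Normal approximation to a chi-squared sum; reasonable for large dof."""
    return (chisq_sum - dof) / (2.0 * dof) ** 0.5


if __name__ == "__main__":
    print(round(p_to_z(0.028), 2))   # registered September 11 period
    print(round(p_to_z(2.7e-7), 2))  # composite result for 109 entries
```

Under the same approximation, a sum of $Z^2$ over T seconds has expectation T and variance 2T, so `chisq_sum_to_z(total, T)` gives a rough z-score for an examination period of T seconds.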
However, a trend exhibiting substantial excess in the network variance measure began early in the morning and continued for more than two days, until roughly noon on September 13. A cumulative deviation (cumdev) plot, such as is used in process control engineering to identify changes in monitored parameters, shows a notable departure from the expectation for this statistic, which is a horizontal random walk (Fig. 2). The trend beginning at the time of the World Trade Center attack and maximizing 51 hours later is statistically unlikely, as shown by iterative resampling analysis. Deviations with this slope and duration occur with probability 0.012 in random resampling from the 11 days of data shown in Fig. 2. This analysis underestimates the strength of the deviation, however, because of the substantial average bias of this 11-day segment. Relative to theoretical expectation, the corresponding estimate is p = 0.002. Resampling from a fiducial 400-day segment of the database from December 1, 2000 to January 4, 2002, during which the number of online devices is comparable to that on September 11, yields an estimate of p = 0.003. Applied to a control database with algorithmically generated pseudo-random data [11], the cumulative $Z^2$ analysis shows no unusual trend, ruling out possible artifacts in the analysis procedure itself. This extended positive excursion of the September 11 network variance suggests that the database may indeed contain correlations lying outside the explicitly specified examination periods.

Fig. 2. Cumulative deviation of network variance for each second, September 5 through 15, 2001. Day boundaries are in Eastern Daylight Time (EDT). The terrorist attacks are marked with a pair of vertical dotted lines at 08:45 and 10:30 on September 11. A segment of a parabolic envelope of 5% probability with its origin at the level of the random walk at 08:45 (horizontal dotted line) provides scale for the persistent trend.

3.3. Inter-Node Correlation

To test more comprehensively for correlated deviations among the independent nodes in the network, we have examined the average daily node-to-node correlations within the 400-day database. The correlations were calculated for data sequences with a similar time scale to that of the experimental examination periods, typically several hours in length. The squared z-score sequences for each device were low-pass filtered and all possible (Pearson) correlations among RNG pairs were calculated for each day's data (00:00 to 24:00, EDT) in the fiducial database. Each pair correlation was transformed to a Fisher Z-score [10] and a mean daily value for the correlations was calculated. These values give a measure of the average strength of correlations between node pairs, on the time-scale of the smoothing window, on any given day in the 400-day period. The result shows that the daily mean for September 11 deviates markedly from the mean of the distribution for all days (t-score = 3.81, p = 0.00009, filter window = 2 hours) and that this positive correlation is the largest occurring in the dataset. The result is robust against changes in the size of the filter window over a range of one to eight hours. It is appropriate to correct the probability value for the freedom allowed in choosing the starting point of the consecutive 24-hour data periods. With all possible starting periods considered, the Bonferroni-corrected [10] p-value is approximately 0.00024, corresponding to an equivalent z-score of 3.5.

3.4. Device Variance

The exceptional nature of the September 11 data is more sharply defined in the variance across the independent RNG devices. Figure 3 is a cumdev plot of the device variance relative to its empirical mean, showing a broad peak centered on the time of the September 11 events. Beginning at about 05:00 EDT the second-by-second variance increases sharply and the cumulative deviation continues to rise until about 11:00, when the variance shifts to values below expectation in a trend that persists until about 18:00. A bootstrap permutation analysis, reordering the actual data, yields an estimate of p = 0.0009 (z-score = 3.1 σ) for the peak absolute excursion, based on 10,000 iterations. A calculation relative to theoretical expectation for this excursion occurring in a 24-hour period gives a similar value of p = 0.0007. In contrast, for the September 11 pseudo-random control data [11] the corresponding estimate is p = 0.756, reflecting chance behavior (the small negative peak visible near the time of the attacks is well within the range of expected fluctuations).

A daily measure of variance excursions, closely related to the measure shown in Fig. 3, shows that September 11, 2001 has the greatest deviation from expectation of any day in the database from August 1998 to June 2002.
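For the peak-excursion estimate quoted earlier in this subsection (p = 0.0009 from 10,000 reorderings), a permutation analysis of this general kind can be sketched as follows. The exact procedure is not specified in the text, so this is an assumed, simplified version: the cumulative deviation is taken relative to the empirical mean, the statistic is the peak absolute excursion, and the function names, iteration count, and toy data are hypothetical.

```python
import random


def peak_abs_excursion(values):
    """Peak |cumulative deviation| of values about their empirical mean."""
    mean = sum(values) / len(values)
    total = peak = 0.0
    for v in values:
        total += v - mean
        peak = max(peak, abs(total))
    return peak


def permutation_p_value(values, iterations=1000, seed=0):
    """Fraction of random reorderings whose peak excursion is at least as
    large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = peak_abs_excursion(values)
    shuffled = list(values)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        if peak_abs_excursion(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (iterations + 1)


if __name__ == "__main__":
    # Toy series: sustained excess followed by sustained deficit.
    series = [1.0] * 50 + [-1.0] * 50
    print(peak_abs_excursion(series), permutation_p_value(series, 500))
```

Reordering preserves the distribution of values while destroying their time order, so a small p-value indicates that the observed excursion depends on persistent trends rather than on the values alone.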
For this measure we apply a 6-hour low-pass filter to the raw variance to capture the effect of the long monotonic trends seen in the cumulative deviation figure. The result on September 11, 2001 shows a change of 6.5 σ over a period of about 7 hours. The same procedure applied uniformly to the full database shows that September 11, 2001 is unique among the 1336 days of collected data (p = 0.0007, or 3.18 σ).

Fig. 3. Cumulative deviation of device variance across RNG nodes, relative to the empirical mean value, for each second on September 11, 2001. The truly random data from the RNG array are contrasted with a pseudo-random control dataset computed for the same array of data elements processed with the same algorithms. Axis labeling is in EDT. Times of the terrorist attacks are indicated with boxes on the zero line.

3.5. Autocorrelation

We can assess the unusual behavior of the variance from another perspective by calculating its autocorrelation, which is sensitive to the details of the trends over time that define the shape of the curve. The autocorrelation function for the September 11 data shows a substantial response driven by extended monotonic excursions in the raw device variance. As shown in Fig. 4, the autocorrelation cumdev displays a rapid increase up to a lag time of about one hour, with a more modest rising slope continuing for up to two hours of lag. Inspection of the previous plot (Fig. 3) confirms that the strong, persisting deviations in the data occur on a timescale of one to two hours. An indication of the likelihood of several such excursions occurring on one day, as happened on September 11, is given by the magnitude of the cumulative deviation of the autocorrelation: the trace in Fig. 4 repeatedly penetrates a 0.0005 probability threshold. For comparison, the figure includes traces showing the same computations for 60 surrounding days, nearly all of which remain within a 5% probability envelope. Computations for the fiducial 400 days, considering all possible starting points for the consecutive 24-hour blocks over which the autocorrelation is calculated, show no other days with trends outside a 1% envelope. Applying a Bonferroni correction for the selection of starting points, a p-value of 0.001 (3.1 σ) can be assigned to the autocorrelation on September 11.

Fig. 4. Cumulative deviation of autocorrelation of the device variance. The cumulative sum is shown as a function of lag time for September 11, 2001, contrasted with the same calculation for 60 surrounding days. The autocorrelations were calculated for 24 hour EDT days over lags of up to four hours.

3.6. News Correlation

We observe by inspection that world events noted in the prediction registry tend to occur on days with significantly higher average pair correlations among the RNGs. To assess this relationship quantitatively, an objective metric was constructed based on an independent daily assessment of newsworthy events by a professional news service not associated with this project [12]. The count of letters used in the daily summaries of news items was taken to represent the news "intensity". Over the one-year period from Dec. 2000 through Nov. 2001, this measure, though diffuse, is correlated with daily mean pair correlations of the RNG data at r = 0.15, t(362 df) = 2.94, p = 0.002, z-score = 2.9 [13]. We note that this statistic is independent of the selection of events in the prediction registry, but fully consistent with the results for the pre-specified analyses since it correlates a measure of the importance of world events with deviations in the database.

3.7. Source Distribution

To characterize the source distribution of the deviations, we examined data from individual RNGs, as well as subsets of RNGs designated by location or by random assignment. The departures from expectation are distributed generally across the independent devices and there are no significant outliers that dominate the statistics. A complexity measure used to reduce dimensionality in multi-channel neurophysiological data [14] was computed with the RNG devices treated as channels. This measure shows that a parameter closely related to variance at the device level is by far the largest contributor in a standard three-parameter representation. In the time domain, our analyses of the 400-day database indicate that, in contrast to the result for longer lag times, the data exhibit no significant autocorrelations on a time scale of seconds for either the network or device variances, or for individual RNGs. This indicates that the observed anomalies are not driven by short time characteristics of the RNG electronics.

4. DISCUSSION

In summary, we find evidence for a small, but replicable effect on data from a global network of random generators that is correlated with designated periods of intense collective human activity or engagement, but not with any physical sources of influence. The 109 experimental replications as of June 2002 distribute normally, but have a shifted mean z-score of 0.53, representing a five σ departure from expectation. We attribute this result to a real correlation that should be detectable in future replications as well as in analyses using correlates independent from the project prediction registry.

The random generator network has been conceived to produce stable output under a variety of conditions and it is unlikely that environmental factors could cause the correlations we observe. Conventional mechanisms might be sought in environmental factors such as interactions due to major changes in the electrical supply grid, surges in mobile phone and telecommunications activity, or unusual coherence in radio and television broadcasting, all of which may accompany periods of intense regional or global attention. However, the instrument design includes physical shielding of the RNG devices from EM interference, and at all nodes the data pass through a logic stage that eliminates first-order biasing from electromagnetic, environmental, or other physical causes.
The devices are distributed around the world, so their separation from sites of physical disturbance varies greatly (for example, the mean distance of the RNGs from New York is 6400 km), yet the effects described here are broadly distributed across the network. These logical and empirical indications are confirmed by analytical results. Time series analysis based on 365 days of RNG outputs registered at local time shows that there is no correspondence with expected diurnal variations in the power grid or other known cyclic patterning (p = 0.30) [13].

Barring demonstration of a conventional interaction that can affect the random generators on a global scale, we are obliged to confront the possibility that the measured correlations may be directly associated with some aspect of consciousness attendant to global events. In particular, this evidence, if confirmed, would support the idea that some processes in nature that have been assumed to be fundamentally random are in fact somewhat mutable. If the present understanding of quantum randomness is called into question, there are profound theoretical and practical implications [15, 16, 17]. However, there needs to be significant replication and extension of our results before these novel theoretical positions can be seriously considered. Although some progress can be made to elucidate the form that an explanatory theory might take [18, 19], it clearly must be guided by further experimentation and deeper examination of the data in hand.

The post hoc analyses presented here indicate possible extensions of this research. For example, the September 11 results imply that there is a correlation between the intensity or impact of an event and the strength of deviations present in the data. The September 11 event is arguably the most extreme in the database in terms of its social, psychological, emotional, and global impact. As the analysis has shown, it also exhibits the largest and most consistent deviations in the database on the statistical measures we have investigated. It will be important to develop strategies to test this conjecture over the full set of replications and in newly acquired data.

The September 11 analysis also suggests that the effect detected in the formal replications is distributed over the database and is not isolated to the prediction periods. The statistical significance of these excursions is limited to roughly three normal deviations. Thus, as isolated, post hoc analyses, none of these individually would be sufficient to conclude a causal or other direct link between the September 11 events and the measured deviations. In light of the formal result, however, these analyses do suggest that independent metrics spanning the database and consistent with the experimental hypothesis may reveal other correlations with our statistical measures. This suggestion is supported by the news index analysis in which deviations in the RNG data correlate with an objective measure of news intensity.
It is likely that more sophisticated metrics with optimized statistical power could provide independent verification of the results generated by the ongoing experiment as well as the capability to probe secondary correlates in the data.

Our findings are summarized in Table I, which includes an indication of their likelihood in the context of the comparison standards used. To identify the source of the deviations we must account for excess inter-node correlations, persistent changes in composite variance, and long-term autocorrelations, all indicating significant alteration in the informational entropy of the data array. Although the aggregate result attains a level of five normal deviations, significant by any standard, extensive further replication is needed before proposals of a causal or otherwise direct link between human consciousness and the output of the network generators can be convincingly advanced.

We present this work as an invitation to other researchers to examine the data in a broad-based search for better understanding. The observations reported here are unexplained and may seem to defy conventional modeling, but the evidence is sufficiently compelling to justify further investigation. More detailed analyses of the accumulating database are proceeding. We would be grateful for access to other continuously recorded, nominally random data sequences that can be examined for correlations similar to those reported here.

Table I. Summary of statistical measures

Measure                        Probability    Comparison standard
Network variance, 9/11         0.003          Resampling: 400 days
Device variance peak, 9/11     0.0009         Permutation: control p = 0.756
Autocorrelation, 9/11          0.001          400 control days: p > 0.01
Inter-node correlation, 9/11   0.0002         Student t: 400 days
News intensity correlation     0.002          Student t: 365 days
Diurnal variation              0.30           Time series: 365 days
Composite Chi-square           2.7 × 10^-7    109 replications to June 2002

REFERENCES

1. C. Vincent, "The generation of truly random binary numbers," J. Phys. E 3, 594-598 (1970).
2. H. Schmidt, "Quantum-mechanical random-number generator," J. Appl. Phys. 41, 462-468 (1970).
3. D. I. Radin and R. D. Nelson, "Evidence for consciousness-related anomalies in random physical systems," Found. Phys. 19, 1499-1514 (1989).
4. R. G. Jahn, B. J. Dunne, R. D. Nelson, Y. H. Dobyns, and G. J. Bradish, "Correlations of random binary sequences with pre-stated operator intentions: A review of a 12-year program," J. Scient. Expl. 11, 345-367 (1997).
5. R. D. Nelson, "The Global Consciousness Project" (Web publication: Technical documentation and data access; http://noosphere.princeton.edu, 1998-present).
6. Princeton Engineering Anomalies Research, "Construction and Use of Random Event Generators in Human/Machine Anomalies Experiments" (PEAR Tech. Rep. 98002, Princeton University, Princeton, NJ, 1998).
7. R. D. Nelson, "Correlation of global events with REG data: An Internet-based, nonlocal anomalies experiment," J. Parapsych. 65, 247-271 (2001).
8. R. D. Nelson, "Coherent Consciousness and Reduced Randomness: Correlations on September 11, 2001," J. Scient. Expl., in press (2002).