Large magnitude earthquakes in urban environments continue to kill and injure tens to hundreds of thousands of people, inflicting lasting societal and economic disasters. Earthquake early warning (EEW) provides seconds to minutes of warning, allowing people to move to safe zones and automated slowdown and shutdown of transit and other machinery. The handful of EEW systems operating around the world use traditional seismic and geodetic networks that exist only in a few nations. Smartphones are much more prevalent than traditional networks and contain accelerometers that can also be used to detect earthquakes. We report on the development of a new type of seismic system, MyShake, that harnesses personal/private smartphone sensors to collect data and analyze earthquakes. We show that smartphones can record magnitude 5 earthquakes at distances of 10 km or less and develop an on-phone detection capability to separate earthquakes from other everyday shakes. Our proof-of-concept system then collects earthquake data at a central site where a network detection algorithm confirms that an earthquake is under way and estimates the location and magnitude in real time. This information can then be used to issue an alert of forthcoming ground shaking. MyShake could be used to enhance EEW in regions with traditional networks and could provide the only EEW capability in regions without. In addition, the seismic waveforms recorded could be used to deliver rapid microseism maps, study impacts on buildings, and possibly image shallow earth structure and earthquake rupture kinematics.

The MyShake network builds on initial work at the University of California (UC), Berkeley that determined the quality of the accelerometers in smartphones ( 16 ). We have extended this work to develop an Android-based application that runs efficiently on the user’s smartphone and detects whether the movement of a phone is likely caused by an earthquake or by other human activities. The application sends this information back to our processing center where a network detection algorithm confirms that an earthquake is under way. The location, origin time, and magnitude of the earthquake are then determined on the basis of multiple triggers from the network of phones. This information can be used to estimate the shaking intensity and the remaining time until damaging waves arrive at a target location. Here, we detail (i) size and proximity requirements for earthquake signals to be recorded by smartphones, (ii) development of our on-phone detection capability to distinguish earthquakes from other shakes, and (iii) the design of a network detection algorithm to operate at the processing center to confirm when an earthquake is under way, and to locate and characterize it. This has been achieved within the real-world constraints of building an Android application that runs in the background on private phones without draining power.

MyShake builds on other crowdsourcing projects in seismology. The Quake-Catcher Network (QCN) and Community Seismic Network (CSN) primarily use low-cost microelectromechanical system (MEMS) accelerometers that plug into computers and can be installed in buildings to detect earthquakes ( 8 , 9 ). These networks consist of a few hundred to a few thousand accelerometers and are limited by the need to pass hardware from the network operators to the users. By using the sensors in smartphones, we only need to pass software from the network operators to users, which is relatively simple using the Google Play and iTunes stores. The CSN also explored the use of smartphone accelerometers by asking whether newly incoming data are similar to previously defined human activities. If not, the data are treated as an anomaly and communicated to a processing center where a picking algorithm determines whether the data represent an earthquake ( 10 , 11 ). Our MyShake design is different in that we use past earthquake information to develop a classifier algorithm to identify earthquake shaking on a single phone and then communicate with a centralized processing center (CPC). Previous work has also demonstrated that the Global Positioning System (GPS) sensors on smartphones (rather than the accelerometer) can be used to detect earthquakes and potentially provide a warning ( 12 ). To date, this has been shown to be possible on dedicated smartphones but not on personal smartphones. Another crowdsourcing project uses Twitter to detect earthquakes: a tweet-frequency time series is constructed from tweets containing the word “earthquake” in various languages, and an algorithm is applied to it to identify possible earthquakes ( 13 ). Finally, the U.S. Geological Survey “Did You Feel It” system is a web-based approach for collecting reports of shaking and damage as experienced by individuals. The reports are converted into intensity and used to generate detailed shaking intensity maps when enough people report ( 14 , 15 ). The intensity estimate relies on subjective descriptions by the reporter. By using smartphone sensors, MyShake uses the power of crowdsourcing while also reporting shaking time series and accurate locations.

Large magnitude earthquakes in densely populated regions do not occur very frequently, but they can kill tens to hundreds of thousands of people, injure many more, and cause substantial financial loss ( 1 ). Earthquake early warning (EEW) systems can detect the location and magnitude of an earthquake in a few seconds and issue a warning to the target area before the damaging waves arrive ( 2 , 3 ). This new technology can reduce the fatalities, injuries, and damage caused by earthquakes by alerting people to take cover, slowing and stopping trains, opening elevator doors, and many other applications ( 4 ). The development of EEW to date has largely focused on the use of traditional seismic and geodetic networks, which exist only in a handful of countries around the world ( 5 ). Smartphones are much more prevalent and have a variety of built-in sensors and communications. There were 2.6 billion smartphones worldwide in 2014, and this number is expected to pass 6 billion by 2020 ( 6 ). Here, we report on the development of MyShake, a crowdsourcing project ( 7 ), to harness the accelerometers in personal smartphones to record earthquake-shaking data for research, hazard information, and EEW.

RESULTS

To better understand which earthquakes we can record on smartphones, we determined the noise floor (17) of the accelerometers on multiple Android phones by placing them in a basement and allowing them to record for 1 month. The noise floor of the phones contains the internal noise of the phone itself plus other environmental sources in a quiet basement. Once this level is known, we can assess how large an earthquake must be for its ground-shaking amplitude to exceed the noise. Figure 1 compares the noise floor of the test phones to the amplitude of shaking for various magnitude earthquakes at 10 km (18). All phones are sensitive to the shaking for magnitude 5 (M5) or larger earthquakes at 10 km or less from the phone in the frequency range of 1 to 10 Hz, and all are capable of recording the longer periods of larger magnitude events. There is a gradual improvement in the sensor capabilities with the release date of the phone (see the color change from cold to warm). The more recent phone models are sensitive to shaking for M3.5 at 10 Hz. The in-phone accelerometers can therefore record shaking for the earthquakes that do damage in the frequency range that causes most damage (1 to 10 Hz). Also, we expect the quality of the sensors in phones to improve further with time. The HP MEMS accelerometer (blue; Fig. 1) was recently developed for seismic imaging applications (19). It is currently too expensive for inclusion in smartphones but illustrates that MEMS sensors can have similar capabilities to more traditional strong motion sensors [Berkeley Seismological Laboratory (BKS) station; Fig. 1].

Fig. 1 Noise floor of the phones. Noise floors of the smartphones color coded by the phone release date (also shown in the legend as MM/YY). Dashed black lines are typical ground motion amplitudes of earthquakes 10 km from the epicenter for various magnitudes. Noise floor for high-quality MEMS sensor (HP MEMS, blue) and a typical force-balance accelerometer from a regional network (BKS in northern California, purple) are also shown.

Next, we determined how well phones can record the true shaking during an earthquake. Both the quality of the sensor and how well the phone is coupled to the ground play key roles here. To answer this question, we put multiple phones on shake tables, some bolted to the table and others not bolted down and able to slide freely. Our results confirm previous findings (16, 20) that phones bolted to shake tables are capable of recording ground motion accurately between 0.5 and 10 Hz. We also tested phones placed freely on the shake table because personal phones are not bolted to the ground. Figure 2 shows a three-dimensional (3D) shake table test with a peak acceleration of 0.5g. The phone under test moved only minimally relative to the table. The waveforms of the phone and the reference accelerometer are very similar, and the frequency response of the phone acceleration is good from 0.5 to 10 Hz. In a 1D shake table test with a sweep signal (gradually increasing amplitude and frequency), we found that it was not until the horizontal accelerations reached a specific threshold, in this case ~0.3g and above ~3 Hz, that we started to see sliding. Sliding had the effect of clipping the peak amplitudes, although the frequency content remained similar (Fig. 3). This is a limitation of the data recorded, and we must recognize that recorded amplitudes are lower bounds on the actual value.

Fig. 2 3D shake table test. The input seismogram is from a real earthquake that has been modified for IEEE-693-2005 tests. (A) Waveform comparison between phone (blue) and reference accelerometer (red) recordings from an input signal that has peak acceleration of 0.5g. (B) Spectrum comparison of Y components. The X and Y components are in the plane of the phone, which is lying flat on the horizontal shake table and is not attached. The Z component is perpendicular to the plane of the phone and is vertical for this test.

Fig. 3 Shake table test with an input sweep signal (0.5 to 7 Hz). (A) Waveform comparison between a phone fixed on the table (blue), a phone placed freely on the table (black), and the reference accelerometer attached to the table (red). (B) Frequency domain comparison of the signals in (A). (C) Calculated correlation coefficient and RMS (root mean square) ratio between the signals recorded by the phone placed freely on the shake table and the reference accelerometer. The correlation coefficient is a measure of the phase match, and RMS is a measure for amplitudes match. We used a 1-Hz frequency band to filter the record and calculate the coefficient with a step frequency of 0.1 Hz. The x axis is the center frequency of the frequency band. The correlation coefficient shows how well the phase is recorded by the phone, and the RMS ratio shows the amplitude recovery. Above 2 to 3 Hz, the phone starts to slide, so the full amplitude is not recovered; however, the phase is recovered up to 7 to 8 Hz.
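The two comparison measures in Fig. 3C can be illustrated with a minimal sketch. It makes several assumptions that are not from the study: the sliding phone is modeled as simple amplitude clipping at 0.3g, the input is a single 5-Hz sine rather than a sweep, and the 1-Hz band-pass filtering step is omitted. The function names are hypothetical.

```python
import math

def correlation_coefficient(a, b):
    """Pearson correlation between two equal-length records (phase match)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def rms(record):
    """Root mean square amplitude of a record."""
    return math.sqrt(sum(v * v for v in record) / len(record))

def rms_ratio(phone, reference):
    """Ratio of phone RMS to reference RMS (amplitude recovery)."""
    return rms(phone) / rms(reference)

# Synthetic example: 2 s of a 5-Hz sine with 0.5 g peak, sampled at 100 Hz.
dt = 0.01
reference = [0.5 * math.sin(2 * math.pi * 5 * i * dt) for i in range(200)]
# Assumed sliding model: the free phone cannot follow accelerations above 0.3 g.
sliding_phone = [max(-0.3, min(0.3, v)) for v in reference]

cc = correlation_coefficient(sliding_phone, reference)
ratio = rms_ratio(sliding_phone, reference)
print(f"correlation = {cc:.2f}, RMS ratio = {ratio:.2f}")
```

In this toy case the correlation stays close to 1 while the RMS ratio drops well below 1, which is the qualitative pattern described for the sliding phone: phase is preserved but amplitude is underestimated.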

Given that a smartphone can record earthquake shaking, the key challenge for a smartphone network using private/personal phones is the ability of installed phone applications to separate earthquake shaking from everyday motion of the phone. Figure 4A shows 12 hours of three-component acceleration data that were recorded on a smartphone. The figure shows both human activities and the M6.0 Napa earthquake on 24 August 2014 (21) at the very end of the waveform. Figure 4B shows a zoomed-in view of the accelerations associated with the Napa earthquake recorded on the same phone.

To develop an algorithm to separate earthquake shaking from human activities, we first developed an application for Android smartphones to trigger on significant motions and send the data to a CPC. It has been designed for distribution to personal/private phones and has a trigger algorithm that runs in the background, continuously monitoring the accelerometer. It uploads parameters and data to our CPC when triggered. Functionality at the CPC allows us to (i) monitor and change the operational parameters on the users’ phones, (ii) collect heartbeat and state-of-health information from the phones, (iii) collect autonomous phone-trigger information, (iv) trigger phones from CPC to record data, and (v) upload waveform data for autonomous and CPC triggers. A small release of MyShake in November 2014 deployed the application on 75 phones (fig. S1). A key requirement for a successful crowdsourcing application is minimizing the impact on users; for phones, this means minimizing power usage. The MyShake application currently uses about the same power that a smartphone uses when it is on but is not being used. For most users, a phone running MyShake does not need to be charged more than once every 24 hours.

Using the data collected, we developed an artificial neural network (ANN) approach to identify the different characteristics of earthquake and human motions (see the Supplementary Materials for details). The algorithm assesses 2-s windows of data and determines if the motion is likely an earthquake or not. We first train our algorithm with data from three sources: everyday motion recordings uploaded to our CPC from the MyShake release as described above, phone recordings of earthquakes from shake table tests, and seismic data from traditional networks in Japan that were modified to reproduce smartphone-quality records, which are described in the Supplementary Materials. We tested a total of 18 characteristics and identified the three best features: the interquartile range of the acceleration vector sum (IQR), the maximum zero crossing rate (ZC), and the cumulative absolute velocity of the acceleration vector sum (CAV). IQR is an amplitude parameter that shows the middle 50% range of amplitude of the movement. ZC is a simple frequency measure that counts the number of times that the signal crosses baseline zero. CAV is a cumulative measure of amplitude on the three components in the time window and is determined as follows:

$$\mathrm{CAV} = \int_{0}^{T} \left| \alpha(t) \right| \, dt \qquad (1)$$

where α(t) is the vector sum of the three-component acceleration and T is the length of the time window (2 s here).
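The three features can be sketched for a single window as follows. This is an illustration under stated assumptions, not the on-phone implementation: CAV is taken as the time integral of the absolute vector-sum acceleration over the window (the standard definition), approximated with a rectangle rule; "maximum" ZC is interpreted as the largest zero-crossing count among the three components; the input is assumed to have gravity and any offset already removed; and all function names are hypothetical.

```python
import math

def vector_sum(x, y, z):
    """Amplitude of the three-component acceleration at each sample."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]

def interquartile_range(values):
    """75th minus 25th percentile, with linear interpolation."""
    s = sorted(values)

    def quantile(q):
        pos = q * (len(s) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    return quantile(0.75) - quantile(0.25)

def zero_crossings(component):
    """Number of sign changes about zero (exact zeros are skipped)."""
    signs = [v > 0 for v in component if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def cumulative_absolute_velocity(amplitude, dt):
    """Rectangle-rule approximation of the integral of |a(t)| dt."""
    return sum(abs(a) for a in amplitude) * dt

def window_features(x, y, z, dt):
    """IQR, ZC, and CAV for one window of three-component data."""
    amplitude = vector_sum(x, y, z)
    return {
        "iqr": interquartile_range(amplitude),
        # Assumption: take the largest count among the three components.
        "zc": max(zero_crossings(c) for c in (x, y, z)),
        "cav": cumulative_absolute_velocity(amplitude, dt),
    }

# Synthetic 2-s window at an assumed 50-Hz sampling rate: a 5-Hz sine on one axis.
dt = 0.02
x = [math.sin(2 * math.pi * 5 * i * dt + 0.3) for i in range(100)]
y = [0.0] * 100
z = [0.0] * 100
features = window_features(x, y, z, dt)
print(features)
```

In a classifier such as the ANN described above, a feature vector like this would typically be scaled before use; the scaling and the network itself are outside this sketch.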

Figure 4C shows how IQR (a measure of amplitude) and ZC (a measure of frequency) separate earthquakes from non-earthquake motions. Earthquake records have relatively high frequencies and moderate amplitudes, whereas everyday motions have either lower frequencies and high amplitudes or high frequencies and very low amplitudes. IQR and ZC are the two best parameters for separating earthquakes from ordinary motion, but adding CAV provides some additional information that helps improve performance (Fig. 4D).

Fig. 4 Earthquake recorded by phone and classifying earthquakes. (A) Example of 12-hour three-component acceleration record from a private/personal Samsung Galaxy S4 phone starting at 4:00 p.m. (23 August 2014). It shows the accelerations of everyday human motions for the first ~8 hours, then appears stationary during the night. The red box at the end of the figure highlights the time window of (B). (B) One minute of data from the period shown in (A) at the time of the M6 Napa earthquake 38 km from the phone. The earthquake occurred at 3:20:44 a.m. local time. (C) Scaled feature plot showing IQR versus ZC for the classifier training data set. The blue dots are the centroids of human activities, and the red dots are the earthquake features. (D) 3D plot of the three features we used to distinguish earthquakes. Adding the CAV to IQR and ZC drags some of the human activities (blue dots) to the third dimension but not the earthquake data, which helps improve the results. EW, east-west; NS, north-south; UD, up-down.

The trained ANN algorithm is then applied to U.S. earthquake data modified to phone-quality records as well as to a separate set of everyday motion data (Table 1). Ninety-eight percent of the earthquake records within 10 km of the events are recognized as earthquakes; the success rate reduces with increasing distance and decreasing magnitude as expected. Ninety-three percent of the everyday motions are correctly recognized, meaning that for an operational system, we expect ~7% of phone triggers to be false (earthquake) triggers.

Table 1 Performance of the ANN algorithm. Performance of classifiers when applied to earthquake and non-earthquake data not used to train the ANN algorithm. In the case of earthquake data, the percentage of records that were correctly classified as earthquakes is shown along with the number of records (in parentheses) for various earthquakes recorded within various distances of the epicenter. For the everyday human activity data, the percentage correctly identified as non-earthquake and falsely identified as earthquakes is shown.

The final component of our system is a network detection algorithm running at the CPC to confirm when an earthquake is under way and to estimate source parameters from multiple triggered phones in a region. When a phone determines that it is recording an earthquake, two types of data are passed to the CPC: (i) the trigger information including trigger time, phone location, and the maximum amplitude of the three components and (ii) the waveform data that contain three-component acceleration from 1 min before the trigger to 4 min after. The trigger information is easier to upload rapidly via cellular or Wi-Fi networks and is what we use for real-time processing. The waveform data are currently uploaded with a lower priority and only uploaded when the phones are connected to Wi-Fi and power.

Our first-generation network detection algorithm is based on current EEW ElarmS-2 methodologies (22). It searches for a temporal and spatial cluster of triggers and requires greater than 60% of operating active phones to have triggered within a 10-km radius region for an event to be declared (see the Supplementary Materials for details). Once an event is created, the algorithm will continue to update the origin time, location, and magnitude of the earthquake based on the continuous flow of trigger information. Currently, the origin time is set to the earliest trigger time, and the centroid of all the triggered phones within 10 km of the phone trigger is used as the epicenter. Our first-generation magnitude estimation is based on expected ground-shaking amplitude as a function of distance. We use the peak ground acceleration (PGA) and the distance of the station to estimate the magnitude using a regression relation based on the earthquake data from Japan that were modified to reproduce smartphone-quality records (Eq. 2), where PGA is the maximum absolute amplitude from the three-component acceleration, and distance is the epicentral distance derived from the phone location and estimated location of the earthquake. Figure 5 compares estimated magnitude and the real magnitude for both individual phones (blue dots) and the average event estimates (red pluses). Most of the estimated magnitudes are within 1 magnitude unit for individual phones, and all average event estimates are within 1 magnitude unit. When the network consists of many more phones, we might expect the uncertainty in the magnitude to be reduced. However, we must also recognize that phone-based amplitude estimates must be treated as lower bounds given the possibility of decoupling. Given these uncertainties, it is clear that having even a single observation from a traditional seismic station could make a significant difference, providing some “ground truth” to the magnitude estimate.
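The event-declaration rule described above (more than 60% of active phones within 10 km triggered, origin time set to the earliest trigger, epicenter set to the centroid of triggered phones) can be sketched as follows. This is an illustrative simplification, not the ElarmS-2-based algorithm itself: the `Phone` class, the 20-s association window (`window_s`), and the minimum phone count (`min_phones`) are hypothetical, and the centroid is a plain average of latitude and longitude, which is only reasonable over short distances. Magnitude estimation is omitted.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

EARTH_RADIUS_KM = 6371.0

@dataclass
class Phone:
    lat: float
    lon: float
    trigger_time: Optional[float] = None  # seconds; None if the phone has not triggered

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def declare_event(phones: List[Phone], radius_km=10.0, min_fraction=0.6,
                  window_s=20.0, min_phones=4) -> Optional[Tuple[float, float, float]]:
    """Return (origin_time, lat, lon) if a trigger cluster is found, else None."""
    for anchor in phones:
        if anchor.trigger_time is None:
            continue
        # Active phones within the search radius of this trigger.
        nearby = [p for p in phones
                  if distance_km(anchor.lat, anchor.lon, p.lat, p.lon) <= radius_km]
        # Of those, the ones that triggered close in time to the anchor.
        triggered = [p for p in nearby
                     if p.trigger_time is not None
                     and abs(p.trigger_time - anchor.trigger_time) <= window_s]
        if len(nearby) >= min_phones and len(triggered) > min_fraction * len(nearby):
            origin_time = min(p.trigger_time for p in triggered)  # earliest trigger
            lat = sum(p.lat for p in triggered) / len(triggered)  # centroid
            lon = sum(p.lon for p in triggered) / len(triggered)
            return origin_time, lat, lon
    return None

# Five phones about 1 km apart; four trigger within about a second of each other.
phones = [
    Phone(37.87, -122.26, 10.0),
    Phone(37.88, -122.26, 10.5),
    Phone(37.87, -122.25, 11.0),
    Phone(37.86, -122.27, 11.2),
    Phone(37.875, -122.265, None),
]
print(declare_event(phones))
```

Requiring a fraction of the active phones, rather than a fixed count, is what lets isolated false triggers from everyday motion pass without creating an event.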

Fig. 5 Estimated magnitude. Comparison of our estimated magnitudes with the real magnitude for earthquakes in Japan using phone-like data. The green line is the 1:1 line, and the two gray lines are the 1 magnitude unit shift. Each blue point is the magnitude estimate at a single simulated phone. The red pluses are the average event estimates, which is the average of multiple single phone estimates.

The final step for an alert is to estimate the shaking intensity and time until shaking at a user’s target location. This is relatively straightforward using the estimated event epicenter, origin time and magnitude, the user’s location, and S-wave travel time curves and ground motion prediction equations (23) just as with the current EEW system in California.
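As a rough illustration of the timing part of this step, the sketch below estimates the remaining warning time at a target location. It assumes a constant S-wave speed of 3.5 km/s and ignores source depth, whereas the text above refers to S-wave travel time curves; the function name and default value are hypothetical. Shaking intensity from ground motion prediction equations is not modeled.

```python
def warning_time_s(epicentral_distance_km, seconds_since_origin,
                   s_wave_speed_km_s=3.5):
    """Seconds until the S wave reaches the target; 0.0 if it has already arrived.

    Assumes a constant S-wave speed (hypothetical default) and a surface source.
    """
    s_travel_time = epicentral_distance_km / s_wave_speed_km_s
    return max(0.0, s_travel_time - seconds_since_origin)

# An alert issued 5 s after origin time, for users at two distances.
far_user = warning_time_s(50.0, 5.0)
near_user = warning_time_s(10.0, 5.0)
print(f"50 km: {far_user:.1f} s of warning; 10 km: {near_user:.1f} s of warning")
```

Under these assumptions, a user 10 km from the epicenter receives no warning from an alert issued 5 s after the origin time, which illustrates why faster detection from a denser network matters most close to the source.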

It is a known problem that magnitude estimates based on peak shaking observations from seismic stations saturate (24, 25): This will also be a problem for MyShake. There are several possible improvements. First, the smartphone-based magnitude estimate could be improved by updating the magnitude on the basis of the area experiencing strong shaking. Stronger magnitude earthquakes cause strong shaking over large areas. Another possibility is to make use of GPS-based permanent ground displacements as is being currently done with the more traditional network-based early warning systems (26, 27). It was recently shown that smartphone-based GPS observations could be used for EEW (12). The challenge when using only GPS on smartphones is that GPS is very power-hungry. A possible hybrid would be to start monitoring the GPS on a phone when the MyShake classifier identifies an earthquake. This could provide an updated magnitude estimate that does not saturate and would not suffer from the power issues associated with a GPS-only approach.

We applied the network detection algorithm in a simulated real-time manner to phone-like triggers for U.S. earthquakes (Table 1). Almost all stations close to the epicenter (within 10 km) were triggered. Figure 6 shows performance snapshots for the M5.1 La Habra earthquake (28), which had the poorest success rate in triggering on individual phone-like waveforms due to the relatively small magnitude compared with other test earthquakes (Table 1). The figure shows the location of the triggers at each time step clearly showing the radiating nature of the ground motion and associated triggers. The earthquake is first identified 5 s after the origin time (Fig. 6B). The error in the initial magnitude estimate is 0.1 magnitude units, the location error is 3.8 km, and the origin time error is 1.7 s (table S2). The performance of this MyShake simulation is similar to the actual performance of the real-time ShakeAlert/ElarmS EEW system (29), which issued its first alert 5.3 s after the origin time with an initial magnitude error of 0.8, location error of 1.5 km, and origin time error of 0.2 s. In reality, when we have a denser phone network, we would expect the application to detect the earthquake faster. Movies S1 and S2 show performance animations for the 2014 La Habra and 2004 Parkfield (30) events, respectively.

Fig. 6 Snapshots of trigger detections for the 2014 M5.1 La Habra earthquake simulation at 3, 5, and 7 s after the event origin time. Gray dots are stations, and pink indicates a trigger. The true earthquake (EQ) location is the red star with circles at 10-, 20-, and 30-km radius. The blue star represents the estimated event location first detected at 5 s. The magnitude estimate at each point in time is shown in the upper right.

We also conducted 1000 simulations that incorporate random human activity triggers as well as earthquake triggers to explore system performance for different densities of phones (see the Supplementary Materials). We found good performance (similar to the La Habra example) when there are 300 or more phones in a 111 × 111–km region, corresponding to an average distance between phones of 6.4 km (table S3). If the number of phones drops to 200 in the same region, then out of 1000 simulations, we found 32 events that were not detected, that is, 3% of events were missed. In addition to missing some earthquakes, the accuracy of the locations and origin times is degraded. We also conducted a second group of 1000 simulations without earthquakes, just false triggers. None of these generated a false event. This is because we require >60% of active phones within a 10-km radius region to trigger for an event declaration. Our ultimate design goal is to have much smaller distances between active phones (less than 6.4 km), yet we must recognize that the network algorithm will need to be modified to reflect the active network, and these changes may need to happen in real time.
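The quoted phone spacing follows from the region size and phone count if each phone is taken to occupy an equal square cell, so that the spacing is the region side length divided by the square root of the number of phones. The short check below uses that uniform-grid assumption; the function name is hypothetical.

```python
import math

def mean_spacing_km(side_km, n_phones):
    """Average spacing if phones were spread evenly over a square region."""
    return side_km / math.sqrt(n_phones)

spacing_300 = mean_spacing_km(111.0, 300)
spacing_200 = mean_spacing_km(111.0, 200)
print(f"300 phones: {spacing_300:.1f} km; 200 phones: {spacing_200:.1f} km")
```

With 300 phones this gives the ~6.4 km quoted above; with 200 phones the same assumption gives roughly 7.8 km.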