It’s all in the vibrations

Every modern smartphone is equipped with an inertial measurement unit (IMU) consisting of at least an accelerometer, often accompanied by a gyroscope.

Accelerometers measure the acceleration, also known as G-force, that the phone experiences in three orthogonal directions, as illustrated in figure 2.

Figure 2. Phone sensors measure acceleration on 3 orthogonal axes.

These sensors are extremely sensitive, picking up every small vibration of the phone. Figure 3 shows the difference in 3-axis accelerometer and gyroscope data for a trip containing a train segment, a biking segment and a walking segment.

Figure 3. Accelerometer and gyroscope data for a trip containing train, biking and walking segments.

The walking data contains a set of low-frequency components that are not present in the train data. Similarly, car data often contains high-frequency components that are absent from tram or train data. This suggests that frequency-domain features are likely to be very valuable for activity detection.

Figure 4 illustrates this more clearly. It shows the frequency spectrum of the accelerometer x-axis for a walking segment (left) next to the frequency spectrum for a car trip (right).

Figure 4. Frequency spectra for accelerometer data from a walking trip (left) and car trip (right).

The walking spectrum shows several high magnitude low-frequency components and harmonics, whereas these are absent in the car data. Similar differences can be observed when comparing spectrograms of biking, train and tram trips, although these differences are much more subtle and difficult to describe with a set of rules or manually engineered features.
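A spectrum like the ones in figure 4 can be computed with a discrete Fourier transform. The following is a minimal sketch in Python with NumPy, not necessarily how the figures were produced. The function name magnitude_spectrum, the Hann window and the synthetic 2 Hz "walking-like" signal are all assumptions made for illustration.

```python
import numpy as np

def magnitude_spectrum(samples, fs):
    """Return (frequencies in Hz, magnitudes) for an evenly sampled real signal."""
    samples = np.asarray(samples, dtype=float)
    # Subtract the mean so a constant offset does not dominate the spectrum.
    centered = samples - samples.mean()
    # A Hann window reduces spectral leakage caused by the finite segment length.
    windowed = centered * np.hanning(len(centered))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(centered), d=1.0 / fs)
    return freqs, magnitudes

# Synthetic stand-in for one accelerometer axis: a 2 Hz rhythm plus a weaker harmonic.
fs = 26.0
t = np.arange(260) / fs  # 10 seconds of evenly spaced samples
x = np.sin(2 * np.pi * 2.0 * t) + 0.3 * np.sin(2 * np.pi * 4.0 * t)

freqs, mags = magnitude_spectrum(x, fs)
peak_hz = freqs[np.argmax(mags)]  # frequency of the strongest component
```

With a real recording, x would be one axis of the pre-processed accelerometer signal, and the positions and heights of the peaks in mags would serve as candidate frequency-domain features.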

Sampling and pre-processing

How can we extract these frequency components, and which sampling rate should be used? The Nyquist criterion requires a sampling rate that is at least twice the highest frequency component we want to be able to extract from the data, which suggests capturing the data at a high sampling rate (e.g. 200 Hz).

On the other hand, battery usage and bandwidth constraints are important factors to consider for mobile applications. Moreover, accelerometer sensors are known to be very sensitive to noise, most of which manifests itself in the high-frequency region of the spectrum. For transport mode classification, the importance of frequency components above 13 Hz decreases exponentially. Therefore, we sample the accelerometer sensors at 26 Hz, which is just enough to represent components up to 13 Hz.

Although we ask Android and iOS for a 26 Hz signal, we often receive higher-frequency data, depending on the phone model and the other applications running on it. Moreover, even for a 26 Hz signal, incoming samples are not evenly spaced at 38.5-millisecond intervals. We therefore interpolate the signal onto a regular grid, and resample it to get the desired 26 Hz signal.
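The interpolation step can be sketched as follows. This is an illustration under assumptions, not the production code: linear interpolation is assumed because the text does not name the interpolation method, and the function name to_regular_grid and the jittered synthetic timestamps are hypothetical.

```python
import numpy as np

def to_regular_grid(timestamps, values, fs):
    """Linearly interpolate irregularly timed samples onto an even grid at fs Hz."""
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    # Number of grid points that fit between the first and last timestamp.
    n = int(np.floor((timestamps[-1] - timestamps[0]) * fs)) + 1
    grid = timestamps[0] + np.arange(n) / fs
    return grid, np.interp(grid, timestamps, values)

# Simulate sensor timestamps that jitter around the nominal 26 Hz spacing.
rng = np.random.default_rng(0)
nominal = np.arange(100) / 26.0
timestamps = nominal + rng.uniform(-0.005, 0.005, size=nominal.size)
values = np.sin(2 * np.pi * 1.0 * timestamps)  # a 1 Hz test signal

grid, resampled = to_regular_grid(timestamps, values, fs=26.0)
```

np.interp expects increasing timestamps, so out-of-order samples would have to be sorted first. When the incoming rate is higher than 26 Hz, interpolating straight onto a 26 Hz grid amounts to downsampling.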

However, naively downsampling the signal introduces artifacts due to aliasing. Any frequency component above half the sampling frequency would get folded around the Nyquist frequency and incorrectly appear as a lower frequency component in the resulting spectrum.

More concretely, given a sampling frequency fs, an observed frequency fo in the spectrum can be caused by any real-world frequency component f that satisfies fo = |f - N·fs|, where N is an integer.

For example, if the original sampling frequency, delivered by the mobile OS, is fs = 36 Hz instead of the requested 26 Hz, then we capture frequency components up to 18 Hz. If we then downsample that signal to 26 Hz, we can only represent frequency components up to 13 Hz. Any component between 13 Hz and 18 Hz, however, would now appear as a low-frequency artifact in the resampled data.

In this example, a frequency component of 17 Hz in the original 36 Hz signal would appear as a 9 Hz component in the resampled signal (|17 - 26| = 9), thereby corrupting the true 9 Hz component of the spectrum. This is illustrated with a synthetic example in figure 5, where a 17 Hz sine wave is superimposed on a 3 Hz carrier signal.

Figure 5. Illustration of aliasing: High frequency components get folded around the Nyquist frequency to low frequency components.
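The folding rule is easy to check numerically. The sketch below computes the apparent frequency from the relation fo = |f - N·fs|, taking the nearest integer N, and then confirms it by sampling a 17 Hz sine directly at 26 Hz. The helper name aliased_frequency is hypothetical and the signal is synthetic.

```python
import numpy as np

def aliased_frequency(f, fs):
    """Frequency in [0, fs/2] at which a component f appears when sampled at fs."""
    n = round(f / fs)  # nearest integer multiple of the sampling frequency
    return abs(f - n * fs)

fo = aliased_frequency(17.0, 26.0)  # |17 - 1 * 26| = 9.0

# Numerical check: sample a 17 Hz sine at 26 Hz and locate the spectral peak.
fs = 26.0
t = np.arange(260) / fs  # 10 seconds of samples
x = np.sin(2 * np.pi * 17.0 * t)
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
peak_hz = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
```

Both fo and peak_hz come out at 9 Hz, whereas a component below the Nyquist frequency, such as 3 Hz, is returned unchanged by aliased_frequency.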

Since the actual sampling rate depends on the mobile operating system and does not necessarily correspond to 26 Hz exactly, we prevent these artifacts by applying a low-pass filter with a cut-off frequency (-3 dB point) of 13 Hz to the signal before resampling.
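The effect of filtering before resampling can be sketched with SciPy. Everything specific here is an assumption made for illustration: the Butterworth design, the filter order, zero-phase filtering with sosfiltfilt, linear interpolation as the resampler, and the helper names. The test signal is a synthetic sum of a 3 Hz and a 17 Hz sine delivered at 36 Hz.

```python
import numpy as np
from scipy import signal

def lowpass_then_resample(x, fs_in, fs_out, order=4):
    """Low-pass x at fs_out / 2, then interpolate it onto an fs_out grid."""
    sos = signal.butter(order, fs_out / 2.0, btype="lowpass", fs=fs_in, output="sos")
    # Forward-backward filtering adds no phase shift, but it applies the filter
    # twice, so the attenuation at the cut-off is 6 dB rather than 3 dB.
    filtered = signal.sosfiltfilt(sos, x)
    t_in = np.arange(len(x)) / fs_in
    t_out = np.arange(0.0, t_in[-1], 1.0 / fs_out)
    return t_out, np.interp(t_out, t_in, filtered)

def amplitude_at(y, t, f):
    """Estimate the amplitude of the sinusoidal component of y at frequency f."""
    return 2.0 * np.abs(np.sum(y * np.exp(-2j * np.pi * f * t))) / len(y)

fs_in, fs_out = 36.0, 26.0
t_in = np.arange(int(20 * fs_in)) / fs_in
x = np.sin(2 * np.pi * 3.0 * t_in) + np.sin(2 * np.pi * 17.0 * t_in)

t_out, clean = lowpass_then_resample(x, fs_in, fs_out)
naive = np.interp(t_out, t_in, x)  # resampling without the low-pass filter

alias_naive = amplitude_at(naive, t_out, 9.0)  # artifact at 9 Hz
alias_clean = amplitude_at(clean, t_out, 9.0)  # artifact suppressed
```

In this sketch the naive path leaves a clearly visible artifact at 9 Hz, while the filtered path keeps the 3 Hz component and suppresses the artifact almost completely.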

Another source of unwanted signal components is the gravitational pull that directly influences accelerometer measurements. A perfectly still and horizontally aligned phone will show an accelerometer measurement of 9.81 m/s² on its Z-axis. Once the phone is tilted, gravity affects multiple axes simultaneously, mixing with and masking the magnitude of the actual accelerations.

However, changes in phone orientation, and the corresponding changes in gravitational pull, manifest themselves as low-frequency components in the signal, which in general carry little or no discriminative power for the transport mode classifier. In the extreme case of no orientation change at all, gravity simply appears as a DC offset in the resulting spectrum. We therefore approximately remove the gravity component by applying a high-pass filter with a cut-off frequency of 0.2 Hz.

For efficiency reasons, the low-pass and high-pass filters are combined into a single band-pass IIR filter. This is illustrated by figure 6, which shows a 10-second walking segment before and after band-pass filtering.

Figure 6. Accelerometer data before and after band-pass filtering to remove low and high frequency components from the data.
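A combined band-pass filter of this kind can be sketched with SciPy as follows. The text only states that an IIR band-pass filter with cut-offs at 0.2 Hz and 13 Hz is used, so the Butterworth design, the order, the zero-phase sosfiltfilt call, the assumed 50 Hz input rate and the function name bandpass are illustrative assumptions. The input is synthetic: a constant 9.81 m/s² gravity offset, a 2 Hz motion component and 20 Hz noise.

```python
import numpy as np
from scipy import signal

def bandpass(x, fs, low_hz=0.2, high_hz=13.0, order=4):
    """Remove gravity (below low_hz) and high-frequency content (above high_hz)."""
    # high_hz must stay below fs / 2, so the filter runs at the incoming
    # sampling rate, before the signal is resampled to 26 Hz.
    sos = signal.butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)  # zero-phase (forward-backward) filtering

fs = 50.0  # assumed rate delivered by the phone
t = np.arange(int(30 * fs)) / fs
motion = np.sin(2 * np.pi * 2.0 * t)       # component we want to keep
noise = 0.5 * np.sin(2 * np.pi * 20.0 * t)  # component above 13 Hz
raw = 9.81 + motion + noise                 # gravity appears as a DC offset

filtered = bandpass(raw, fs)
```

Away from the edges of the segment, filtered closely follows the 2 Hz motion component: the gravity offset and the 20 Hz noise are both removed by the single filter. Second-order sections (output="sos") are used because they stay numerically stable with a cut-off as low as 0.2 Hz.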