Extracting My Data from the Microsoft Band

When the Microsoft Band was announced, I was thrilled to discover the first wrist-worn device to have both a heart-rate sensor and GPS, plus a slew of other sensors. My research group has been investigating how to use smartphone and smartwatch tracking data to recommend ways for people to improve their sleep. My Ph.D. student Alexandra managed to snag a Band when they were hard to find, but I was disappointed to learn that it suffered from the same problem that plagues so many promising wearable devices: the inability to export my own minute-by-minute data.

The Band syncs to its own smartphone app called Microsoft Health, but it was clear after a bit of searching that no one knew a way to get their data out. That means you can't run your own analysis on the data or make it interoperable with your own applications. I asked someone who worked on the Band at Microsoft Research whether he knew of a creative way to get data out of it, and he responded "You're right that we don't expose raw data at this point, but looking forward to seeing what you come up with... :)"

I took this as a challenge, and asked Alexandra to dump out the data from the phone app to find out how the data was stored. She managed to export a bunch of files, but after digging around, we found only cached data with daily summaries. The raw minute-by-minute data was missing, even though I knew it was being stored somewhere because of the sleep chart in the Microsoft Health app (left screenshot below).

I decided to dig further and decompile the app, so I could read its code, understand where the data went, and make the data available to my research group. I used an app on my phone called ES File Explorer to get the application package (the apk file) from the phone to my computer, which is only one of dozens of ways to pull the apk from the phone (right screenshot above).

An apk is just a zip file, and here's what the Microsoft Health apk file looked like when unzipped.

The main code for the app is in a file called classes.dex, which is a Dalvik Executable file, basically the app's Java code compiled for Android. The file format is fairly well defined, and I was lucky to find an open source tool called jadx to decompile it back into source code.
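Because an apk is an ordinary zip archive, the unzip step can also be scripted. The following is a minimal Python sketch, not what I actually ran: it lists the entries in an apk and extracts any .dex files so they can be handed to a decompiler. The function names and the output folder are made up for illustration.

```python
import zipfile


def list_apk_entries(apk_path):
    """Return (name, uncompressed size) for every entry in the apk."""
    with zipfile.ZipFile(apk_path) as apk:
        return [(info.filename, info.file_size) for info in apk.infolist()]


def extract_dex(apk_path, out_dir):
    """Extract every .dex file into out_dir and return the extracted paths."""
    with zipfile.ZipFile(apk_path) as apk:
        dex_names = [n for n in apk.namelist() if n.endswith(".dex")]
        return [apk.extract(name, out_dir) for name in dex_names]
```

Calling `extract_dex("health.apk", "unpacked")` (both names are placeholders) should leave a classes.dex file under the output folder, ready for a tool like jadx.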

After decompiling the classes.dex file, I browsed through a few folders and came across this in the "microsoft/" folder, which seemed like the root directory for the Microsoft Health app.

$ ls -l
total 0
drwxr-xr-x   18 jeff  staff   612 Dec  9 17:14 cargo
drwxr-xr-x    3 jeff  staff   102 Dec  9 17:14 exceptions
drwxr-xr-x    3 jeff  staff   102 Dec  9 17:14 instrumentation
drwxr-xr-x  199 jeff  staff  6766 Dec  9 17:14 kapp
drwxr-xr-x    5 jeff  staff   170 Dec  9 17:14 krestsdk

There were hundreds of files inside each of these folders, so I grepped for keywords like "sleep", and then "sleepEvents" when I noticed that was a frequently occurring term.
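I did this search with grep, but the same idea is easy to script if you want a ranked list of files. Here is a rough Python sketch that walks a decompiled source tree and counts case-insensitive keyword hits per .java file. The function name is hypothetical, and the root folder you pass in depends on where your decompiler wrote its output.

```python
import os
from collections import Counter


def count_keyword_hits(root, keyword, suffix=".java"):
    """Count case-insensitive occurrences of keyword in each file under root."""
    hits = Counter()
    needle = keyword.lower()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Decompiled output can contain odd bytes, so decode leniently.
            with open(path, encoding="utf-8", errors="replace") as fh:
                count = fh.read().lower().count(needle)
            if count:
                hits[path] = count
    return hits
```

Something like `count_keyword_hits("src/com/microsoft", "sleepEvents").most_common(10)` would then show the files that mention the term most often.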

The "aha" moment was when I encountered this getSleepEvents function in /src/com/microsoft/krestsdk/services/KRestServiceV1.java

public void getSleepEvents(LocalDate localdate, LocalDate localdate1, Callback callback) {
    if (localdate.isAfter(localdate1)) {
        throw new IllegalArgumentException("startDayId cannot be after endDayId.");
    } else {
        ODataRequest odatarequest = new ODataRequest("/v1/Events");
        odatarequest.addArgumentQuotes("eventType", EventType.Sleeping.toString());
        Object aobj[] = new Object[2];
        aobj[0] = KRestServiceUtils.formatDate(localdate);
        aobj[1] = KRestServiceUtils.formatDate(localdate1);
        odatarequest.setFilter("DayId ge datetime'%s' and DayId le datetime'%s'", aobj);
        odatarequest.addParameter("expand", "Sequences,Info");
        NetworkProvider networkprovider = mNetworkProvider;
        CredentialStore credentialstore = mCredentialStore;
        CacheService cacheservice = mCacheService;
        String as[] = new String[3];
        as[0] = "SYNC";
        as[1] = "EVENTS";
        as[2] = CacheUtils.getEventTypeTag(EventType.Sleeping.toString());
        (new KRestQueryOData(networkprovider, credentialstore, cacheservice, Arrays.asList(as),
                CUSTOM_GSON_DESERIALIZER, odatarequest, new TypeToken() {
            final KRestServiceV1 this$0;

            {
                this$0 = KRestServiceV1.this;
                super();
            }
        }, callback)).executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR, new String[0]);
        return;
    }
}

Clearly, to get sleep events, the app is constructing a REST call. So what must be happening is the Band syncs with the app on the phone, which syncs with some servers that Microsoft owns. This explained why we were not able to find the raw data in the file dump of the app, and only the cached data was stored locally.
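To make that concrete, here is my guess, sketched in Python, at the kind of request path those calls would produce. The decompiled snippet does not show how ODataRequest serializes its arguments, so everything about the layout below is an assumption based on common OData conventions: the quoted eventType argument in parentheses, the $filter and $expand parameter names, the "Sleeping" string, and the ISO date format standing in for KRestServiceUtils.formatDate.

```python
from datetime import date
from urllib.parse import quote


def build_sleep_events_path(start, end):
    """Guess at the request path getSleepEvents builds (layout is assumed)."""
    if start > end:
        raise ValueError("startDayId cannot be after endDayId.")
    # Same filter template as the decompiled code; the date format is assumed.
    odata_filter = "DayId ge datetime'%s' and DayId le datetime'%s'" % (
        start.isoformat(),
        end.isoformat(),
    )
    # For this filter, only the spaces end up percent-encoded.
    return (
        "/v1/Events(eventType='Sleeping')"
        "?$filter=" + quote(odata_filter, safe="'")
        + "&$expand=Sequences,Info"
    )
```

The exact string matters less than the shape: an events endpoint, a date-range filter, and an expand parameter asking for the detailed sequences.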

My next intuition was to try and intercept the data between the app on my phone and the Microsoft server to see what was being transmitted. This is usually done by using a proxy in an application, so I first tried enabling the proxy on my Android phone (left screenshot below).

After a bit of testing, it was clear that the proxy feature in Android only affects the web browser, and not the Microsoft Health app. So I tried a different trick: setting the gateway (i.e., router) for the phone's wifi to be my computer instead of using DHCP, so that all the network data would be sent to my computer. I edited this setting (right screenshot above) and enabled IP forwarding so network packets could still reach the Internet instead of hitting my computer and getting lost.

$ sudo sysctl -w net.inet.ip.forwarding=1
net.inet.ip.forwarding: 0 -> 1

Next I checked that the browser was still working on my phone, and it was, so that was a good sign. Then came the tricky part: I set up a packet filter to forward the incoming packets to a different port on my computer. In a .conf file,

rdr on en4 inet proto tcp to any port 80 -> 127.0.0.1 port 2300

And then running the following pfctl (a packet filtering utility):

$ sudo pfctl -f pf.conf
pfctl: Use of -f option, could result in flushing of rules present in the main ruleset added by the system at startup.
See /etc/pf.conf for further details.
$ sudo pfctl -e
pf enabled

Then I installed a traffic inspector (an open source tool called mitmproxy) and fiddled with the flags until I figured out how to activate the transparent proxy mode.

$ ./mitmproxy -T -p 2300 --host

This basically simulates a man-in-the-middle attack to intercept the data. Note that it only captures the data sent and received by my own phone, so it's not really an attack in the usual sense, just a way for me to view the data my phone is already dealing with.

I was delighted to see traffic being routed through my mitmproxy console when I visited websites. However, when I started up my Microsoft Health app, it wouldn't start at all (left screenshot below).

Eventually I figured out it was using HTTPS, which runs on a different port. So I made some changes. First, I installed an SSL certificate on my phone so that my phone would trust my computer, which was intercepting the messages (right screenshot above). Then I had the packet filter forward packets on the HTTPS port as well, by adding the line below to the .conf file and re-running the pfctl commands.

rdr on en4 inet proto tcp to any port 443 -> 127.0.0.1 port 2300

Basically, instead of the Microsoft Health app communicating directly with the Microsoft server over HTTPS, all communication is routed through my computer. The mitmproxy tool presents its own certificate to the phone (which the phone now trusts) and opens a separate encrypted connection to the real server, so it can decrypt and re-encrypt the messages that pass through it.

At last, I was able to see the traffic from the Microsoft Health app. Fortunately, the requests were easy to figure out, and the data was quite understandable.

Notice that to request the data, the phone issues a REST GET request to a URL like https://prodphseus.dns-cargo.com/v1/Events(eventId='1234567890')?$expand=Sequences,
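If you would rather not copy responses out of the console by hand, recent mitmproxy releases can load a small Python script that is called for every response. The sketch below is something I did not use myself; it assumes a current mitmproxy whose scripts may define a response(flow) hook, and the output folder name is made up. It writes the body of every response from the Band's server host to its own file.

```python
import os
import time

SAVE_DIR = "band_responses"        # hypothetical output folder
HOST = "prodphseus.dns-cargo.com"  # host seen in the captured requests


def save_body(body, save_dir):
    """Write one response body to a uniquely named file and return its path."""
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, "%d.json" % time.time_ns())
    with open(path, "wb") as fh:
        fh.write(body)
    return path


def response(flow):
    """Hook that mitmproxy calls once for each completed HTTP response."""
    if flow.request.pretty_host == HOST:
        # Assumes the body is JSON, which the .json file extension reflects.
        save_body(flow.response.content, SAVE_DIR)
```

Scripts like this are loaded with mitmproxy's -s option, alongside whatever transparent-mode flags your version needs.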

If you are just interested in data from one event (like last night's sleep), you can stop here: just save the response from the Microsoft prodphseus.dns-cargo.com server and be happy. To get a few more events, you can simply click through every sleep (left screenshot), exercise (right screenshot), or other type of event until your phone (and your computer intercepting the messages) receives all the data. Then you save the responses to a file, which you can view in your favorite text editor and process using a script.
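As a starting point for such a script, here is a small Python sketch that makes no assumptions about the field names in the response. It loads a saved JSON response and lists the distinct key paths it contains, which is a quick way to see where the per-minute sequences live before writing any real analysis. The function names are my own.

```python
import json


def key_paths(node, prefix=""):
    """Yield dotted key paths for every leaf value; list items collapse to '[]'."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from key_paths(value, prefix + "." + key if prefix else key)
    elif isinstance(node, list):
        for item in node:
            yield from key_paths(item, prefix + "[]")
    else:
        yield prefix


def summarize(path):
    """Return the sorted, de-duplicated key paths found in a saved JSON file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return sorted(set(key_paths(data)))
```

Running `summarize` on a saved response prints a compact outline of the structure, no matter how many minutes of data each list holds.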

But what if you don't want to manually go through every entry on your phone to have it transmit the data? In other words, what if you want the data for the entire time you have had the Band, rather than for individual events? Recall the decompiled Java code from earlier in this article, which contains:

odatarequest.addParameter("expand", "Sequences,Info");

which provides a clue that to retrieve the full set of data without having to select each entry on the phone, you could edit the URL to:

https://prodphseus.dns-cargo.com/v1/Events?$expand=Sequences,Info

Simply pasting the URL into a browser wouldn't work because you have to reuse the same authentication token the Microsoft Health app is using, but editing a prior request should let you retrieve your entire raw data stream without retrieving each event one by one.
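Editing a prior request can be done inside mitmproxy itself, but the idea also translates to a short script. The Python sketch below builds a GET request for an edited URL while reusing the headers from a captured request. I have not named the header that carries the authentication token because it depends on what your capture shows; the captured_headers dictionary is meant to hold whatever headers you copied out of mitmproxy.

```python
import urllib.request


def build_replay_request(url, captured_headers):
    """Build (but do not send) a GET request that reuses captured headers."""
    # Skip headers that urllib sets itself, or that would ask for a
    # compressed body that urllib does not decompress automatically.
    skip = {"host", "content-length", "accept-encoding"}
    req = urllib.request.Request(url, method="GET")
    for name, value in captured_headers.items():
        if name.lower() not in skip:
            req.add_header(name, value)
    return req
```

Passing the result to `urllib.request.urlopen(req).read()` would then send it. Expect the token to expire after a while, in which case you need to capture a fresh request from the app.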

To summarize how the Band works, some data is cached in the phone app while the rest is stored on Microsoft servers in the "cloud". By intercepting the phone app's requests to the server, you can download the raw data being sent, or retrieve your entire history of heart rate, GPS, step count, and so on, down to the minute level. Happy tracking!