Since the beginning of the media frenzy over CarrierIQ, I have repeatedly stated that based on my knowledge of the software, claims that keystrokes, SMS bodies, email bodies, and other data of this nature are being collected are erroneous. I have also stated that to satisfy users, it’s important that there be increased visibility into what data is actually being collected on these devices. This post represents my findings on how CarrierIQ works, and what data it is capable of collecting.



CarrierIQ Architecture Overview

There has been a lot of misinformation about which parties are responsible for which aspects of data collection. At a high level, CarrierIQ is a piece of software installed on phones that accepts pieces of information known as metrics. On receiving a submitted metric, CIQ evaluates whether that metric is “interesting” based on the current profile installed on the device. Profiles dictate whether or not a piece of information is relevant for assessing a particular aspect of phone service, such as reception or battery usage. These profiles are written by CarrierIQ at the request of cell phone carriers.

Note that the CarrierIQ application simply receives these metrics, collects them, and eventually uploads them to be analyzed by carriers. All of the code responsible for determining which metrics are submitted to CIQ for processing is integrated into the phone’s application stack by the handset manufacturers themselves.

To get a complete picture of this, suppose a carrier decides it wants to know about dropped calls. The handset manufacturers who produce phones supported by that carrier instrument the application code such that a metric is submitted to the CarrierIQ application when a call is dropped. When the CIQ application receives this metric, it evaluates whether or not to actually record this data and send it to the carrier based on the profile installed on the device.
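The flow described above can be sketched in a few lines of Python. This is only an illustrative model, not CarrierIQ's actual code: the class names (`Metric`, `Profile`, `Agent`), the method names, and the payload fields are all hypothetical, and only the metric IDs are taken from the table later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """A piece of information submitted by OEM-written instrumentation."""
    metric_id: str   # e.g. "GS01"; IDs are taken from the table in this post
    payload: dict    # hypothetical payload layout, for illustration only


@dataclass
class Profile:
    """Carrier-requested profile: the metric IDs considered 'interesting'."""
    enabled_ids: frozenset


@dataclass
class Agent:
    """Hypothetical stand-in for the CIQ application on the handset."""
    profile: Profile
    pending: list = field(default_factory=list)

    def submit(self, metric: Metric) -> bool:
        """Called by instrumented code; record the metric only if the profile wants it."""
        if metric.metric_id in self.profile.enabled_ids:
            self.pending.append(metric)
            return True
        return False

    def upload(self) -> list:
        """Return and clear the recorded metrics (a real agent would send them to the carrier)."""
        batch, self.pending = self.pending, []
        return batch


# A profile that only cares about one call event and one hardware event.
agent = Agent(Profile(frozenset({"GS01", "HW03"})))

kept = agent.submit(Metric("GS01", {"state": "failed"}))   # in the profile: recorded
dropped = agent.submit(Metric("UI01", {"keycode": 5}))     # not in the profile: discarded
batch = agent.upload()
```

The point of the sketch is the division of labor: the instrumentation decides what can be submitted, while the profile decides what is actually recorded and uploaded.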

What Metrics are Available?

I have completed an analysis of a deployment of CarrierIQ on the Samsung Epic 4G Touch. In this analysis, I enumerated every CarrierIQ-related hook integrated into the Android framework and examined what metrics can possibly be collected, and just as importantly, in what situations. This list does not include metrics that may be submitted by the baseband, which include additional radio and telephony information. The following table represents my findings:

| Metric ID | Metric | Data Sent | Situation |
|---|---|---|---|
| AL34, AL35, AL36 | Browser page render event | Internal page ID, no data related to page contents or URL | Page renders |
| LC18, LC30 | Location event | GPS and non-GPS location data | Location changes, telephony-related events |
| NT0F, NT10 | HTTP event | Request type, content length, local port, status code, URL, no page contents | HTTP request sent or response received |
| NT07 | Network event | Internal identifier | Network state changes |
| DO3M, GS18, GS19, GS46, GS47, GS6E, RF02, RF04, RF05, RF1A, RF55 | Telephony/radio events | Misc. radio and telephony data | Call dropped, service issues, radio event, etc. |
| HW03, HW10, HW11 | Hardware event | Battery level, voltage, temperature, etc. | Hardware state change |
| UI01 | Keystroke event | Keycode | Key pressed in phone dialer only |
| UI08, UI09 | Miscellaneous GUI events | Network type, battery state | GUI state changes |
| GS01, GS02, GS03, LO03 | Call event | CallerID, state, phone number | Call initiated, received, or failed |
| UI13, UI15, UI19 | Application event | Application name | New app, app stopped, app gained/lost focus |
| QU04, QU05 | Questionnaire event | Question data | Questionnaire completed |
| MG01, MG02 | SMS event | Message length, phone number, status, no message body | SMS received or sent |

Interpreting These Findings

There are a number of important conclusions that can be drawn from this information:

1. CarrierIQ cannot record SMS text bodies, web page contents, or email content even if carriers and handset manufacturers wished to abuse it to do so. There is simply no metric that contains this information.
2. CarrierIQ (on this particular phone) can record which dialer buttons are pressed, in order to determine the destination of a phone call. I’m not a lawyer, but I would expect cell carriers already have legal access to this information.
3. CarrierIQ (on this particular phone) cannot record any other keystrokes besides those that occur using the dialer.
4. CarrierIQ can report GPS location data in some situations.
5. CarrierIQ can record the URLs that are being visited (including for HTTPS resources), but not the contents of those pages or other HTTP data.
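As a rough cross-check, the table can be transcribed into a small lookup and queried. This is a sketch that only tests my transcription of the table, not the firmware itself. The group names and snake_case field labels are my own shorthand rather than CIQ identifiers, and entries the table summarizes as "misc." or "etc." are not enumerated.

```python
# Fields each metric group can carry, transcribed from the table above.
# Group names and field labels are illustrative shorthand, not CIQ identifiers.
METRIC_FIELDS = {
    "browser_render": {"internal_page_id"},
    "location": {"gps_location", "non_gps_location"},
    "http": {"request_type", "content_length", "local_port", "status_code", "url"},
    "network": {"internal_identifier"},
    "telephony_radio": {"misc_radio_telephony_data"},
    "hardware": {"battery_level", "voltage", "temperature"},
    "keystroke": {"keycode"},  # submitted for dialer key presses only
    "gui": {"network_type", "battery_state"},
    "call": {"caller_id", "state", "phone_number"},
    "application": {"application_name"},
    "questionnaire": {"question_data"},
    "sms": {"message_length", "phone_number", "status"},
}

# Content-bearing fields that press reports claimed were being collected.
CONTENT_FIELDS = {"sms_body", "email_body", "page_contents"}


def groups_carrying(field_name: str) -> list:
    """Return the metric groups whose payload includes the given field."""
    return sorted(g for g, fields in METRIC_FIELDS.items() if field_name in fields)


content_leaks = {f: groups_carrying(f) for f in sorted(CONTENT_FIELDS)}
url_groups = groups_carrying("url")
location_groups = groups_carrying("gps_location")
```

Under this transcription, no group carries any of the content fields, while URLs appear only in the HTTP events and GPS data only in the location events, which matches conclusions 1, 4, and 5.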

One important thing to note is that this represents the metrics that are submitted to the CarrierIQ application by the code written by Samsung. The list of available metrics is carrier-specific, but will remain constant on a given handset model. The subset of this data that is actually recorded and collected is at the discretion of the carrier, and is based on the profile installed on the device.

Edit: There have been comments made about use of the word “cannot” versus “does not”. I am using the word “cannot” literally, as in “is not capable of, in the present tense, without being altered by modifying its code and installing a new version on the phone”. It seems obvious to me that CarrierIQ could be modified in the future to perform nefarious actions: so could any application on your phone. Keep in mind CIQ is integrated by the OEM and to my knowledge has never been modified after installation, except in terms of profiles, which simply dictate which subset of available metrics defined by the OEM are collected.

Why Do They Gather This Data?

Taking this information into account, all of the data that is potentially being collected supports CarrierIQ’s claims that its data is used for diagnosing and fixing network, application, and hardware failures. Every metric in the above table has potential benefits for improving the user experience on a cell phone network. If carriers want to improve coverage, they need to know when and where calls are dropped. If handset manufacturers want to improve battery life on phones, knowledge of which applications consume the most battery life is essential. Consumers will have their own opinions about whether the collection of this data falls under the terms set by service agreements, but it’s clear to me that the intent behind its collection is not only benign, but for the purposes of helping the user.

Conclusions

Based on my research, CarrierIQ implements a potentially valuable service designed to help improve user experience on cellular networks. However, I want to make it clear that just because I do not see any evidence of evil intentions does not mean that what’s happening here is necessarily right. I believe the following points need to be addressed. Note that most of the burden in this situation falls not on CarrierIQ but on the handset manufacturers and carriers, who are ultimately responsible for both collecting this information and establishing service agreements with consumers.

1. Consumers need to be able to opt out of any sort of data collection. This option would need to be provided by carriers and handset manufacturers.
2. There needs to be more transparency on the part of carriers in terms of what data is being collected from users.
3. There needs to be third-party oversight on what data is collected to prevent abuse.
4. The verbose debugging logs demonstrated in Trevor Eckhart’s video are a risk to privacy, and should be corrected by HTC (the author of the responsible code) by disabling these debugging messages.
5. The legality of gathering full URLs with query parameters and other data of this nature should be examined.

Footnote: Neither I nor my employer (VSR) has ever had a professional relationship with CarrierIQ, handset manufacturers, or cellular providers. This research was conducted independently by me.

Edit: In the interest of full disclosure, after completing this research, I provided it to CarrierIQ, who confirmed my technical findings.