Published by the editorial team

Cracked Labs, an independent Austrian institute, has published a wide-ranging investigation into digital data trading. An enlightening picture.

Edward Snowden keeps warning you. Each time you click on a website from your computer or your smartphone, you trigger a great variety of hidden data sharing mechanisms distributed across several companies. Digital tracking and profiling, combined with personalization, are used not only to monitor people, but also to influence their behavior. The study Corporate Surveillance in Everyday Life, published by the Viennese institute Cracked Labs on June 8, is 93 pages long and comprehensive. The independent body focuses on the hidden part of the iceberg, which is very useful to those who are worried about data trading and privacy breaches. The study was led by Wolfie Christl, already the author in 2016 of Networks of Control, who extends and completes his research here. He examines in particular the practices and internal operations of the companies involved, as well as hidden data flows, and shines a light on the large actors in the sector. Here are some of his key conclusions.

“In 2007, Apple introduced the smartphone, Facebook reached 30 million users, and companies in online advertising started targeting ads to Internet users based on data about their individual preferences and interests. Ten years later, a vast landscape of data companies has emerged in the deregulated and lucrative data trading market.

“Today, large online platforms, digital advertising companies, data brokers, and businesses in many sectors can identify, sort, categorize, assess, rate, and rank consumers across platforms and devices. Many aspects of someone’s personality can be inferred from data on web searches, browsing histories, video viewing behaviors, social media activities, or purchases. (…)

“Sensitive personal attributes such as ethnicity, religious and political views, relationship status, sexual orientation, and alcohol, cigarette, and drug use can be quite accurately inferred from someone’s Facebook likes. Analysis of social network profiles can also predict personality traits such as emotional stability, life satisfaction, impulsivity, depression and sensationalist interest.

The data trading constellation has greatly extended beyond online platforms. © Cracked Labs CC-by-SA 4.0

Finance, insurance, health

“Beyond the large actors such as Facebook and Google, thousands of other companies from diverse industries continually share and exchange digital profiles, and combine and link data from the Web and smartphones with customer data and offline information accumulated over decades. Alarmingly, data on people’s behaviors, social relations and most private moments is increasingly applied in contexts or for purposes completely different from those for which it was recorded. It is also increasingly used to make automated decisions about people in crucial areas of life such as finance, insurance and health services.

“The study reveals that startups such as Lenddo, Kreditech, Cignifi and ZestFinance already use data from social media, web searches, or mobile phones to calculate someone’s creditworthiness without actually using data related to financial transactions. (…) Concerning insurance and health, the study gives the example of the large insurer Aviva, which, in cooperation with the consulting firm Deloitte, predicted individual health risks, such as for diabetes, cancer, high blood pressure and depression, for 60,000 insurance applicants based on consumer data traditionally used for marketing that it had purchased from a data broker. (…) Another example is the health analytics company GNS Healthcare, which also calculates individual health risks for patients from a wide range of data such as genomics, medical records, lab data, mobile health devices, and consumer behavior.

Different levels, sectors and sources of consumer data collection. © Cracked Labs CC-by-SA 4.0

Facebook and Oracle, giants of the sector

“Facebook data is profitable since the social network giant uses no less than 52,000 personal attributes to sort and categorize its 1.9 billion users. In order to do so, the platform analyzes posts, likes, shares, friends, photos, movements, and many other kinds of behaviors. In addition, Facebook acquires data on its users from other companies. In 2013, the platform began its partnership with the four data brokers Acxiom, Epsilon, Datalogix and BlueKai.

“By acquiring several data brokers such as Datalogix, BlueKai, AddThis and Crosswise, Oracle, already one of the world’s largest software and database vendors, has recently become one of the largest consumer data brokers. In its data cloud, Oracle aggregates 3 billion user profiles from 15 million different websites, data from 1 billion mobile users, billions of purchases from grocery chains and 1,500 large retailers, as well as 700 million messages from social networks, blogs and consumer review sites per day.

Oracle and its data suppliers, partners and services in April-May 2017. © Cracked Labs CC-by-SA 4.0

Mass personalization

“The daily movements of consumers on the Web are also monitored in real time. No action performed through a web interface is overlooked: searches on Google and Google Maps, reservations on Booking and meal orders on Deliveroo are all exploited in data brokers’ transactions. Mapping and data visualization software is designed to assess an individual’s consumer data, whether from the distant past or from recent actions. Programming tools are also flourishing in the sector: the aim is to inundate you with consumption suggestions presented as making your choices and your life ‘easier’. (…)

“Regarding this mass personalization, the study presents the case of the data company Rocket Fuel, which promises its clients to ‘bring together trillions of digital and real-world signals to create individual profiles and deliver personalized, always-on, always-relevant experiences to the consumer,’ based on 2.7 billion unique profiles in its data store. Rocket Fuel sells its ‘predisposition to influence the consumer.’ (…)

How companies identify people and link information relating to them. © Cracked Labs CC-by-SA 4.0

Cross-referencing professional and personal data

“In 2012, Facebook started allowing companies to upload their own lists of email addresses and phone numbers to the platform. Although these addresses and numbers are converted into pseudonymous codes, Facebook can directly link this customer data from other companies with Facebook user accounts. In this way, companies can, for example, find and target exactly those persons on Facebook that they have email addresses or phone numbers on. This feature allows companies to systematically connect their own customer data with Facebook’s data and potentially keep a close eye on their personnel. Moreover, it also allows other advertising and data vendors to synchronize with the platform’s databases and tap into its capacities, essentially providing a kind of real-time remote control for Facebook’s data universe.
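To illustrate why converting addresses into pseudonymous codes does not prevent this linking, here is a minimal Python sketch. It assumes the codes are SHA-256 hashes of normalized email addresses; the study as quoted here only speaks of "pseudonymous codes", so the hashing scheme, the function names `pseudonymize` and `match_customers`, and the sample data are all hypothetical and do not describe Facebook's actual implementation.

```python
import hashlib


def pseudonymize(identifier: str) -> str:
    """Normalize an email address or phone number and return its SHA-256 hex digest.

    Assumption: a plain SHA-256 hash stands in for the "pseudonymous code".
    """
    normalized = identifier.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_customers(customer_list, platform_accounts):
    """Return the platform account IDs whose hashed identifier appears in the uploaded list.

    customer_list: identifiers held by the company (hypothetical data).
    platform_accounts: mapping of account ID -> identifier known to the platform.
    """
    uploaded = {pseudonymize(c) for c in customer_list}
    # The platform hashes its own identifiers the same way, so equal inputs
    # produce equal codes and the two data sets can be joined directly.
    index = {pseudonymize(ident): account_id
             for account_id, ident in platform_accounts.items()}
    return sorted(index[h] for h in uploaded if h in index)


company_list = ["Alice@Example.com ", "carol@example.org"]
accounts = {"user_1": "alice@example.com", "user_2": "bob@example.net"}
matched = match_customers(company_list, accounts)
print(matched)  # ['user_1']
```

Because both sides apply the same deterministic transformation to the same identifier, the code acts as a shared join key: the raw address is hidden in transit, but the person behind it is still singled out.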

“Companies can now capture highly specific behavioral data, such as a click on a website, a swipe in a mobile app or a purchase in a store, in real-time, and tell Facebook to immediately find and target the persons who performed these activities. Google and Twitter launched similar features in 2015.

Prediction of personality indicators from Facebook likes. © Cracked Labs CC-by-SA 4.0

The fraud detection market

“In addition to the real-time surveillance machine that has been developed within online advertising, other forms of pervasive tracking and profiling have emerged in the fields of risk analytics, fraud detection and cyber security. For example, the cyber security firm ThreatMetrix processes data on 1.4 billion ‘unique user accounts’ across ‘thousands of global websites.’ Its Digital Identity Network captures ‘millions of daily consumer transactions including logins, payments and new account originations,’ and maps the ‘ever-changing associations between people and their devices, locations, account credentials, and behavior’ for identity verification and fraud prevention purposes. Its clients include Netflix, Visa, and firms in fields such as gaming, government services, and healthcare.

“Particularly intriguing in 2017 is the example of Google, which introduced an invisible version of reCaptcha, a tool that until now asked you to click on ‘I am not a robot.’ From now on ‘human users will be let through’ without any user interaction, in contrast to ‘suspicious ones and bots.’ The company doesn’t disclose which kinds of user data and behaviors it uses to identify humans. Investigations suggest that Google uses not only IP addresses, browser fingerprints, and the way users type, move their mouse, or use their touchscreen ‘before, during, and after’ a reCaptcha interaction, but also several of Google’s cookies. It is not clear whether people without user accounts face a disadvantage, whether Google is able to identify specific individuals rather than only ‘humans’ or whether Google also uses the data recorded within reCaptcha for purposes other than bot detection. (…)”

Defense of civil rights

Drawing on a multitude of examples, the report finds that even though we are increasingly familiar with the world of data trading, large parts of it remain in the dark. “Transparency about companies’ data practices remains an essential condition for resolving the massive information asymmetries between data companies and private individuals,” concludes Wolfie Christl. “I hope the results of this report will encourage researchers, journalists and others working in civil rights, data protection and consumer protection.”

“Corporate Surveillance in Everyday Life”, Cracked Labs (Wolfie Christl, with Katharina Kopp, Patrick Urs Riechert and illustrations by Pascale Osterwalder), June 2017