Macron & Le Pen vs Facebook Algorithm, analysis with https://facebook.tracking.exposed

Summary

During the last stage of the 2017 French Presidential Election, we monitored Facebook through four ad-hoc profiles. We collected what each profile saw on its timeline, in order to understand how the Facebook algorithm can influence a user's perception of reality by privileging some content and disregarding the rest.

Even though our users behaved in a way that gave Facebook every opportunity to inform them impartially, we found that the social network tends to exclude some content from their Newsfeeds.

We are collecting evidence and refining our methodology to determine whether we can speak of algorithmic censorship, as it is one of the concerns we address with https://facebook.tracking.exposed

In particular, this research exposes the falsehood of statements like “if you don’t like Facebook, don’t use it”. Society will be heavily influenced even if you can afford the luxury of staying outside the advertisement platform also known as social media.

This is not conclusive research: we are releasing our data-sets so that other data scientists can validate our theories, and we hereby announce a public call to join a collaborative analysis for the upcoming UK 2017 elections.

Introduction to the platform

“facebook tracking exposed” is the project, #fbtrex the nickname, and https://facebook.tracking.exposed the actual website.

Our mission is to help researchers assess how current filtering mechanisms work and how personalization algorithms should be modified, in order to minimize the dangerous social effects for which they are indirectly responsible and to maximize the values, both individual and social, that algorithms should incorporate.

The data used in this analysis

Facebook Tracking Exposed is a browser extension that copies the public posts Facebook shows to the users who install it. Only the Newsfeed is considered. The extension provides us (and, to some extent, the public, after anonymization and minimization) with:

1. The Facebook user’s ID;

2. The date and time of each refresh of the Newsfeed;

3. The date and time of the impression of every post in the user’s Newsfeed;

4. The position of every post in the user’s Newsfeed;

5. The HTML section of every public post in the user’s Newsfeed.

See more at https://facebook.tracking.exposed/privacy-statement
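To make the five fields above concrete, here is a minimal sketch of what one collected impression might look like as a record. The field names and values are our own illustration, not the actual #fbtrex schema:

```python
# Hypothetical shape of one observed post, mirroring the five fields
# listed above (field names are illustrative, not the real schema).
import json
from datetime import datetime, timezone

record = {
    "user_id": "1001",  # (1) pseudonymized Facebook user ID
    "refresh_time": datetime(2017, 5, 5, 8, 0,
                             tzinfo=timezone.utc).isoformat(),      # (2)
    "impression_time": datetime(2017, 5, 5, 8, 0, 12,
                                tzinfo=timezone.utc).isoformat(),   # (3)
    "position": 4,  # (4) order in the Newsfeed, top post = 1
    "html": "<div class='userContent'>…</div>",  # (5) raw public post snippet
}

print(json.dumps(record, indent=2))
```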

Point 5 is the juicy part we use in this analysis. At the beginning it is just a raw piece of HTML; a chain of parsers then extracts metadata (source, publication time, text, type of media) to be used in analyses like this one. As the parsers improve, investigations like this one benefit as well (for example, by recovering the number of likes a post had collected at a certain time).
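As an illustration of one stage in such a parser chain, the sketch below pulls the visible text out of a post’s HTML snippet using only the standard library. The markup is invented for the example and does not reflect Facebook’s actual structure or the real #fbtrex parsers:

```python
# Minimal sketch of one parser stage: extract the visible text chunks
# from a post's HTML snippet with the standard-library HTMLParser.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-whitespace text nodes.
        if data.strip():
            self.chunks.append(data.strip())

# Invented snippet standing in for a real post's HTML section.
snippet = "<div><h5>Le Monde</h5><p>Résultats du premier tour</p></div>"
parser = TextExtractor()
parser.feed(snippet)
print(parser.chunks)  # ['Le Monde', 'Résultats du premier tour']
```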

The methodology

The first step of an algorithm audit is to standardize the behavior of the subjects. In the days before the election, we created new profiles of supposedly French persons, assuming they were citizens.

All of our users were following 13 mainstream French pages. We use the publications of these media as a reference: this way we can see whether some posts appear in less appealing locations (lower in the timeline than chronological order would place them). Below are the points we kept in mind to make the test uniform:

1. Users put a “like” on these 13 sources (and therefore follow them, so their publications appear in the timeline).

2. Users have no friends.

3. Users do not like or comment on the posts in their feed.

4. Users scroll in a browser for the same amount of time, in order to collect more or less the same number of posts. To do this, we used a dedicated browser for every user, with the extension installed, and an auto-scroll script (3 minutes of scrolling, 1 page every 6 seconds; after an hour, refresh and restart).
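The comparison this setup enables can be sketched as follows: for each reference source, average the Newsfeed position at which its posts were shown. A source whose posts systematically land lower than the others is a candidate for algorithmic disfavor. The data below is invented for illustration:

```python
# Sketch of the position comparison: average Newsfeed position per source.
# The impressions list is hypothetical sample data, not our real data-set.
from collections import defaultdict
from statistics import mean

# (user, source, position-in-Newsfeed) triples — invented for the example
impressions = [
    ("userA", "Le Monde", 1), ("userA", "Le Figaro", 5),
    ("userB", "Le Monde", 2), ("userB", "Le Figaro", 9),
]

by_source = defaultdict(list)
for user, source, position in impressions:
    by_source[source].append(position)

avg_position = {source: mean(p) for source, p in by_source.items()}
for source in sorted(avg_position):
    print(source, avg_position[source])
```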

Up to this point, the users behave identically and the news appears equally to all of them.

Our analysis was driven by this research question:

With just a small set of differences in the user profiles, how different can the reality they perceive on Facebook be?

After the common initialization, each user has ~30 likes, given to pages, communities, and random events. Below, for example, is the content the user named Almìr liked at the beginning of the initialization phase of the experiment: