Part I

“man writing on paper” by Helloquence on Unsplash

The 13 Government Plans

One of the first things an external analyst may notice about the Brazilian presidential debate is that the rules currently governing the use of media give only a small number of candidates space in mainstream media. As a result, the large majority of the population doesn't even know who all the candidates are, let alone understand their programs to make Brazil a better country.

Without passing judgment on the rules in place, a data scientist would immediately classify this as bias. To remove that bias from this analysis, we decided to start by taking a closer look at each of the 13 Government Plans presented by the candidates registered for this election. Here is the list:

The 13 Government Programs for 2018 Brazilian Presidential Elections

To analyze the government programs above, we selected specific NLP (Natural Language Processing) techniques to extract some valuable insights. We then created an interactive visualization to present these insights using the D3.js library.

Defining the analysis dimensions

To dig into the government plans, analyze them and provide an insightful comparison we've defined the following dimensions of our analysis:

policies: what are the themes addressed by the candidates in their government plans?

weight: how important is each of the policies in the overall government plan presented by each candidate?

ideological position: what is the ideological position of each candidate for each of the policies above?

Policies

For this step we reviewed a number of publications that attempted to create a taxonomy for the themes addressed by the 13 Government Plans for the 2018 Brazilian Presidential Elections. Our strategy was guided by the following principles:

have a common list of themes that could be used across all programs to allow comparisons

have a sufficiently long list to cover most of the themes addressed by the government plans

have a relatively short list in order to simplify the reader's analysis

After some investigation we settled on the following list of themes to represent the policy dimensions presented across all 13 Government Plans:

1) Economy and Employment

2) Education and Health

3) External Policy and Environment

4) Political System and Corruption

5) Social Policy and Human Rights

6) Safety

Weight

One thing we wanted to compare across the 13 government plans was how much focus each program places on a specific policy dimension. This is a simple metric, but it can indicate the priorities of the government program presented.

For this metric we used a simple word count normalized by the total number of words in the government plan. The hypothesis behind this choice was that, in a 100-word program, a 50-word theme would be considered more important than a 10-word theme by the authors of the government plan.
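As a rough illustration of the metric, here is a minimal sketch in Python. It assumes the text of a plan has already been split by theme and that every word belongs to exactly one theme; the function name `theme_weights`, the tokenization rule, and the toy plan are hypothetical and are not taken from the actual analysis.

```python
import re

def theme_weights(sections):
    """Return each theme's share of the plan's total word count.

    `sections` maps a theme name to the text the plan devotes to it.
    Assumption: the themes together cover the whole plan.
    """
    # Count words per theme; \w+ also matches accented Portuguese letters
    counts = {theme: len(re.findall(r"\w+", text))
              for theme, text in sections.items()}
    total = sum(counts.values())
    if total == 0:
        return {theme: 0.0 for theme in counts}
    return {theme: n / total for theme, n in counts.items()}

# Made-up plan with 100 words in total
plan = {
    "Economy and Employment": "emprego " * 50,
    "Education and Health": "escola " * 10,
    "Safety": "segurança " * 40,
}
weights = theme_weights(plan)
for theme, weight in weights.items():
    print(f"{theme}: {weight:.0%}")
```

In this toy plan the 50-word theme gets five times the weight of the 10-word theme, which is the intuition behind the hypothesis above.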

Ideological position

Most of the work for this phase of the project went into extracting and representing the ideological position of each program for each of the policy dimensions.

While people regularly use classifications such as "left", "right" and "center" in conversations about political parties or candidates, it is easy to see that this classification is vague and subjective, and that it means different things to different people (sometimes very different things in different countries). We wanted to adopt a data-driven, objective measure of the parties' positions as expressed in their Government Programs.

Our research in the field of data science applied to the evaluation of political party programs led us to an important study carried out in 2006 by Jonathan B. Slapin and Sven-Oliver Proksch as part of their PhD work at the Department of Political Science at UCLA. Their work attempted to create an analytical method to extract, from text, a single metric representing a party's ideological position.

The proposed methodology uses a Poisson scaling technique to estimate party positions on a single left-right dimension based on word frequencies in political text. The main advantages of this method are that:

It is language independent: as such it can be applied to any language with the same efficacy

It is bias-free: unlike other methodologies (also analysed in their papers) it doesn't require hand-coding (i.e. providing sample text that would "define" what's left and what's right in a political text).

To validate their approach, the authors applied the technique to estimate the positions of German political parties from 1990 to 2005 and demonstrated that it reflected shifts in the parties' political positions more accurately than other time-series techniques. You can read their paper here.
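For readers who want to see the mechanics, here is a minimal sketch of the likelihood behind this kind of Poisson scaling, as we understand the model from the paper (the authors call it Wordfish): the count of word j in document i is modeled as Poisson with rate exp(alpha_i + psi_j + beta_j * omega_i), where omega_i is the document's position. The function name, the toy counts and the parameter values are illustrative assumptions, not the authors' code or the pipeline used for this analysis; a real application would estimate the parameters by maximizing this likelihood rather than setting them by hand.

```python
import math

def poisson_scaling_loglik(counts, alpha, omega, psi, beta):
    """Log-likelihood of word counts under a one-dimensional Poisson scaling model.

    counts[i][j] is how often word j appears in document i.
    alpha[i]: document fixed effect (longer documents use more words overall)
    omega[i]: document position on the latent dimension
    psi[j]:   word fixed effect (some words are simply more common)
    beta[j]:  how strongly word j discriminates between positions
    """
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            log_rate = alpha[i] + psi[j] + beta[j] * omega[i]
            # Poisson log-probability: y*log(rate) - rate - log(y!)
            total += y * log_rate - math.exp(log_rate) - math.lgamma(y + 1)
    return total

# Toy data (made up): two documents, two words with opposite usage patterns
counts = [[8, 2], [2, 8]]
alpha = [0.0, 0.0]
psi = [math.log(4), math.log(4)]
beta = [-0.7, 0.7]

# Positions that agree with the word usage fit better than swapped positions
matched = poisson_scaling_loglik(counts, alpha, [-1.0, 1.0], psi, beta)
swapped = poisson_scaling_loglik(counts, alpha, [1.0, -1.0], psi, beta)

# Flipping the sign of both positions and word weights changes nothing
mirrored = poisson_scaling_loglik(counts, alpha, [1.0, -1.0], psi, [0.7, -0.7])

print(matched > swapped)
print(math.isclose(matched, mirrored))
```

Flipping the signs of both the positions and the word weights leaves the likelihood unchanged, which is one way to see that the estimated scale carries only relative meaning, not an absolute one.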

To apply the technique presented in the paper, we split the Government Plans into the 6 policy dimensions above and then applied the technique to each policy separately. This way we were able to extract the ideological position expressed in the plans for each of the themes, and we used this variable on the horizontal axis of our visualization.
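Since the technique works on word frequencies, applying it per policy means building one word-count table per policy, with one row per candidate. The sketch below shows one possible way to do that; the helper `word_frequency_matrix`, the candidate labels and the short texts are hypothetical, and it assumes the plans have already been split by policy.

```python
import re
from collections import Counter

def word_frequency_matrix(texts):
    """Build a candidate-by-word count table for a single policy.

    `texts` maps a candidate label to that candidate's text on the policy.
    Returns (candidates, vocabulary, matrix) with matrix[i][j] = count of
    vocabulary[j] in the text of candidates[i].
    """
    counts = {c: Counter(re.findall(r"\w+", t.lower()))
              for c, t in texts.items()}
    vocabulary = sorted(set().union(*counts.values()))
    candidates = sorted(counts)
    matrix = [[counts[c][w] for w in vocabulary] for c in candidates]
    return candidates, vocabulary, matrix

# Made-up input: policy -> candidate -> text
plans_by_policy = {
    "Safety": {
        "Candidate A": "mais polícia mais segurança",
        "Candidate B": "segurança e direitos",
    },
}

# One table per policy; each would be scaled independently
matrices = {policy: word_frequency_matrix(texts)
            for policy, texts in plans_by_policy.items()}
candidates, vocabulary, matrix = matrices["Safety"]
```

Keeping one table per policy means a candidate can end up in different relative positions on different themes.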

It is important to highlight that the technique adopted does not attempt to provide an absolute classification of "left" or "right" but rather to measure the relative position of the different parties (government programs in our case) as emerging from the political text.