This blog post is also available as an IPython Notebook.

The FEC’s new API promises a whole new level of access to campaign finance data. (Image credit: 18F)

Big news in the campaign finance world: The Federal Election Commission (FEC) is taking a huge step forward by making data accessible through a modern API. With the help of a team of intrepid 18F developers, the FEC is rethinking both its website and its data offerings to better serve its mission of educating the public with real-time disclosure of campaign finance information. It’s part of the larger OpenFEC project, and we think it’s a very encouraging sign that this collaboration is going to improve access to a crucial information resource.

This is a beta release, but we’re really excited to see what’s been accomplished so far. What follows is meant as both an introduction to what’s available through this new resource and a critique of what’s working well, and the changes Sunlight would like to see in future releases of the API.

## Doesn’t the FEC already release data?

The FEC is a model disclosure authority: It has made federal campaign finance data available through a searchable web portal, in bulk CSV files, and, most impressively, a live feed of submitted disclosures. On [Influence Explorer](http://influenceexplorer.com/), we’ve made use of each of these sources in different ways — most recently turning that live feed into a searchable data resource, our [Real-Time Federal Campaign Finance tracker](http://realtime.influenceexplorer.com).

Sunlight has consistently called on government sources to make all data available in bulk. It’s difficult to know how a dataset might be used by a researcher, reporter, citizen or advocacy group; that’s why it’s important that government bodies release all of it in machine-readable bulk files to allow the fullest exploration of what’s available and to give context to any given data point. The FEC has historically set an excellent example in making bulk data available.

## Additional benefits from an API

We think pretty highly of what the FEC already offers, and encourage them to continue to make both bulk and streamed data available. Here at Sunlight, though, we tend to make the data we release available both in bulk and through APIs, because we think that APIs are the right kind of access for particular users and use-cases. So what additional advantages are offered by an API?

### Selective data views

Not every user or developer can effectively make use of bulk data. It typically doesn’t fit in a spreadsheet, so the point-and-click crowd can be at a loss right away. Even if you’re technically skilled enough to load it into R or Pandas, though, you may hit a barrier if the operations you want to carry out require that the data be loaded into memory.

How the FEC is making it easier to follow the money FEC Chair Ann Ravel recently discussed how the FEC is making changes so campaign finance information is more transparent. Read her guest post on Sunlight’s blog.

Furthermore, a bulk release may contain a lot of data that isn’t relevant to a particular use-case or investigation. Let’s say I want to look at contributions to House candidates who are Democrats in 2012. If I use the bulk release, I’m going to get a lot of data that’s not interesting to me, including all of the contributions to noncandidate PACs, members of other parties and contributions to presidential or senatorial candidates.

Pulling out exactly what you need usually requires loading everything into a database, setting up some indices and running queries. True, there are some tools for working directly with CSV files, like the excellent [csvkit](https://csvkit.readthedocs.org/en/0.9.1/), but depending on your query, you again might run into memory issues. Good old *nix standbys `grep`, `cut`, `sed` and `awk` can also get you pretty far, if you’re willing to hone your shell scripts.

With an API, though, you can hand off this selection business to the data’s source (in this case, the FEC’s data warehouse). As long as the API supports it, you can formulate your query and retrieve it with confidence. That’s an important qualifier, though: The onus is on the API’s design team to make sure that the views which are offered meet the needs of its likely users.

### Aggregate views

Another advantage to having an API is the ability to show up-to-date aggregations of the records in your database. This includes totals, rankings and metadata that might change over time.

Again, aggregation is something that anyone can perform on bulk data. That is, anyone with the technical know-how discussed in the last section and the domain knowledge needed to properly compute aggregates.

In the case of FEC data in particular, summing the dollar amounts of individual transactions can be deceptively difficult. Whether or not two amounts can be added together depends on the type of committee, the type of transaction and sometimes also the type of contributor.

A certain level of legal and regulatory expertise is needed when calculating these sums, but might be out of scope for a developer that wants to add or explore some summary statistics from campaign finance, but for whom campaign finance is not the main focus of investigation. Maybe the focus is on projected vote share, and the campaign finance information is going to be added for context. In cases like that, it’s useful to source the aggregate totals published by the FEC itself — and an API is a great way to deliver that data.

### Live data

In addition to being more focused and infused with expertise, data views made available through an API can be tied to live data. In the case of the FEC’s new API, the data is updated daily. This partially avoids the need for a developer to repeatedly update their database with successive bulk data releases.

In fact, for some use-cases, an API might obviate the need for a database altogether. Imagine again the case of a website that shows some other, non-campaign-finance data, such as legislative activity or election results. If campaign finance data could be a helpful addition to that kind of app, the developer can avoid having to build a big addition onto their database by making client-side calls with javascript. The site, which may be backed by a large database, can deliver data in a web app, and then obtain FEC’s aggregate totals or summary facts on-the-fly, allowing them to show up if and when the site’s designers choose.

—

## Brief tour

Here’s a very quick tour of what you can expect from the OpenFEC API.

### What’s available

The [official documentation](https://api.open.fec.gov/developers) for the API is the best source for getting to know what it has to offer, but here are some of the things you can look forward to interacting with.

1. **Search By Name**: Nearly every question that can be answered with FEC data requires knowing the unique identifiers that FEC has assigned to the entities involved. The search endpoints make acquiring these identifiers straightforward. 2. **Candidate and Committee Details**: Armed with the right identifiers, one can access a lot of important information about any candidate or committee, including location information, FEC designations and the entity’s history. 3. **Financial Reports**: For each committee, you can obtain the top-line numbers describing contributions, receipts, expenditures and loans. These are sourced from the committee’s periodically filed financial reports. These come complete with links to the PDF (shudder) of the original filing. 4. **Per-Cycle Committee Summaries**: In addition to the individual reports’ numbers, the API also makes available top-line numbers aggregated on a per-cycle basis.

In other words, FEC’s first API is off to a very promising start. Armed with just this data, there are already a lot of opportunities to keep up with campaign finance during the 2016 election.

### Our wish list

While we were excited to see the progress made so far, there’s a few things we’d really like to see added to the API.

1. **Endpoints for Itemized Data**: Having endpoints dedicated to the itemized transactions that show up on Schedule A (receipts), Schedule B (disbursements) and Schedule E (expenditures) should be the first priority for new additions the API. There is a tremendous amount of useful information that is contained in these line items. Without an endpoint for itemized transactions, the bulk data and live feeds still offer much that the API does not. *NOTE: Thanks to the OpenFEC team’s [in-the-open development](https://github.com/18F/openFEC) on Github, there’s [evidence](https://github.com/18F/openFEC/blob/2022c0a965f878e0c62521d321104a52c9e500e5/webservices/rest.py#L178-L194) that this is on the way!* 2. **Per-Contributor/Per-Recipient Aggregates**: Campaign finance data is essentially the description of the relationships between contributors and their recipients. Endpoints are needed that list (a) the per-contributor, per-cycle aggregate totals of receipts and (b) the per-recipient, per-cycle aggregate totals of disbursements. It’s unfortunate that, given the state of disclosure, these can only include PAC-to-candidate and PAC-to-PAC transactions, but they’re very useful nonetheless. 3. **Independent Expenditure Aggregates**: In a world strongly influenced by the behavior of independent-expenditure-only PACs (super PACs), it’s very important to be able to ask two questions: For a given super PAC, who have they spent the most money targeting negatively/positively?; and, for a given candidate, who has spent the most targeting them negatively/positively? This data is available from FEC, but isn’t an endpoint yet.

## Sunlight from the inside out

A quick note: The group of developers working on OpenFEC includes two former Sunlight labs members. We couldn’t be prouder of the work they’ve been doing during their time “on the inside.” It’s unsurprising, though, that they’ve been effective at 18F, and more specifically on the OpenFEC project. Lindsay Young developed our portal for accessing a live feed of filings related to the Foreign Agent Registration Act, and Alison Rowland was my predecessor as project lead on Influence Explorer. We miss them both, but we’re very grateful for the hard work they and their team are putting into improving public access to campaign finance disclosure at the federal level.

—

## Exploring the API

The base URL for the API is:

BASE_URL = ‘http://api.open.fec.gov/v1’

You’ll also need a Data.gov API key, which you can obtain [here](https://api.data.gov/signup/). I save my API keys in a plain text file in my home directory, so that they’re always handy and so that I can use them without revealing them in notebooks like this one:

API_KEY = open(os.path.expanduser(‘~/.api-keys/data.gov’),’r’).read().strip()

Conceptually, there are two main areas of focus for the API: candidates and committees. When looking at contributions, however, remember that recipients are always committees. Candidates do not receive contributions directly, their committees do. Here are the relevant branches:

– `/candidate`: individual candidate information – `/committee`: individual committee information

### Documentation

We’re going to cover a fair bit of ground in this introduction, but for more details on what’s possible, check the [official OpenFEC API documentation](https://api.open.fec.gov/developers).

## Helpful utils

Some methods and global vars to help us stay succinct are below:

def all_results(endpoint, params): _params = deepcopy(params) _params.update({‘api_key’: API_KEY}) _url = BASE_URL+endpoint logging.info(‘querying endpoint: {}’.format(_url))

initial_resp = requests.get(_url, params=_params)

logging.debug(‘full url eg: {}’.format(initial_resp.url))

initial_data = initial_resp.json()

num_pages = initial_data[‘pagination’][‘pages’] num_records = initial_data[‘pagination’][‘count’] logging.info(‘{p} pages to be retrieved, with {n} records’.format( p=num_pages, n=num_records))

current_page = initial_data[‘pagination’][‘page’] logging.debug(‘page {} retrieved’.format(current_page))

for record in initial_data[‘results’]: yield record

while current_page < num_pages: current_page += 1 _params.update({'page': current_page}) _data = requests.get(_url, params=_params).json() logging.debug('page {} retrieved'.format(current_page)) for record in _data['results']: yield record logging.info('all pages retrieved') def count_results(endpoint, params): _params = deepcopy(params) _params.update({'api_key': API_KEY}) _url = BASE_URL+endpoint _data = requests.get(_url, params=_params).json() return _data['pagination']['count'] ## FEC identifiers: The keys to all data To get data associated with a candidate or a committee, you need to know the identifier that FEC has assigned to that entity. In case you don't have those memorized, though, there are two ways to obtain the IDs that you need: You can search for them, or obtain optionally filtered lists. ### Searching Data on candidate and committee entities can be found using the search endpoints for each type: - `/candidates/search` - `/committees/search` Let's try looking for a candidate. q_obama = { 'q': 'obama', } [r for r in all_results('/candidates/search', q_obama)] Here's the result: [{u'active_through': 2000, u'candidate_id': u'H0IL01087', u'candidate_status': u'P', u'candidate_status_full': u'Statutory candidate in a prior cycle', u'cycles': [2000], u'district': u'01', u'election_years': [2000], u'incumbent_challenge': None, u'incumbent_challenge_full': u'Unknown', u'name': u'OBAMA, BARACK H', u'office': u'H', u'office_full': u'House', u'party': u'DEM', u'party_full': u'Democratic Party', u'principal_committees': [{u'candidate_ids': [u'H0IL01087'], u'committee_id': u'C00347583', u'committee_type': u'H', u'committee_type_full': u'House', u'cycles': [2000, 2002, 2004], u'designation': u'P', u'designation_full': u'Principal campaign committee', u'expire_date': None, u'first_file_date': None, u'last_file_date': u'2004-10-13T00:00:00+00:00', u'name': u'OBAMA FOR CONGRESS 2000', u'organization_type': None, u'organization_type_full': None, u'party': u'DEM', u'party_full': u'Democratic Party', u'state': u'IL', u'treasurer_name': u'LIONEL BOLIN'}], u'state': u'IL'}, {u'active_through': 2010, u'candidate_id': u'S4IL00180', u'candidate_status': u'C', u'candidate_status_full': u'Statutory candidate', u'cycles': [2004, 2006, 2008, 2010], u'district': None, u'election_years': [2004, 2010], u'incumbent_challenge': u'I', u'incumbent_challenge_full': u'Incumbent', u'name': u'OBAMA, BARACK', u'office': u'S', u'office_full': u'Senate', u'party': u'DEM', u'party_full': u'Democratic Party', u'principal_committees': [{u'candidate_ids': [u'S4IL00180'], u'committee_id': u'C00411934', u'committee_type': u'S', u'committee_type_full': u'Senate', u'cycles': [2006, 2008, 2010], u'designation': u'P', u'designation_full': u'Principal campaign committee', u'expire_date': u'2015-05-11T00:00:00+00:00', u'first_file_date': u'2005-05-25T00:00:00+00:00', u'last_file_date': u'2009-10-19T00:00:00+00:00', u'name': u'OBAMA 2010 INC', u'organization_type': None, u'organization_type_full': None, u'party': u'DEM', u'party_full': u'Democratic Party', u'state': u'IL', u'treasurer_name': u'HARVEY S WINEBERG'}, {u'candidate_ids': [u'S4IL00180'], u'committee_id': u'C00381442', u'committee_type': u'S', u'committee_type_full': u'Senate', u'cycles': [2002, 2004, 2006], u'designation': u'P', u'designation_full': u'Principal campaign committee', u'expire_date': None, u'first_file_date': u'2002-08-22T00:00:00+00:00', u'last_file_date': u'2005-08-05T00:00:00+00:00', u'name': u'OBAMA FOR ILLINOIS INC', u'organization_type': None, u'organization_type_full': None, u'party': u'DEM', u'party_full': u'Democratic Party', u'state': u'IL', u'treasurer_name': u'HARVEY S. WINEBERG'}], u'state': u'IL'}, {u'active_through': 2012, u'candidate_id': u'P80003338', u'candidate_status': u'C', u'candidate_status_full': u'Statutory candidate', u'cycles': [2008, 2010, 2012], u'district': None, u'election_years': [2008, 2012], u'incumbent_challenge': u'I', u'incumbent_challenge_full': u'Incumbent', u'name': u'OBAMA, BARACK', u'office': u'P', u'office_full': u'President', u'party': u'DEM', u'party_full': u'Democratic Party', u'principal_committees': [{u'candidate_ids': [u'P80003338'], u'committee_id': u'C00431445', u'committee_type': u'P', u'committee_type_full': u'Presidential', u'cycles': [2008, 2010, 2012, 2014, 2016], u'designation': u'P', u'designation_full': u'Principal campaign committee', u'expire_date': u'2015-05-11T00:00:00+00:00', u'first_file_date': u'2007-01-16T00:00:00+00:00', u'last_file_date': u'2013-01-31T00:00:00+00:00', u'name': u'OBAMA FOR AMERICA', u'organization_type': None, u'organization_type_full': None, u'party': u'DEM', u'party_full': u'Democratic Party', u'state': u'IL', u'treasurer_name': u'NESBITT, MARTIN H'}], u'state': u'US'}] Wait, there are three Barack Obamas? Well, not quite. The FEC assigns an identifier each time someone runs for a particular office. Obama has an FEC ID that starts with `P` because he ran for president, but also picked up two more when he ran for seats in the House (`H`) and Senate (`S`).

The FEC data doesn’t do any formal reconciliation of these records, so it’s something to look out for when you’re looking at someone’s history. For instance, if we were to use `P80003338` to look up Obama’s history using the `/candidate/{candidate_id}/history` endpoint, we might expect to see those other identifiers somewhere. Unfortunately, that’s not the case:

[r for r in all_results(‘/candidate/P80003338/history’, {})]

Here’s the result:

[{u’address_city’: u’CHICAGO’, u’address_state’: u’IL’, u’address_street_1′: u’PO BOX 8102′, u’address_street_2′: None, u’address_zip’: u’60680′, u’candidate_id’: u’P80003338′, u’candidate_inactive’: None, u’candidate_status’: u’C’, u’candidate_status_full’: u’Statutory candidate’, u’cycles’: [2008, 2010, 2012], u’district’: None, u’election_years’: [2008, 2012], u’expire_date’: None, u’form_type’: u’F2Z’, u’incumbent_challenge’: u’I’, u’incumbent_challenge_full’: u’Incumbent’, u’load_date’: u’2015-05-11T12:15:43+00:00′, u’name’: u’OBAMA, BARACK’, u’office’: u’P’, u’office_full’: u’President’, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’state’: u’US’, u’two_year_period’: 2012}, {u’address_city’: u’CHICAGO’, u’address_state’: u’IL’, u’address_street_1′: u’PO BOX 8102′, u’address_street_2′: None, u’address_zip’: u’60680′, u’candidate_id’: u’P80003338′, u’candidate_inactive’: None, u’candidate_status’: u’C’, u’candidate_status_full’: u’Statutory candidate’, u’cycles’: [2008, 2010, 2012], u’district’: None, u’election_years’: [2008, 2012], u’expire_date’: u’2015-05-11T00:00:00+00:00′, u’form_type’: u’F2′, u’incumbent_challenge’: u’O’, u’incumbent_challenge_full’: u’Open seat’, u’load_date’: u’2015-05-11T12:15:43+00:00′, u’name’: u’OBAMA, BARACK’, u’office’: u’P’, u’office_full’: u’President’, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’state’: u’US’, u’two_year_period’: 2010}, {u’address_city’: u’CHICAGO’, u’address_state’: u’IL’, u’address_street_1′: u’PO BOX 8102′, u’address_street_2′: None, u’address_zip’: u’60680′, u’candidate_id’: u’P80003338′, u’candidate_inactive’: None, u’candidate_status’: u’C’, u’candidate_status_full’: u’Statutory candidate’, u’cycles’: [2008, 2010, 2012], u’district’: None, u’election_years’: [2008, 2012], u’expire_date’: u’2015-05-11T00:00:00+00:00′, u’form_type’: u’F2′, u’incumbent_challenge’: u’O’, u’incumbent_challenge_full’: u’Open seat’, u’load_date’: u’2015-05-11T12:15:43+00:00′, u’name’: u’OBAMA, BARACK’, u’office’: u’P’, u’office_full’: u’President’, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’state’: u’US’, u’two_year_period’: 2008}]

### Listing

We can also obtain a list of many candidates, applying optional filtering constraints if we don’t want the entire list. This can be done at the `/candidates` endpoint. The metadata in the records returned can help when building a local reference resource or lookup table.

q_all_2012_candidates = { “cycle”: 2012, }

This query is going to return quite a lot of candidates:

count_results(‘/candidates’, q_all_2012_candidates)

Here’s the result:

3024

You can limit the list by specifying the `candidate_status`. Most of the time, what we care about are candidates with `candidate_status=C`, which means they are a declared candidate who has raised at least $5,000 in that cycle.

q_all_2012_present_candidates = { “cycle”: 2012, “candidate_status”: “C” }

count_results(‘/candidates’, q_all_2012_present_candidates)

Here’s the result:

1885

It’s true that we’re looking at all federal races in 2012, but that’s still a pretty big number. Let’s pull that data down and see how it looks.

candidates_2012 = [c for c in all_results(‘/candidates’, q_all_2012_present_candidates)]

Picking one at “random”:

[c for c in candidates_2012 if ‘OBAMA’ in c[‘name’]]

Here’s the result:

[{u’active_through’: 2012, u’candidate_id’: u’P80003338′, u’candidate_status’: u’C’, u’candidate_status_full’: u’Statutory candidate’, u’cycles’: [2008, 2010, 2012], u’district’: None, u’election_years’: [2008, 2012], u’incumbent_challenge’: u’I’, u’incumbent_challenge_full’: u’Incumbent’, u’name’: u’OBAMA, BARACK’, u’office’: u’P’, u’office_full’: u’President’, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’state’: u’US’}]

For ease of use and demonstration, let’s convert the results to a Pandas DataFrame:

candidates_2012_df = pd.DataFrame(candidates_2012) candidates_2012_df.head()

active_through candidate_id candidate_status candidate_status_full cycles district election_years incumbent_challenge incumbent_challenge_full name office office_full party party_full state 0 2012 S2UT00229 C Statutory candidate [2012] None [2012] C Challenger AALDERS, TIMOTHY NOEL S Senate REP Republican Party UT 1 2012 H2CA01110 C Statutory candidate [2012] 01 [2012] C Challenger AANESTAD, SAMUEL H House REP Republican Party CA 2 2012 H2AZ02279 C Statutory candidate [2012] 02 [2012] C Challenger ABOUD, PAULA ANN H House DEM Democratic Party AZ 3 2012 H2CA25176 C Statutory candidate [2012] 25 [2012] C Challenger ACOSTA, DANTE H House REP Republican Party CA 4 2014 H8NC03043 C Statutory candidate [2008, 2010, 2012, 2014] 03 [2008, 2014] C Challenger ADAME, MARSHALL RICHARD H House DEM Democratic Party NC

Since we had some high counts, let’s look at how they break down (note the log scale on the x axis).

candidates_2012_df.pivot_table( index=’party’, columns=’office’, values=’candidate_id’, aggfunc=np.size ).plot( kind=’barh’, subplots=True, figsize=(6,10), logx=True, legend=False, xticks=[1, 10, 100, 1000] )

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/0a/30/faa7fc064388/candidate__count_by_office_and_party-800.png)

So while these numbers seem a bit higher than you might expect, they’re in the right proportion: Democrats and Republicans are the most common parties (at least among the congressional candidates), there are far more candidates for house than there are for senate and candidates for president make up the smallest population. Still, why are there so many more candidates than we remember seeing in 2012?

The answer is that, while mainstream election coverage typically focuses on candidates that are likely to be competitive and/or associated with a major national party, the FEC is responsible for reporting the campaign finance records for everyone who registers with the FEC as a candidate. As a result, it’s a much higher number than many people perceive.

### Focusing on select entities

Let’s look at the names of those candidates who raised more than $5000 in a bid for the oval office:

q_all_2012_present_prez_candidates = { “cycle”: 2012, “candidate_status”: “C”, “office”: “P”, }

count_results(‘/candidates’, q_all_2012_present_prez_candidates)

Here’s the result:

39

Now, we’ll build the DataFrame:

prez_candidates_2012 = [c for c in all_results(‘/candidates’, q_all_2012_present_prez_candidates)] prez_candidates_2012_df = pd.DataFrame(prez_candidates_2012) prez_candidates_2012_df[[‘name’,’party’,’candidate_id’]].sort(‘party’)

name party candidate_id 12 GOODE, VIRGIL H JR 999 P20004685 6 CARTER, WILLIE FELIX DEM P80000268 14 HERMAN, RAPHAEL DEM P20002184 23 OBAMA, BARACK DEM P80003338 27 RICHARDSON, DARCY G DEM P20001376 22 MESPLAY, KENT P GRE P40003279 33 STEIN, JILL GRE P20003984 11 FARNSWORTH, VERL IND P20002853 21 MCCALL, JAMES HATTON IND P80003361 25 RAKOWITZ, ARTHUR FABIAN IND P20003448 28 RISLEY, MICHEALENE CRISTINI IND P20004727 34 TERRY, RANDALL A IND P20002424 35 WELLS, ROBERT CARR JR IND P20004065 37 WIFORD, SAMUEL TIMOTHY II IND P20003489 13 HARRIS, RICHARDJASON SATAWK LIB P20003364 38 WRIGHTS, ROGER LEE LIB P20002952 5 BROWN, HARLEY D NNE P00004275 17 KOTLIKOFF, LAURENCE J NNE P20004511 26 REED, JILL ANN NNE P20003208 31 ROTH, CECIL JAMES NNE P20003836 1 ANDERSON, ROSS C (ROCKY) OTH P20004263 2 BARR, ROSEANNE CHERRI OTH P20002804 10 DURHAM, STEPHEN OTH P20004651 20 LOPEZ, CHRISTINA (VICE PRES) OTH P20004669 29 ROEMER, CHARLES E. ”BUDDY” III OTH P20002523 36 WHITE, JEROME S OTH P20004677 0 ADESHINA, YINKA ABOSEDE REP P60004793 3 BLANKENSHIP, JARED REP P20002598 7 CISNEROS, CESAR REP P20002390 8 DAVIS, L JOHN JR REP P20002325 9 DRUMMOND, KEITH REP P20003430 15 HILL, CHRISTOPHER V REP P20002838 16 KARGER, FRED REP P20002564 18 LAWSON, EDGAR A REP P20003950 24 PAUL, RON REP P80000748 30 ROMNEY / PAUL D. RYAN, MITT REP P80003353 4 BLOCK, JEFF UNK P20003398 19 LINDSAY, PETA UNK P20004636 32 SCHRINER, JOSEPH CHARLES UNK P00003962

Yep, that’s quite a large field. Keep this in mind when pulling your data: You’ll probably want to make editorial choices about which candidates you’d like to focus on. That could be as easy as filtering your results after obtaining them from the API:

candidates_to_focus_on = [‘PAUL, RON’, ‘OBAMA, BARACK’, ‘ROMNEY / PAUL D. RYAN, MITT’]

candidate_filter = prez_candidates_2012_df.name.str.match( ‘|’.join(candidates_to_focus_on), case=False)

prez_candidates_2012_df[candidate_filter].T

23 24 30 active_through 2012 2012 2012 candidate_id P80003338 P80000748 P80003353 candidate_status C C C candidate_status_full Statutory candidate Statutory candidate Statutory candidate cycles [2008, 2010, 2012] [1988, 1990, 1992, 1994, 1996, 1998, 2000, 200… [2008, 2010, 2012] district None None None election_years [2008, 2012] [1988, 1990, 2008, 2012] [2008, 2012] incumbent_challenge I C C incumbent_challenge_full Incumbent Challenger Challenger name OBAMA, BARACK PAUL, RON ROMNEY / PAUL D. RYAN, MITT office P P P office_full President President President party DEM REP REP party_full Democratic Party Republican Party Republican Party state US US US

If you plan to regularly update your data, though, you might want to store the identifiers for the entities you’re interested in and use those for future API calls.

q_my_2012_prez_candidates = { “cycle”: 2012, “candidate_status”: “C”, “office”: “P”, “candidate_id”: [‘P80003338’, ‘P80000748’, ‘P80003353’, ‘P20002523’, ‘P20003984’] }

my_2012_prez_candidates = [c for c in all_results(‘/candidates’, q_my_2012_prez_candidates)] my_2012_prez_candidates_df = pd.DataFrame(my_2012_prez_candidates) my_2012_prez_candidates_df.T

0 1 2 3 4 active_through 2012 2012 2012 2012 2012 candidate_id P80003338 P80000748 P20002523 P80003353 P20003984 candidate_status C C C C C candidate_status_full Statutory candidate Statutory candidate Statutory candidate Statutory candidate Statutory candidate cycles [2008, 2010, 2012] [1988, 1990, 1992, 1994, 1996, 1998, 2000, 200… [2012] [2008, 2010, 2012] [2012] district None None None None None election_years [2008, 2012] [1988, 1990, 2008, 2012] [2012] [2008, 2012] [2012] incumbent_challenge I C C C C incumbent_challenge_full Incumbent Challenger Challenger Challenger Challenger name OBAMA, BARACK PAUL, RON ROEMER, CHARLES E. ”BUDDY” III ROMNEY / PAUL D. RYAN, MITT STEIN, JILL office P P P P P office_full President President President President President party DEM REP OTH REP GRE party_full Democratic Party Republican Party Other Republican Party Green Party state US US US US US

## Using identifiers to obtain candidate data

If we want to know more about a given candidate, we have some options. Using the `candidate_id` field, we can make requests to the `/candidate` endpoint to get a detailed profile. Note that the identifier needs to be included as part of the path, not as a GET argument.

[r for r in all_results(‘/candidate/P80003338′,{})]

Here’s the result:

[{u’active_through’: 2012, u’address_city’: u’CHICAGO’, u’address_state’: u’IL’, u’address_street_1′: u’PO BOX 8102′, u’address_street_2′: None, u’address_zip’: u’60680′, u’candidate_id’: u’P80003338′, u’candidate_inactive’: None, u’candidate_status’: u’C’, u’candidate_status_full’: u’Statutory candidate’, u’cycles’: [2008, 2010, 2012], u’district’: None, u’election_years’: [2008, 2012], u’expire_date’: None, u’form_type’: u’F2Z’, u’incumbent_challenge’: u’I’, u’incumbent_challenge_full’: u’Incumbent’, u’load_date’: u’2015-05-11T12:15:43+00:00′, u’name’: u’OBAMA, BARACK’, u’office’: u’P’, u’office_full’: u’President’, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’state’: u’US’}]

### Looking up candidate committees

Let’s continue to look at those presidential candidates. How much did each one raise in 2012? We can start to answer that question by looking at their committees, using the following endpoint:

/candidate/{candidate_id}/committees/history/{cycle}

Let’s look up the committees associated with Barack Obama.

count_results(‘/candidate/P80003338/committees’,{‘cycle’:2012})

Here’s the result:

21

Hm, that’s odd. He probably didn’t have 21 committees.

[r[‘name’] for r in all_results(‘/candidate/P80003338/committees’,{‘cycle’:2012})]

Here’s the result:

[u’ALASKAN WOMEN FOR OBAMA’, u’CALIFORNIANS FOR CHANGE’, u’COALITION FOR CHANGE’, u’DC LGBT FOR SECOND TERM’, u’OBAMA – COMMITTEE TO ELECT’, u’OBAMA FOR AMERICA’, u’OBAMA VICTORY FUND’, u’OBAMA VICTORY FUND 2012′, u’PA MOVING FORWARD’, u’REALISTIC AND TRUTHFUL’, u’SUPPORT THE PREZ’, u’SWING STATE VICTORY FUND’, u’WNC FOR CHANGE’, u’YES WE CAN NEBRASKA’]

What’s happening here is that the API is returning all committees that claim to be associated with Obama. Some do so because they intended to raise money specifically for him, and others are “Single Candidate Independent Expenditure” groups. Most, though, are of designation “Unauthorized”.

[(r[‘designation_full’],r[‘committee_type_full’]) for r in all_results(‘/candidate/P80003338/committees’,{‘cycle’:2012})]

Here’s the result:

[(u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Unauthorized’, u’PAC – Nonqualified’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Principal campaign committee’, u’Presidential’), (u’Joint fundraising committee’, u’PAC – Nonqualified’), (u’Joint fundraising committee’, u’PAC – Nonqualified’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Unauthorized’, u’Single Candidate Independent Expenditure’), (u’Joint fundraising committee’, u’PAC – Nonqualified’), (u’Unauthorized’, u’PAC – Nonqualified’), (u’Unauthorized’, u’PAC – Nonqualified’)]

For now, let’s focus on Obama’s principal campaign committee. We can limit the results using the `designation` field and the `committee_type` field:

[r for r in all_results(‘/candidate/P80003338/committees’, {‘cycle’:2012, ‘designation’: ‘P’, ‘committee_type’: ‘P’})]

Here’s the result:

[{u’candidate_ids’: [u’P80003338′], u’city’: u’CHICAGO’, u’committee_id’: u’C00431445′, u’committee_type’: u’P’, u’committee_type_full’: u’Presidential’, u’custodian_city’: None, u’custodian_name_1′: None, u’custodian_name_2′: None, u’custodian_name_full’: None, u’custodian_name_middle’: None, u’custodian_name_prefix’: None, u’custodian_name_suffix’: None, u’custodian_name_title’: None, u’custodian_phone’: None, u’custodian_state’: None, u’custodian_street_1′: None, u’custodian_street_2′: None, u’custodian_zip’: None, u’cycles’: [2008, 2010, 2012, 2014, 2016], u’designation’: u’P’, u’designation_full’: u’Principal campaign committee’, u’email’: u’OFAFEC@BARACKOBAMA.COM’, u’expire_date’: u’2015-05-11T00:00:00+00:00′, u’fax’: None, u’filing_frequency’: u’Q’, u’first_file_date’: u’2007-01-16T00:00:00+00:00′, u’form_type’: u’F1Z’, u’last_file_date’: u’2013-01-31T00:00:00+00:00′, u’leadership_pac’: None, u’load_date’: u’2015-05-11T12:36:16+00:00′, u’lobbyist_registrant_pac’: None, u’name’: u’OBAMA FOR AMERICA’, u’organization_type’: None, u’organization_type_full’: None, u’party’: u’DEM’, u’party_full’: u’Democratic Party’, u’party_type’: None, u’party_type_full’: None, u’qualifying_date’: None, u’state’: u’IL’, u’state_full’: u’Illinois ‘, u’street_1′: u’PO BOX 8102′, u’street_2′: None, u’treasurer_city’: u’CHICAGO’, u’treasurer_name’: u’NESBITT, MARTIN H’, u’treasurer_name_1′: None, u’treasurer_name_2′: None, u’treasurer_name_middle’: None, u’treasurer_name_prefix’: None, u’treasurer_name_suffix’: None, u’treasurer_name_title’: u’TREASURER’, u’treasurer_phone’: u’3129851700′, u’treasurer_state’: u’IL’, u’treasurer_street_1′: u’PO BOX 8102′, u’treasurer_street_2′: None, u’treasurer_zip’: u’60680′, u’website’: u’HTTP://WWW.BARACKOBAMA.COM’, u’zip’: u’60680′}]

We’ll have to combine multiple API calls to get everyone we care about.

my_2012_prez_committees = []

for i, row in my_2012_prez_candidates_df.iterrows(): endpoint = ‘/candidate/{c}/committees’.format(c=row.candidate_id) for res in all_results(endpoint, {‘cycle’:2012, ‘designation’: ‘P’, ‘committee_type’: ‘P’}): res[‘candidate_id’] = row.candidate_id my_2012_prez_committees.append(res)

my_2012_prez_committees_df = pd.DataFrame(my_2012_prez_committees) my_2012_prez_committees_df[[‘name’,’committee_id’,’candidate_id’]]

name committee_id candidate_id 0 OBAMA FOR AMERICA C00431445 P80003338 1 RON PAUL 2012 PRESIDENTIAL CAMPAIGN COMMITTEE … C00495820 P80000748 2 BUDDY ROEMER FOR PRESIDENT, INC. C00493692 P20002523 3 ROMNEY FOR PRESIDENT, INC. C00431171 P80003353 4 JILL STEIN FOR PRESIDENT C00505800 P20003984

### Obtaining committee summaries

Now that we have identifiers for the primary campaign committees associated with each candidate, we can obtain some interesting summary information about them. There are two different endpoints for getting financial information:

– `/committee/{committee_id}/totals` (straightforward cycle-wide totals) – `/committee/{committee_id}/reports` (actual reports submitted — advanced content!)

Let’s look at the more straightforward totals endpoint:

my_2012_prez_committee_totals = []

for i, row in my_2012_prez_committees_df.iterrows(): endpoint = ‘/committee/{c}/totals’.format(c=row.committee_id) for res in all_results(endpoint, {‘cycle’:2012}): my_2012_prez_committee_totals.append(res)

my_2012_prez_committee_totals_df = pd.DataFrame(my_2012_prez_committee_totals) my_2012_prez_committee_totals_df[[‘committee_id’,’contributions’,’disbursements’,’receipts’,]]

committee_id contributions disbursements receipts 0 C00431445 549594250 737507855 738503770 1 C00495820 39928730 39968390 41060317 2 C00493692 400036 739453 780900 3 C00431171 304959168 483073478 483452331 4 C00505800 819034 1122027 1263540

Merging these facts together with the metadata that we’ve already collected, we can start to produce some good comparisons:

comparison = my_2012_prez_committees_df.set_index(‘committee_id’).join( my_2012_prez_committee_totals_df.set_index(‘committee_id’), rsuffix=’.cmte’)

comparison = comparison.set_index(‘candidate_id’).join( my_2012_prez_candidates_df.set_index(‘candidate_id’), rsuffix=’.cand’)

comparison.set_index(‘name.cand’)[[‘disbursements’,’receipts’,]].plot(kind=’barh’)

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/d8/d6/302006cb41f2/committee__total_disbursements_and_reciepts-800.png)

comparison.set_index(‘name.cand’)[ [‘individual_itemized_contributions’, ‘individual_unitemized_contributions’, ‘transfers_from_affiliated_committee’, ‘other_political_committee_contributions’, ‘candidate_contribution’ ] ]

individual_itemized_contributions individual_unitemized_contributions transfers_from_affiliated_committee other_political_committee_contributions candidate_contribution name.cand OBAMA, BARACK 315170951 234409690 181700000 0 5000 PAUL, RON 21916605 18009455 1000500 2670 0 ROEMER, CHARLES E. ”BUDDY” III 374937 0 0 0 25100 ROMNEY / PAUL D. RYAN, MITT 103245581 25499257 146516071 1126219 0 STEIN, JILL 386655 427592 0 1786 0

comparison.set_index(‘name.cand’)[ [ ‘individual_itemized_contributions’, ‘individual_unitemized_contributions’, ‘transfers_from_affiliated_committee’, ‘other_political_committee_contributions’, ] ].plot(kind=’barh’, stacked=True, figsize=(10,10))

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/cc/92/4b231faf4f90/committee__other_totals-800.png)

—

So, there you have it: a brief rundown of the OpenFEC API and some quick pointers on how to use it. The FEC making data available through an API is an encouraging step forward, and though it could use some improvements, we’re excited to see the FEC making positive changes to better educate the public on campaign finance in America.