We are interested in obtaining and publishing the “open access” Social Security Death Master File (aka Death Index), i.e. the version that excludes people who died within the last 3 years.

It is extremely useful for genealogical and medical research, preventing fraud, etc.

The fields are simple: name (first, middle, last, & suffix); date of birth & death; SSN; validation (Verified: Report verified with a family member or someone acting on behalf of the family / Proof: Death certificate etc observed by SSA); record update date; and record update code (add/change/delete).

Somehow, SSA has never actually made it available. Instead, it only sells the “limited access” DMF through NTIS, at a price of $2.3k + $3.4k/yr.* (The “limited access” version is restricted, and in return it includes people who died in the last 3 years.)

SSA’s latest price estimate to us to provide the “open access” DMF is $5.2k (down from $8.2k).**

We think this is a rather un-“open” approach to providing something that Congress required to be publicly available. We would like to start litigation, but also to pre-pay the requested $5.2k as a surety, so that they cough up the database now and we have a non-hypothetical fee charge to litigate. It would be very interesting to see, for instance, how exactly they spend 150 hours “searching” for a single database file.

We will make all received information publicly available, for free, both as flat files and through a Google BigQuery database. If we win the litigation, and get a ruling that it has to be made available for free, we will also make our hosted version continuously updated by simply having a regular re-FOIA of the deltas. Otherwise, we’ll do so if we can get the cost of obtaining it funded.
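As a rough illustration of what “continuously updated” could mean in practice, here is a minimal sketch of folding a batch of deltas into an existing copy of the data using the add/change/delete record update code. The single-letter codes, the SSN-keyed dictionary, and the `apply_deltas` name are all assumptions made for this example; the actual delta layout SSA delivers may differ.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical encodings of the add/change/delete update codes (assumption).
ADD, CHANGE, DELETE = "A", "C", "D"


def apply_deltas(
    records: Dict[str, dict],
    deltas: Iterable[Tuple[str, str, dict]],
) -> Dict[str, dict]:
    """Return a new SSN-keyed mapping with the deltas applied in order.

    Each delta is (update_code, ssn, fields). The input mapping is not modified.
    """
    updated = dict(records)
    for code, ssn, fields in deltas:
        if code == ADD:
            # A new record replaces whatever was stored under that SSN.
            updated[ssn] = dict(fields)
        elif code == CHANGE:
            # A change overwrites only the fields it carries.
            updated[ssn] = {**updated.get(ssn, {}), **fields}
        elif code == DELETE:
            # Deleting a record that is already absent is treated as a no-op.
            updated.pop(ssn, None)
        else:
            raise ValueError(f"unknown update code: {code!r}")
    return updated
```

Returning a new mapping rather than mutating in place keeps the previous snapshot intact, which is convenient if each refresh is published as its own dated version.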

We already have the full 2014 SSA DMF available for free (we pay the transfer costs of ~2–15¢/GB):

Flat files: https://console.cloud.google.com/storage/browser/fiat-fiendum/ssdmf / <gs://fiat-fiendum/ssdmf>

BigQuery database: fiat-fiendum:ssdmf https://console.cloud.google.com/bigquery?p=fiat-fiendum&d=ssdmf&page=dataset

Our hosted version has been cleaned up to correct various technical errors & inconsistencies in the raw data (and add convenience fields like “lifespan”), and our raw file is pipe delimited to make bulk processing easier.
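For anyone processing the raw file in bulk, a minimal sketch of reading one pipe-delimited line and deriving a lifespan might look like the following. The column order, the ISO `YYYY-MM-DD` date format, and lifespan measured in days are assumptions for illustration only; check them against the actual file before relying on this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed column order (illustration only; verify against the real file).
COLUMNS = ("ssn", "first", "middle", "last", "suffix", "birth", "death", "validation")


@dataclass
class DeathRecord:
    ssn: str
    first: str
    middle: str
    last: str
    suffix: str
    birth: Optional[date]
    death: Optional[date]
    validation: str

    @property
    def lifespan_days(self) -> Optional[int]:
        """Days between birth and death, or None if either date is unknown."""
        if self.birth is None or self.death is None:
            return None
        return (self.death - self.birth).days


def parse_date(text: str) -> Optional[date]:
    """Parse an ISO date; blank or partial dates (e.g. unknown day) become None."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None


def parse_line(line: str) -> DeathRecord:
    """Split one pipe-delimited line into a DeathRecord."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(parts)}")
    row = dict(zip(COLUMNS, parts))
    return DeathRecord(
        ssn=row["ssn"],
        first=row["first"],
        middle=row["middle"],
        last=row["last"],
        suffix=row["suffix"],
        birth=parse_date(row["birth"]),
        death=parse_date(row["death"]),
        validation=row["validation"],
    )
```

Treating unparseable dates as unknown, rather than raising, lets a bulk job keep going past records with incomplete dates while still flagging lines with the wrong number of fields.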