Late last year, the NY Times published an article quoting a specialist working on the HealthCare.gov web site:

According to one specialist, the Web site contains about 500 million lines of software code. By comparison, a large bank’s computer system is typically about one-fifth that size.

This astronomically large number became the subject of intense criticism over the following months, especially in the wake of HealthCare.gov’s failed initial launch. In particular, a number of software engineering experts questioned whether any software engineering team could realistically produce a code base that large. Despite this, the 500 million lines of code statistic has been uncritically cited worldwide.

Just today, a data visualization poking fun at this statistic made it to the front page of the subreddit /r/dataisbeautiful. Apparently fed up with this horrendously false statistic, one programmer on the HealthCare.gov software development team decided to put it to rest. This programmer ran an automated code count on the HealthCare.gov code base and estimated that the primary code base has only about 3.7 million lines of code. Below is the breakdown of programming languages for that 3.7 million lines of code.

The programmer clarified:

this doesn’t include parts of the system used for administrative tasks.

and

the total number of lines of code controlling the entire system could be anywhere from 5 – 15 million lines of code.
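
For anyone curious what an automated code count can look like, here is a minimal sketch in Python. It is not the tool the HealthCare.gov programmer used (the post doesn't say which one that was), and the function name `count_lines_by_extension` is made up for illustration. It simply walks a directory and tallies non-blank lines per file extension, which is a rough stand-in for a per-language breakdown.

```python
import os
from collections import Counter


def count_lines_by_extension(root):
    """Walk `root` and count non-blank lines per file extension.

    Hypothetical helper for illustration only: it treats the file
    extension as a proxy for the programming language and does not
    try to skip comments or generated files.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    counts[ext] += sum(1 for line in handle if line.strip())
            except OSError:
                # Unreadable files are skipped rather than aborting the count.
                continue
    return counts


if __name__ == "__main__":
    # Print extensions from most to fewest lines for the current directory.
    for ext, total in count_lines_by_extension(".").most_common():
        print(f"{ext}: {total}")
```

A count like this depends heavily on what gets included (libraries, generated code, administrative tooling), which is one reason estimates for the same system can range so widely.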

So there you go — as many of us guessed all along, the 500 million lines of code statistic was utterly bogus. Let’s share this information and put that bad statistic to rest.