Green Room

C’mon, the ObamaCare website doesn’t have 500 million lines of code

posted at 2:22 pm on October 24, 2013 by

This week is all about debunking ObamaCare-related myths (e.g., “No, really, it’ll work”), so here’s HA’s own resident tech wizard, WordPress expert Mark Jaquith, on the claim that Healthcare.gov was written with 500 million lines of code. Preposterous, says Mark, and it’s equally preposterous that the NYT would have quoted a source to that effect without checking to see if it was plausible.

FYI, that “500 million lines of code” thing comes from an anonymous NYT source who isn’t even claimed to be someone who worked on the project or who had access to the source code.

“Lines of code” is a crappy metric to start with… for example, this is one line of code:

if ( user_is_registered() ) { print "Logged in"; }

This is 6 lines of code:

# Print a notice if the user is logged in
# Note: the user check is cached from earlier
if ( user_is_registered() )
{
    print "Logged in";
}

But they’re both the same code.
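To make the point concrete, here is a minimal Python sketch that counts the two snippets above. The helper names (`physical_lines`, `uncommented_lines`) are made up for illustration, and treating any line that starts with `#` as a comment is an assumption, not how any particular counting tool works.

```python
# Two spellings of the same logic, copied from the examples above.
ONE_LINER = 'if ( user_is_registered() ) { print "Logged in"; }'

SIX_LINER = '''\
# Print a notice if the user is logged in
# Note: the user check is cached from earlier
if ( user_is_registered() )
{
    print "Logged in";
}
'''


def physical_lines(source: str) -> int:
    """Count every non-blank line, comments included."""
    return sum(1 for line in source.splitlines() if line.strip())


def uncommented_lines(source: str, comment_prefix: str = "#") -> int:
    """Count non-blank lines that are not whole-line comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(comment_prefix)
    )


print(physical_lines(ONE_LINER), physical_lines(SIX_LINER))        # 1 6
print(uncommented_lines(ONE_LINER), uncommented_lines(SIX_LINER))  # 1 4
```

Same behavior, three different totals depending on the counting rule, which is why a bare line count says so little.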

But still, even allowing for the vagueness implicit in that type of measurement, 500 million lines of code is insane. Let’s be generous and say that a single programmer can conceive, write, and debug 100 lines of code in an 8-hour workday (for applications of this complexity, the number is usually much, much lower). That’s 5 million coding worker-days. Throw 1,000 coders at it and, at roughly 250 working days a year, you’re looking at 20 years to develop it. No. Just no.
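The back-of-the-envelope math can be checked with a few lines of Python. The function names are hypothetical, and the 250 working days per year is an assumption (about 50 five-day weeks); the other inputs are the figures from the paragraph above.

```python
def worker_days(total_lines: int, lines_per_day: int) -> float:
    """Coder-days needed at a fixed daily output per coder."""
    return total_lines / lines_per_day


def calendar_years(total_lines: int, lines_per_day: int, coders: int,
                   workdays_per_year: int = 250) -> float:
    """Years of elapsed time if the work splits perfectly across coders.

    Assumes zero coordination overhead, which is wildly optimistic.
    """
    return worker_days(total_lines, lines_per_day) / coders / workdays_per_year


print(worker_days(500_000_000, 100))            # 5000000.0
print(calendar_years(500_000_000, 100, 1_000))  # 20.0
```

Even with the generous 100 lines a day and perfect parallelism, the estimate lands at two decades.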

Now this isn’t to say that the thing isn’t likely bloated all to hell… but 500 million lines of code is bullsh*t. No way is that right.

Thoughts, techie readers?


Comments


Interfacing with other agencies’ legacy code is probably accounted for in the 500 mil figure, as well as the use of any code libraries.

But yeah, that 500 mil figure is preposterous.

Hendo on October 24, 2013 at 6:04 PM

the only reason you have to debug code is if you wrote crappy code in the first place.

look at the development of a swing.

RonK on October 24, 2013 at 6:19 PM

I doubt it.

But only because with a product this screwed-up, I’m betting good money it’s very poorly commented.

PackerBronco on October 24, 2013 at 6:40 PM

I’ve looked at that site a bit, and no doubt it’s complicated, but it’s not THAT complicated. The number of lines of code is even more ridiculous than you’re saying, ALLAHPUNDIT.

Every time a person uses software to generate a site, for example Dreamweaver, they sit at a graphical interface, move stuff around, type in a few things, muddle with parameters, and then save it. Hundreds of lines of code are generated in stylesheets, JavaScript, other CGI and HTML, and they’ve done relatively little work. Your number, “100 lines per day,” is very arbitrary and likely completely inaccurate. One coder could EASILY generate 10,000 lines of code in an hour if he/she added in some scripts and used web development software.

Having worked as a web developer since 1997, I am rather disgusted by the financials on this site. Which is to say, more disgusted than I would be as a general citizen of this once-republic.

There is NOTHING on that site that I’m seeing that’s anywhere near complicated enough to require the hundreds of millions of dollars of development costs. Connecting to databases, searching databases, and calculating based on results is NOT that insanely hard. I see Oracle engineers doing far more interesting and complicated things every day in the technology park where I work.

The scandal of this is several orders of magnitude worse than what you’re saying.

WashingtonsWake on October 24, 2013 at 6:43 PM

PackerBronco, I suspect there are lots of autogenerated comments.

WashingtonsWake on October 24, 2013 at 6:44 PM

Also, notice this: I just went to healthcare.gov and found that the ‘frontpage’ has 1639 lines of ‘code’. Much of that code is library code built into their rather monstrous template that will be carried onto every page.

If they count lines that way and assume that the site has at least 500 pages, and the Spanish version is a separate page as well, we’re looking at about 2.5 million lines of code without even seeing anything in the underbelly.

It’s a ridiculous way to talk about a site’s complexity, since a change to the template will change the code on every single page on the site and isn’t terribly complicated at all.

In other words, some lame-ass is using big numbers to deceive and mislead.

WashingtonsWake on October 24, 2013 at 6:50 PM

I have been in IT for 30 years and the largest single system I’ve seen was around 12 million lines of code. Systems of this size take 4-5 years to build. I agree that the 500 million quote is nonsense. Nevertheless, this system of systems had no chance of success in the time they were given. Even the best circumstances only gave them 3.5 years. 5 years would be better and you’d want a pilot roll out before the “big bang” roll out. They compressed the schedule too much and probably tried to mitigate by staffing up but that has been proven to make things worse. This failure is not a surprise to me.

BillyWilly on October 24, 2013 at 8:32 PM

Who are you going to believe? The NYT or Obama’s lying eyes?

HiJack on October 24, 2013 at 9:13 PM

500 million lines is BS. The issue is much more probably poorly written interfaces to other systems and the likelihood that some of these other systems return unexpected results and/or take too long to return the expected answer. Some of the issues may actually reside in other systems (IRS, VA, etc.) that return “bad” data or are slower than expected. So the analysis now becomes what changes to other systems need to be made to get “good” data back in a timely fashion.

EA_MAN on October 24, 2013 at 9:52 PM

As most of the posters here indicate, lines of code isn’t really a useful metric… Nobody typed this turd in Wordpad. Huge chunks were copied and pasted from other sources.

No, the important metric here is the price tag… FOX had a developer on TV that said he didn’t know all the backend requirements, but he estimated off the cuff that if his company built this, the cost would be roughly one million dollars, and it would have WORKED…

This is one of my issues with bloated government… There is ZERO incentive to watch the bottom line, ergo, a 17 trillion dollar debt.

I’ve been telling people for days that even with my limited HTML and PHP knowledge, I could have done this project for ten million… I would have found a black belt web development firm with excellent credentials and references and examples of successful projects, paid them four million to do the development, offered the team a one million dollar bonus for bringing the project in on time and on budget (thoroughly tested to make certain the site functions as required), and pocketed five million for my oversight of the project.

But who listens to a dumbass taxi driver…?

PointnClick on October 24, 2013 at 10:03 PM

Thoughts, techie readers?

If they actually have 500 million lines of code, it’s pretty damn clear they don’t know how to code. Of course, to know that you could also just as easily look at a few news reports.

Stoic Patriot on October 24, 2013 at 10:08 PM

Actually, by standard software metrics, both examples are generally considered 2 lines of code. Comments and spacing for parentheses aren’t counted, and typically they try to count by the number of executing units, rather than code lines.

If you don’t, something like this:

foo(x) ? bar(y?a(x):b(x)): wha.this();

would only count as one line of code when, in actuality, I’m pretty sure I’ve gotten it horribly horribly wrong, and it will attempt to do many strange things if you wander by it.
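As a rough illustration of counting executing units instead of physical lines, here is a sketch using Python’s standard `ast` module. The snippets are Python translations of the examples in the post and of the one-liner above (an assumption, since the originals are not Python), and treating each statement node as one unit is just one possible convention.

```python
import ast

# Python translations of the post's two examples (illustrative only).
ONE_LINER = 'if user_is_registered(): print("Logged in")'
MULTI_LINE = '''\
# Print a notice if the user is logged in
# Note: the user check is cached from earlier
if user_is_registered():
    print("Logged in")
'''

# A rough Python rendering of the nested conditional one-liner.
NESTED = "bar(a(x) if y else b(x)) if foo(x) else wha.this()"


def logical_statements(source: str) -> int:
    """Count statement nodes; comments and layout never reach the AST."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


def conditional_expressions(source: str) -> int:
    """Count inline conditionals hiding inside expressions."""
    return sum(isinstance(node, ast.IfExp) for node in ast.walk(ast.parse(source)))


print(logical_statements(ONE_LINER), logical_statements(MULTI_LINE))  # 2 2
print(logical_statements(NESTED), conditional_expressions(NESTED))    # 1 2
```

Both spellings of the login check come out as two statements, while the nested one-liner is a single statement carrying two branch points.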

Voyager on October 24, 2013 at 10:19 PM

Thoughts, techie readers?

There are worthless conversations – and completely pointless ones.

This post falls into the latter category.

No one talking or quoted in this post has actually SEEN the program(s)

Therefore – none of you know anything whatsoever.

I thought 16 Trillion was an unreachable number. I was wrong – this administration proved it.

There were AT LEAST FOUR Computer Firms involved. With lots of employees. And an endless government budget.

That many lines of code was definitely within reach.

williamg on October 24, 2013 at 10:20 PM

You’re assuming all the code was written custom for this project.

It is very typical in large and small projects to grab pre-written code and string it together into a larger project. Lawyers do this as well with contracts. You tell a lawyer to write a contract, does he do it from scratch? No. He takes a pre-written contract that largely approximates what he wants and then modifies it to fit. This is standard practice in many programming applications ESPECIALLY with websites. Most websites have VERY similar code.

A really good example would be the hotair website itself. This site uses “WordPress” which is a webdesign package that incorporates a lot of different modules. Changing things around is as simple as installing one prepackaged module and replacing it with another along with perhaps 10 minutes of configuration.

So how many lines of code does the hotair site have? A great many more than anyone working for hotair has ever written. They used either free use code or they used licensed code that was probably pretty inexpensive.

When you look at something like the Obamacare site, best practice would probably be to do something similar. You COULD custom write everything. But why? It just wastes money. You’re basically reinventing the wheel.

So could the final site have 500 million lines of code? Easily. I could make a site that has 500 million lines of code right now in about 10 minutes. Would I have audited even a fraction of that code? Of course not. But it’s there.

Think of it a bit like the healthcare bill itself. They can pass a bill without reading it. Same thing. Only in this case most websites are actually made this way. Large chunks of pre-written code are strung together into a cohesive whole. The only coding required being a knowledge of what chunks to string together and then writing the code that connects them and configures them.

Karmashock on October 24, 2013 at 10:22 PM

Interesting read about the Space Shuttle software, from 1996.

They Write the Right Stuff
But how much work the software does is not what makes it remarkable. What makes it remarkable is how well the software works. This software never crashes. It never needs to be re-booted. This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats: the last three versions of the program, each 420,000 lines long, had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

“the shuttle software group is one of just four outfits in the world to win the coveted Level 5 ranking of the federal government’s Software Engineering Institute (SEI), a measure of the sophistication and reliability of the way they do their work. In fact, the SEI based its standards in part on watching the on-board shuttle group do its work.”

And Obama hired Script Kiddies

BDU-33 on October 24, 2013 at 10:45 PM

The lines-of-code metric typically comes from what the compiler reports. That includes every line pulled in from libraries (written by other developers, purchased, part of the particular language, or open source). That number isn’t out of line considering the complexity of what they are building, at least based upon my experience developing applications with a similar level of complexity.

peregrin on October 24, 2013 at 10:58 PM

My first thought is in line with what many here are saying, that it certainly isn’t lines of unique code written explicitly for the website and only for the website. Maybe they decided to use the Boost library, which has about 20 million lines of code. And Apache, which has about 2 million. You can get up to 500 million pretty quickly with enough packages, especially if you double-count, which might be appropriate if you’re using different versions for different pieces on different systems. Or for having the same JavaScript files included (or auto-generated) in different ways on different pages. You might then say, “Well, that’s not a fair count, then,” but whoever gave the number was probably trying to give some sense of the magnitude of debugging it, not the magnitude of writing it. You never know which of those 500 million lines is steering you wrong, though, in practice, it’s more often your own than that of a well-tested module you decided to include. Buckland might be right that they just saw how much space was taken and calculated code accordingly, including the aforementioned libraries.

My second thought is that this wouldn’t be the first time Obama or his administration has confused orders of magnitude, getting numbers horribly wrong.

By the way, comparing the site to the Linux kernel is very unfair. The kernel was designed to be small. This site was designed to government specifications, which is another matter altogether, and probably more of the problem than incompetence on the developers’ part.

calbear on October 24, 2013 at 11:05 PM

I can imagine creating 500,000,000 lines of (ineffectively flailing) code. Conceive of a project that needs 50,000,000 lines of code, then, to communicate to outside processes, every tenth line (on average) invoke a macro that expands in-line to 101 lines. Total is now over 500,000,000 lines. Why we don’t use a function call or a subroutine doesn’t matter; requirements forbid them for some reason.

Documentation? Ha! In a way, that’s the problem. I’ve long preached “Implement the documentation”, by which I meant taking the requirements documents and expanding them to the point that they contained the design and programming specs, and all there was left to do was code those. In this case, the requirements are those God knows how many pages of law and regulation (which are probably incomplete, inconsistent, and at times completely contradictory). Good luck with writing a requirements document.

Any day I expect to see an advertisement for “tech-surge” programmers who know how to “Use the Force!”, and when to “Cross circuit to B!”

htom on October 24, 2013 at 11:38 PM

Yeah 500 million lines of this stuff
Rep nop

stuartm80127 on October 25, 2013 at 12:03 AM

the only reason you have to debug code is if you wrote crappy code in the first place.

look at the development of a swing.

RonK on October 24, 2013 at 6:19 PM

This is less true now than it used to be. With object-oriented programming being used in more and more projects, it’s becoming increasingly common for specialized programming teams to work on discrete “objects” for complex coding. Those objects have to be able to work with each other in the greater program, which is why you don’t see some errors until runtime. It’s also why there are always debug stages in programming, even if the bugs caught are trivial.

gryphon202 on October 25, 2013 at 6:50 AM

The most complex software in the world is generally thought to have on the order of tens of millions of lines of code (Windows, Linux, etc.), so the thought of 500 million lines for a DOA web application is highly unlikely. Even a million lines of code (quality code, that is) takes a very long time to write, debug, and test, certainly more than what was given to the contractors.

And for everyone who says LOC is not a valuable metric… well obviously. But it exists and people like to talk about it, so chill out.

raxx on October 25, 2013 at 8:40 AM

So what’s the big deal in all of this? Just have Al Gore fix it. He invented the internet, remember?

NoPain on October 24, 2013 at 2:50 PM

I think he only invented some of the internets

neuquenguy on October 25, 2013 at 9:32 AM

My friendly IT guy says it’s more likely 5 million lines. Which he says is still a Hell of a lot of code to go through to debug.

One point: a half-billion LOCs, or even five million, doesn’t guarantee that all of them are necessary, or even working. Test data, uncompleted sections, abandoned sections (see City of Heroes), and cut-and-paste of freeware (or “pirated” ware) all count. As does duplication of effort by different “compartmentalized” units who weren’t cross-checking with each other. This was developed in extreme (some would say paranoid) secrecy by the O-bots, and coordination between units (who were regarded as “outsiders” and not to be fully trusted; note their remarks on “talking to Congress”) was strongly discouraged.

Which means that instead of one module to “draw a circle”, there might be thirty or forty duplicates of the exact same code in different sections, each one pasted in by a developer who didn’t know ten other people in ten other units were doing the exact same thing.

Add in the apparent use of outdated software that’s prone to “spaghetti coding” anyway, and you have the makings of a clusterf**k no matter how many, or how few, actual LOCs are involved.

clear ether

eon

eon on October 25, 2013 at 10:52 AM

Thoughts, techie readers?

Recall that they already got busted for using DataTables (a very large piece of software) without including the copyright notice, so they didn’t exactly get to 500 million by starting at zero.

Ronnie on October 25, 2013 at 11:09 AM

I thought it was 5 million, not 500 million?

Mimzey on October 25, 2013 at 9:23 PM

No one so far has addressed any software testing, which needs to be done by an entity entirely independent of the SW development team. Testing by the originating team is like incest – never good.

Assume a part of the code requires 10 software processes, each of which can invoke any of 10 more nested processes, and each of those, 10 more. 30 processes now spawn 10 x 10 x 10 or 1,000 tests!! Now, put that into a macro regime and just the testing can take months!!
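The multiplication above can be sketched in Python by enumerating every chain of calls. The function name `call_paths` is hypothetical, and the model assumes exactly one process is chosen at each of the three layers, with each distinct chain needing its own test.

```python
from itertools import product


def call_paths(levels: int, fanout: int) -> list:
    """Enumerate every chain of choices through `levels` nested layers.

    Each tuple holds one process index per layer, e.g. (3, 0, 7).
    """
    return list(product(range(fanout), repeat=levels))


paths = call_paths(levels=3, fanout=10)
print(len(paths))   # 1000
print(paths[0])     # (0, 0, 0)
print(paths[-1])    # (9, 9, 9)
```

Thirty processes, a thousand distinct chains, and the count multiplies by ten again for every extra layer of nesting.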

I have been there, and it’s not real pretty.

crankybutt on October 25, 2013 at 9:47 PM

First of all the appropriate metric is “Lines of UNCOMMENTED Source Code.” That means that all the comments in the example don’t count.

Second, this is only a first-order metric at best. 500 lines of well-organized, linear code can be a snap to deal with. 50 lines of tangled code with interlocking data dependencies can be a nightmare.

Third, we’re not just talking about the website, and the website is not just what’s loaded into the browser. The back end of this brobdingnagian gobbler has to communicate with hundreds of databases run by dozens of government agencies and dozens or hundreds of private companies, many of which have several different IT fiefdoms each determined not to let anyone do anything the others’ way. Every communication is a transaction, every communication has to work right, and the entire dance of incompatible and uncooperative systems has to work with the precision of a ballerina on a balance sheet.

When you add up all the code involved, meaning all the code that may have to have bits and pieces added and tweaked and (re)-tested, 500 megalines does not sound out of line. And to make even 500 kilolines work correctly under these circumstances would be a feat to equal Project Apollo, the Manhattan Project, and the building of the Pentagon, all managed out of one spreadsheet.

I won’t wish them luck on the work. I won’t wish that they don’t lose their jobs. But I can’t, in good conscience, wish them the heart attacks, panic attacks, strokes, mental breakdowns, cirrhotic livers, and divorces that this is going to cost if they are really determined to fix it.

njcommuter on October 25, 2013 at 10:00 PM

I know that this was not a DoD contract, but if it was, all DoD contracts requiring new, unique software invoke the Carnegie Mellon Software Engineering Institute’s rigid requirements for development and testing. Many large defense projects use millions of lines of code each. (The SEI is formed as an FFRDC, a Federally Funded R&D Center, and was the result of many early SW programs that ended up like this HHS debacle, ending in defense cost overruns and schedule delays.) These projects require a LOT of unique code and incorporate a lot of algorithmic processes that are almost universally unique, requiring original code. The oversight and testing have been tweaked over the years to nearly eliminate or drastically minimize these “rollout” disasters. And, yes, the software test team is formed independent from the development team.

This admin is so bloated and incompetent that it is utterly inconceivable that no one could see this coming despite the fact that it was a domestic program. Heads should roll for incompetence on any scale, but hey, you get what you vote for!!!

crankybutt on October 25, 2013 at 10:03 PM
