Even though we still know very little about why healthcare.gov, the Obamacare Health Information Exchange, wasn’t ready on time, we may know enough to learn some lessons about how to avoid some of its problems. The problems, of course, are the classic ones: the project isn’t completed on time, and it’s over budget.
Curb Your Optimism
In his excellent book, Thinking, Fast and Slow, Daniel Kahneman points out how humans have a consistent tendency to overestimate their ability to complete projects as predicted. He suggests that instead of relying on the estimates of the people intending to do the work, “baseline information” be collected. This is the record of how similar projects have done in the past.
Collecting baseline information would have suggested that bringing the project in on time and under budget would require at least special effort and excellent project management. As several people have pointed out in hindsight, large government IT projects do not have particularly good track records. A recent article in Computerworld quoted the Standish Group, which has a database of about 50,000 development projects. Of 3,555 projects from 2003 to 2012 that had labor costs of at least $10 million, only 6.4% were successful. 41.4% were outright failures. The remaining 52% were either over budget, behind schedule or didn’t meet user expectations.
Given this baseline information, the overwhelming likelihood was that the project would not succeed as planned. The beginning is the time to consider extraordinary measures or reduce your expectations.
At the risk of sounding like a reference librarian, another good book is Nassim Nicholas Taleb’s Antifragile. To oversimplify a lesson from that book, it’s better to look at the structure of a venture than it is to try and predict the future events that will affect it. Computer projects, in particular, have a tendency to be fragile. This is not really a new concept. Back in 1998, William J. Brown and his colleagues wrote this in their classic study of why software development projects fail, AntiPatterns: “If an object’s interface is unique and there are no other implementations that support the same interface, then the interface is implementation-dependent. Every client or object that uses this interface is dependent upon unique implementation details. When this occurs repeatedly on a system-wide scale, a brittle system is created.”
So was healthcare.gov brittle or fragile? I don’t really know, of course, but here’s why I think it may have been. Consider this high-level functional vision of a health insurance exchange:
The part of this diagram that makes me nervous is the nasty little lightning bolts. Although I don’t know how dependent user registration is on connecting to these external databases in real time, completing an application certainly is, and so is getting the rates from insurance companies. But, hey, we’ve got standards-based Web services to take care of that, right? What could go wrong? Here’s one opinion, from Dan Hanson:
“The real problems are with the back end of the software. When you try to get a quote for health insurance, the system has to connect to computers at the IRS, the VA, Medicaid/CHIP, various state agencies, Treasury and HHS. They also have to connect to all the health plan carriers to get pre-subsidy pricing. All of these queries receive data that is then fed into the online calculator to give you a price. If any of these queries fails, the whole transaction fails.
“Most of these systems are old legacy systems with their own unique data formats. Some have been around since the 1960’s, and the people who wrote the code that runs on them are long gone. If one of these old crappy systems takes too long to respond, the transaction times out.”
Although I don’t know who Dan is, this has a ring of truth. If he’s right, and I’ve seen some reports of the Exchange failing because of unsuccessful connections from the data hub, the system may indeed be brittle or fragile, with one messy interface causing cascading issues in other places, particularly for a real-time process dependent on so many of them. And the kicker is that this is the kind of thing that could have been checked out at the start. The proposed architecture was there, and the systems that had to work with it were identifiable.
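To put a rough number on that worry, here is a minimal sketch of how reliability compounds when one real-time transaction needs every external system to answer. It is not a model of healthcare.gov: the 99% figure and the dependency counts are made up for illustration, the function names are my own, and it assumes each system fails independently of the others, which real systems often don't.

```python
from math import prod


def all_succeed_probability(availabilities):
    """Chance that every dependency responds, assuming independent failures."""
    return prod(availabilities)


def any_fails_probability(availabilities):
    """Chance that at least one dependency fails, sinking the whole transaction."""
    return 1.0 - all_succeed_probability(availabilities)


if __name__ == "__main__":
    # Hypothetical: every dependency answers correctly 99% of the time.
    per_service = 0.99
    for count in (1, 5, 10, 20):
        p = all_succeed_probability([per_service] * count)
        print(f"{count:2d} dependencies at {per_service:.0%} each: "
              f"{p:.1%} of transactions complete")
```

Under those assumptions, ten dependencies that each work 99% of the time leave only about 90% of transactions completing, and every added interface pushes the figure lower. That is the arithmetic behind calling such a design brittle.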
Pay Attention to Warnings
A cursory reading of media reports about healthcare.gov can give the idea that all the people in charge of the project thought it was going to work, but that testing wasn’t done until September, at which time there wasn’t enough time to fix things. Several armchair quarterbacks have said that only a month for testing was not enough, and that it should have been four or five months.
My quick Internet review of secondary sources didn’t take long to turn up serious warning signs going back quite a while.
The original idea for healthcare information exchanges was that each State would have one, and that they would coordinate with a central Federal data hub to provide the information from IRS, Homeland Security, the VA, Social Security, HHS and the Treasury Department. As time passed, however, a large number of States, including those that also refused to expand Medicaid, decided not to implement healthcare exchanges. The fallback position for people in these States without exchanges was to go to healthcare.gov. Another group of States was committed to having an exchange but was moving slowly. Their users would go to healthcare.gov until their exchanges were ready. By October 1, 2013, only 17 States had functioning healthcare information exchanges; people in the other States were directed to healthcare.gov. The deadline for States committing to their own exchanges was early 2013. Until then, what load to expect on healthcare.gov wasn’t really known.
In March 2013, the deputy CIO at the Centers for Medicare & Medicaid Services (CMS), the Obamacare agency responsible for managing the project, was quoted as telling an insurance industry meeting that he was “pretty nervous” about the exchanges being ready by October 1. According to contracts reviewed by Reuters, the ceiling amount to build the website tripled from $93.7 million to $292 million in April. Inside the project, then, something appears not to have been going to plan.
In June 2013, the United States Government Accountability Office issued a report that said “Much remains to be accomplished within a relatively short amount of time. CMS’s timelines and targeted completion dates provide a roadmap to completion of the required activities by the start of enrollment on October 1, 2013. However, certain factors, such as the still-unknown and evolving scope of the exchange activities CMS will be required to perform in each state, and the large numbers of activities remaining to be performed—some close to the start of enrollment—suggest a potential for implementation challenges going forward. And while the missed interim deadlines may not affect implementation, additional missed deadlines closer to the start of enrollment could do so.” I think this is GAO-speak for “Uh oh!”
Interestingly, a timeline in the report supplied by CMS showed that final testing already was planned for September 2013. The GAO report was published in June. At least by June, then, CMS knew that only a short time could be allotted for testing.
Admittedly, the whole project was wrapped up in politics. Perhaps it just wasn’t seen as feasible that the launch of the Website could be delayed until it was ready to roll, even though people were aware that it might not work on time. But what could have been done? Clay Shirky, writing in The Guardian, had a suggestion:
“The lesson from this launch isn’t just about technological management; it’s about the ability of officials to receive and act on bad news – surely a core function. Before Obama’s remarks on launch day, someone should have said, ‘Hey, chief, say it’s a soft launch, or the first day of public testing. Say we need feedback, or that we’re going to fine-tune it. Say anything but, “Come and get it.”’ That, of course, didn’t happen.”
Follow Appropriate Project Management Procedures
So many other things may have been wrong. Apparently, coordination among the literally dozens of contractors working on the project was inferior. The CMS apparently had no project management experience for a project of this scale in house. One of the contractors, CGI, apparently had a record of failure on at least two similar projects. The system’s functioning was dependent on getting information from States that refused to run their own exchanges. The cup of criticism runs over.
At the end, it may not be what we’ve heard but what we’ve not heard that matters. Has anybody talked about their project management methodology? Has anybody from the project actually shown us the timelines and how they changed? ITIL, the IT management system, is often criticized for being too big, expensive and demanding. But not for a project like this.
For us, people not spending hundreds of millions of dollars on an IT project, these lessons still apply. I’ve certainly seen foundations where program areas have hired outside developers to build custom software, and I’ve seen those projects go astray, for similar, even if smaller, reasons. Somehow, every one of these projects needs solid project management.