Not quite. In October 2003, NASA released a report on the Columbia Space Shuttle disaster. When I reviewed the report, I urged readers to study Chapter 8, which was written by Diane Vaughan, who wrote the classic work on the original Challenger disaster. Vaughan explores the systemic failures of the NASA safety system and how the problems uncovered after the Challenger disaster reappeared to cause Columbia's problems. The most interesting parts of the report focus on management system problems rather than individual failures. Vaughan cautions, however, that
the Board's focus on the context in which decision making occurred does not mean that individuals are not responsible and accountable. To the contrary, individuals always must assume responsibility for their actions. What it does mean is that NASA's problems cannot be solved simply by retirements, resignations, or transferring personnel.
The footnote accompanying this paragraph states:
Changing personnel is a typical response after an organization has some kind of harmful outcome. It has great symbolic value. A change in personnel points to individuals as the cause and removing them gives the false impression that the problems have been solved, leaving unresolved organizational system problems.
The fact is that human beings inevitably make errors, and errors by operators must be expected. But rather than focusing on the operators who make the errors, effective accident analysis (analysis that actually wants to get to the root causes and effective solutions) looks for the conditions that made the errors possible.
These errors can be rooted in poor design, gaps in supervision, undetected manufacturing defects or maintenance failures, unworkable procedures, shortfalls in training, and less than adequate tools and equipment. In addition, these conditions can be present for many years before they combine to result in a tragic incident. In fact, BP made the point that they had been operating with questionable equipment for many years with no problem.
Let's take a short look at the stories behind the headlines above.
According to an Interim Report issued by BP yesterday, the Texas City refinery incident occurred in the isomerization (ISOM) unit. A processing tower called the raffinate splitter, which housed hydrocarbon liquid and vapor, overfilled and overheated. The liquid and vapor mix was overpressurized, flooded into an adjacent Blowdown Drum & Stack, overflowed and escaped into the atmosphere around the unit. The resulting vapor cloud was then ignited by a still-unknown source.
The basic message of the press conference was that worker error was to blame:
If ISOM unit managers had properly supervised the startup or if ISOM unit operators had followed procedures or taken corrective action earlier, the explosion would not have occurred, the investigation team said. ... "The mistakes made during the startup of this unit were surprising and deeply disturbing. The result was an extraordinary tragedy we didn't foresee," said Ross Pillari, president of BP Products North America, Inc.
Reading more deeply into BP's report, however, one finds two factors that actually get much closer to the root causes of this incident:
- The alternative to using the blowdown stack is a flare system that burns off the excess material. The report states that “Blowdown stacks have been recognized as potentially hazardous for this type of service, and the industry has moved more towards closed relief systems to flare” and that “The investigation team also concluded the use of a flare system, instead of a blow down stack, would have reduced the severity of the incident.” The report also noted that there were several times over the past ten years when the relief line could have been tied into a safer flare system, but that “the true level of the hazard was not seen.” Instead, use of the blowdown stack was increased and changes were made that reduced its effectiveness over the past several years.
- The reason so many people were killed is that they were located in trailers directly adjacent to the blowdown stack. Turns out that the Texas City Refinery has a management of change process to evaluate hazards associated with the placement of temporary structures. This process was designed to ensure that the trailers were safe to use and that they were put in a safe place. Although these hazard reviews were conducted prior to placing the trailers, they “did not recognize the possibility that multiple failures by ISOM unit personnel could result in such a massive flow of fluids and vapors to the blow down stack.” Pillari noted that “Plans could have been made to move them away before the startup operation.”
Consequently, BP is firing several workers and disciplining others.
Crucial to any root cause investigation, however, is one word: “Why?” Investigators need to keep asking “why?” until the root causes are identified. For example, why didn’t workers follow proper procedures? Were they lazy and incompetent, smoking weed and napping? Or were the procedures too complicated? Were the procedures normally followed to the letter, or generally ignored or circumvented? Were workers adequately trained to respond to this type of emergency even though it had never happened before and, according to the report, was never anticipated? Were the operators too overwhelmed with handling the emergency itself to think about sounding the alarm? And why were supervisors absent during critical periods? Was it common practice for them to be absent? And whose responsibility was it to address “confusion about who was in charge”?
I don’t know the answers to any of these questions, but they need to be asked.
One of the few reporters who seem to have actually read the report was Dina Cappiello of the Houston Chronicle, who wrote an article, based on BP’s report, about the company’s failure to replace the blowdown stack:
BP continued to release dangerous and flammable vapors from a ventilation stack at its Texas City refinery, despite chances over the last decade to replace it, an internal investigation by the company has found.
While other refineries swapped the outdated stacks with more modern flares that burn off gases, BP passed on two opportunities — in 1995 and 2002 — to replace the 50-year-old vent stack that erupted into a geyser of flammable vapor and liquid March 23 after a nearby tower was overfilled and overheated.
That choice likely led to the explosion being called one of the deadliest industrial accidents in U.S. history, said Ross Pillari, president of BP Products North America at the investigation's release.
"The report notes that ... use of a flare system, instead of a blowdown stack, would have reduced the severity of the incident," said Pillari. "There was other work going on in the refinery and these would have been opportunities to take this unit to a flare. There is no documentation as to why this didn't happen."
Another Chronicle article covered the reaction of the union and others critical of the BP report:
Union officials, victims and attorneys representing dozens of injured workers or the families of the deceased, said Pillari made scapegoats of the low-level refinery workers while sidestepping management's own responsibility.
***
"Blaming workers doesn't solve the problem of unsafe conditions in that refinery," said Gary Beevers, Region 6 director of the United Steelworkers union.
Then there was the article about Secretary of Transportation Norman Mineta blaming worker error for the Graniteville, South Carolina accident last January that released chlorine, killing nine people:
Preliminary findings in the Jan. 6 Graniteville wreck, which killed nine people and injured hundreds, have placed the blame on the crew of a Norfolk Southern train who failed to switch the main track into its proper position. An oncoming train then crashed into the parked cars on the side spur, rupturing a chlorine tanker and releasing a toxic cloud over the tiny textile town about 60 miles southwest of here. Some 5,400 residents were evacuated.
That type of human error, the largest single factor that accounted for 38 percent of all train accidents in the past five years, is not addressed by Federal Railroad Administration regulations, Mineta said. Railroad company operating rules address human error, and employees who violate those rules can be disciplined or dismissed.
The plan that Mineta proposed contained a number of measures that go far beyond just preventing workers from screwing up, including requiring
more training from the federal agency and possible civil penalties. In the worst case, employees could be barred from certain train assignments, said Dan Smith, the federal agency's associate administrator for safety.
The plan also would address crew fatigue, help develop technology that can alert crews to broken rails and improve hazardous materials safety by letting local emergency workers know immediately what material could be involved in a crash.
Again, the headline and Mineta’s main message focus on worker error, although reading deeper you find mention of fatigue and lack of warning devices. I wrote last week about how rail scheduling issues and antiquated regulation put train crews in a permanent state of jet lag.
Rebecca Schmidt of West Columbia, who lost her son, 28-year-old train engineer Chris Seeling, in the accident, had a pretty good handle on the problems faced by train crews:
"I'm really excited about this and hope something positive comes out of it - especially the electric signals and I know that fatigue is a huge issue,...I definitely think that you cannot rely on human judgment, especially when a crew has worked 12 hours and they're tired. There needs to be some type of electronic signal," she said, as a train roared through the city of Columbia, blasting its horn.
She also said there should be a clean air supply on trains and more should be done to reduce speeds.
The point is that human error may be one of the "direct causes" of an incident, but it’s almost never one of the root causes. A direct cause is the action that directly results in the occurrence, while root causes are usually management system problems which, if corrected, would not only have prevented that specific problem, but other similar problems as well.
The problem with solely blaming (and firing) workers is that you’re not taking actions that will prevent future incidents. If, as in the BP case, the root causes had more to do with the management systems that allowed the continued use of the blowdown drums and located the trailers in the danger zone, then just firing a few workers who didn’t follow proper procedures (which may have been confusing) isn’t going to keep the same incident from happening again. And disciplining workers for not following proper rail procedures isn't going to be too effective if scheduling issues mean that no one is getting enough sleep.
Despite the "headlines" from their report, BP itself obviously knows better than to just blame the workers. In addition to firing and disciplining employees, they announced that they will modify or replace all blowdown systems that handle heavier-than-air hydrocarbon vapor or light hydrocarbon liquids, and will locate trailers far from any danger areas.
BP News release here
Pillari Statement here
- "Engineers and conductors sleep on trains. Anyone who tells you different is not being straight with you," May 9, 2005
- Deaths and Injuries at US Steel: Blame the Workers?, Feb 24, 2005
- Behavioral Safety Comes To The Railroads, November 15, 2004
- Blame the Worker: Chinese Style, January 3, 2004
- Worker Error Department (cont'd), April 14, 2003
- Worker Error Department, Part 2, April 11, 2003
- Worker Error Department, April 10, 2003