
Sunday, May 9, 2010

System safety engineering and the Deepwater Horizon


The Deepwater Horizon oil rig explosion, fire, and uncontrolled release of oil into the Gulf is a disaster of unprecedented magnitude.  This disaster in the Gulf of Mexico appears to be more serious in objective terms than the Challenger space shuttle disaster in 1986 -- in terms both of immediate loss of life and in terms of overall harm created. And sadly, it appears likely that the accident will reveal equally severe failures of management of enormously hazardous processes, defects in the associated safety engineering analysis, and inadequacies of the regulatory environment within which the activity took place.  The Challenger disaster fundamentally changed the ways that we thought about safety in the aerospace field.  It is likely that this disaster too will force radical new thinking and new procedures concerning how to deal with the inherently dangerous processes associated with deep-ocean drilling.

Nancy Leveson is a leading expert in the field of systems safety engineering, and her book, Safeware: System Safety and Computers, is a genuinely important contribution.  Leveson led the investigation of the role that software design might have played in the Challenger disaster (link).  Here is a short, readable white paper of hers on system safety engineering (link) that is highly relevant to the discussions that will need to occur about deep-ocean drilling.  The paper does a great job of laying out how safety has been analyzed in several high-hazard industries, and presents a set of basic principles for systems safety design.  She discusses aviation, the nuclear industry, military aerospace, and the chemical industry; and she points out some important differences across industries when it comes to safety engineering.  Here is an instructive description of the safety situation in military aerospace in the 1950s and 1960s:
Within 18 months after the fleet of 71 Atlas F missiles became operational, four blew up in their silos during operational testing. The missiles also had an extremely low launch success rate.  An Air Force manual describes several of these accidents: 
     An ICBM silo was destroyed because the counterweights, used to balance the silo elevator on the way up and down in the silo, were designed with consideration only to raising a fueled missile to the surface for firing. There was no consideration that, when you were not firing in anger, you had to bring the fueled missile back down to defuel. 
     The first operation with a fueled missile was nearly successful. The drive mechanism held it for all but the last five feet when gravity took over and the missile dropped back. Very suddenly, the 40-foot diameter silo was altered to about 100-foot diameter. 
     During operational tests on another silo, the decision was made to continue a test against the safety engineer’s advice when all indications were that, because of high oxygen concentrations in the silo, a catastrophe was imminent. The resulting fire destroyed a missile and caused extensive silo damage. In another accident, five people were killed when a single-point failure in a hydraulic system caused a 120-ton door to fall. 
     Launch failures were caused by reversed gyros, reversed electrical plugs, bypass of procedural steps, and by management decisions to continue, in spite of contrary indications, because of schedule pressures. (from the Air Force System Safety Handbook for Acquisition Managers, Air Force Space Division, January 1984)
Leveson's illustrations from the history of these industries are fascinating.  But even more valuable are the principles of safety engineering that she recapitulates.  These principles seem to have many implications for deep-ocean drilling and associated technologies and systems.  Here is her definition of systems safety:
System safety uses systems theory and systems engineering approaches to prevent foreseeable accidents and to minimize the result of unforeseen ones.  Losses in general, not just human death or injury, are considered. Such losses may include destruction of property, loss of mission, and environmental harm. The primary concern of system safety is the management of hazards: their identification, evaluation, elimination, and control through analysis, design and management procedures.
Here are several fundamental principles of designing safe systems that she discusses:
  • System safety emphasizes building in safety, not adding it on to a completed design.
  • System safety deals with systems as a whole rather than with subsystems or components.
  • System safety takes a larger view of hazards than just failures.
  • System safety emphasizes analysis rather than past experience and standards.
  • System safety emphasizes qualitative rather than quantitative approaches.
  • Recognition of tradeoffs and conflicts.
  • System safety is more than just system engineering.
And here is an important summary observation about the complexity of safe systems:
Safety is an emergent property that arises at the system level when components are operating together. The events leading to an accident may be a complex combination of equipment failure, faulty maintenance, instrumentation and control problems, human actions, and design errors. Reliability analysis considers only the possibility of accidents related to failures; it does not investigate potential damage that could result from successful operation of the individual components.
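Leveson's distinction between reliability and safety can be made concrete with a toy simulation (the components and numbers here are entirely hypothetical, invented for illustration): each component below performs exactly to its own specification, yet the assembled system still reaches a hazardous state that a component-by-component reliability analysis would never flag.

```python
# Toy system: a pump fills a tank, a drain empties it.
# Each component is 100% "reliable" -- it does exactly what its spec says.

def pump_rate(level: float) -> float:
    """Spec: pump 10 units/min whenever level < 100. (Meets spec.)"""
    return 10.0 if level < 100 else 0.0

def drain_rate(level: float) -> float:
    """Spec: drain 10 units/min whenever level > 20. (Meets spec.)"""
    return 10.0 if level > 20 else 0.0

# Emergent hazard: an environmental condition outside both specs --
# say, the drain line freezes -- blocks outflow. Neither component
# has "failed", yet the system overflows.
level, drain_blocked = 55.0, True
for _ in range(30):  # simulate 30 minutes
    outflow = 0.0 if drain_blocked else drain_rate(level)
    level += pump_rate(level) - outflow

print(level)  # 105.0 -- past the 100-unit overflow threshold
```

The hazard lives in the interaction between the components and their environment, not in any single component's failure rate, which is precisely why Leveson insists on analyzing the system as a whole.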

How do these principles apply to the engineering problem of deep-ocean drilling?  Perhaps the most important implications are these: a safe system needs to be based on careful and comprehensive analysis of the hazards that are inherent in the process; it needs to be designed with an eye to handling those hazards safely; and it can't be done in a piecemeal, "fly-fix-fly" fashion.

It would appear that deep-ocean drilling is characterized by too little analysis and too much confidence in the ability of engineers to "correct" inadvertent outcomes ("fly-fix-fly").  The accident that occurred in the Gulf last month can be analyzed into two parts. The first is the explosion and fire that destroyed the drilling rig and led to the tragic loss of life of 11 rig workers. The second is the uncalculated harm caused by the uncontrolled venting of perhaps a hundred thousand barrels of crude oil to date into the Gulf of Mexico, now threatening the coasts and ecologies of several states.  Shockingly, there is at present no high-reliability method for capping the well at a depth of over 5,000 feet, so the harm may continue to worsen for a very extended period of time.

The safety systems on the platform itself will need to be examined in detail. But the bottom line will probably look something like this: the platform is a complex system vulnerable to explosion and fire, and there was always a calculable (though presumably small) probability of catastrophic fire and loss of the ship. This is pretty analogous to the problem of safety in aircraft and other complex electro-mechanical systems. The loss of life in the incident is terrible but confined.  Planes crash and ships sink.

What elevates this accident to a globally important catastrophe is what happened next: the destruction of the pipeline leading from the wellhead 5,000 feet below sea level to containers on the surface, and the failure of the shutoff valve system on the ocean floor. These two failures have resulted in the unconstrained release of a massive and uncontrollable flow of crude oil into the Gulf, with environmental harm likely to exceed that of the Exxon Valdez spill.

Oil wells fail on the surface, and they are difficult to control. But there is a well-developed technology that teams of oil-fire specialists like Red Adair employ to cap the flow and end the damage. We have nothing comparable for wells drilled under water at the depth of this incident; for corrective intervention, this well is less accessible than objects in space. So surface well failures conform to a sort of epsilon-delta relationship: a small (epsilon) accident leads to a limited (delta) harm. This deep-ocean well failure in the Gulf is catastrophically different: a relatively small incident on the surface is producing an unbounded and spiraling harm.

So was this a foreseeable hazard? Of course it was. There was always a finite probability of total loss of the platform, leading to destruction of the pipeline. There was also a finite probability of failure of the massive sea-floor emergency shutoff valve. And, critically, it was certainly known that there is no high-reliability fix in the event of failure of the shutoff valve. The containment dome that BP is currently attempting is untested and unproven at this depth. The alternative of drilling a second well to relieve the pressure may work, but it will take weeks or months. So when we trace this failure pathway to its end, we arrive at this conclusion: catastrophic, unbounded failure. If you reach this point in the fault tree, there is almost nothing to be done. And this is a totally irrational outcome to tolerate: how could any engineer or regulatory agency have accepted the circumstances of this activity, given that one possible failure pathway would lead predictably to unbounded harms?

There is one line of thought that might have led to the conclusion that deep-ocean drilling is acceptably safe: engineers and policy makers may have optimistically overestimated the reliability of the critical components. If we estimate that the probability of failure of the platform is 1/1,000, failure of the pipeline is 1/100, and failure of the emergency shutoff valve is 1/10,000, then, treating these failures as independent, one might say that the probability of the nightmare scenario is vanishingly small: one in a billion. Perhaps one might reason that we can disregard scenarios with this level of likelihood. Reasoning very much like this was involved in the original safety analysis of the shuttle (Safeware: System Safety and Computers). But two things are now clear. First, this disaster was not virtually impossible; it actually occurred. Second, it seems likely enough that the estimates of component failure probabilities are badly understated.
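The arithmetic behind this optimistic estimate is easy to reproduce, and so is its fragility. A minimal sketch (the failure rates are the illustrative figures from the paragraph above, not real engineering data; the 1/10 conditional probabilities are likewise hypothetical): multiplying the rates assumes the three failures are independent, yet a single common cause, such as the initial explosion, couples them and can inflate the combined probability by orders of magnitude.

```python
# Naive fault-tree arithmetic for the "nightmare scenario":
# the catastrophe requires platform loss AND pipeline destruction
# AND shutoff valve failure. Figures are illustrative, not real data.

p_platform = 1 / 1_000    # total loss of the platform
p_pipeline = 1 / 100      # destruction of the pipeline
p_valve    = 1 / 10_000   # failure of the emergency shutoff valve

# Assuming the three failures are independent, multiply the rates:
p_independent = p_platform * p_pipeline * p_valve
print(p_independent)  # ~1e-09 -- "one in a billion"

# But the explosion is a common cause: given the platform is lost,
# pipeline destruction and a demand on the valve are far more likely.
# If each conditional probability is, say, 1/10 rather than the
# marginal rate above:
p_coupled = p_platform * (1 / 10) * (1 / 10)
print(p_coupled)  # ~1e-05 -- ten thousand times larger
```

This is the sense in which the "one in a billion" figure depends as much on the independence assumption as on the individual reliability estimates, and why modest errors in either can swallow the comfort the number provides.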

What does this imply about deep ocean drilling? It seems inescapable that the current state of technology does not permit us to take the risk of this kind of total systems failure. Until there is a reliable and reasonably quick technology for capping a deep-ocean well, the small probability of this kind of failure makes the use of the technology entirely unjustifiable. It makes no sense at all to play Russian roulette when the cost of failure is massive and unconstrained ecological damage.

There is another aspect of this disaster that needs to be called out, and that is the issue of regulation. Just as the nuclear industry requires close, rigorous regulation and inspection, so deep-ocean drilling must be rigorously regulated. The stakes are too high to allow the oil industry to regulate itself. And unfortunately there are clear indications of weak regulation in this industry (link).

(Here are links to a couple of earlier posts on safety and technology failure (link, link).)

System safety engineering and the Deepwater Horizon


The Deepwater Horizon oil rig explosion, fire, and uncontrolled release of oil into the Gulf is a disaster of unprecedented magnitude.  This disaster in the Gulf of Mexico appears to be more serious in objective terms than the Challenger space shuttle disaster in 1986 -- in terms both of immediate loss of life and in terms of overall harm created. And sadly, it appears likely that the accident will reveal equally severe failures of management of enormously hazardous processes, defects in the associated safety engineering analysis, and inadequacies of the regulatory environment within which the activity took place.  The Challenger disaster fundamentally changed the ways that we thought about safety in the aerospace field.  It is likely that this disaster too will force radical new thinking and new procedures concerning how to deal with the inherently dangerous processes associated with deep-ocean drilling.

Nancy Leveson is an important expert in the area of systems safety engineering, and her book, Safeware: System Safety and Computers, is a genuinely important contribution.  Leveson led the investigation of the role that software design might have played in the Challenger disaster (link).  Here is a short, readable white paper of hers on system safety engineering (link) that is highly relevant to the discussions that will need to occur about deep-ocean drilling.  The paper does a great job of laying out how safety has been analyzed in several high-hazard industries, and presents a set of basic principles for systems safety design.  She discusses aviation, the nuclear industry, military aerospace, and the chemical industry; and she points out some important differences across industries when it comes to safety engineering.  Here is an instructive description of the safety situation in military aerospace in the 1950s and 1960s:
Within 18 months after the fleet of 71 Atlas F missiles became operational, four blew up in their silos during operational testing. The missiles also had an extremely low launch success rate.  An Air Force manual describes several of these accidents: 
     An ICBM silo was destroyed because the counterweights, used to balance the silo elevator on the way up and down in the silo, were designed with consideration only to raising a fueled missile to the surface for firing. There was no consideration that, when you were not firing in anger, you had to bring the fueled missile back down to defuel. 
     The first operation with a fueled missile was nearly successful. The drive mechanism held it for all but the last five feet when gravity took over and the missile dropped back. Very suddenly, the 40-foot diameter silo was altered to about 100-foot diameter. 
     During operational tests on another silo, the decision was made to continue a test against the safety engineer’s advice when all indications were that, because of high oxygen concentrations in the silo, a catastrophe was imminent. The resulting fire destroyed a missile and caused extensive silo damage. In another accident, five people were killed when a single-point failure in a hydraulic system caused a 120-ton door to fall. 
     Launch failures were caused by reversed gyros, reversed electrical plugs, bypass of procedural steps, and by management decisions to continue, in spite of contrary indications, because of schedule pressures. (from the Air Force System Safety Handbook for Acquisition Managers, Air Force Space Division, January 1984)
Leveson's illustrations from the history of these industries are fascinating.  But even more valuable are the principles of safety engineering that she recapitulates.  These principles seem to have many implications for deep-ocean drilling and associated technologies and systems.  Here is her definition of systems safety:
System safety uses systems theory and systems engineering approaches to prevent foreseeable accidents and to minimize the result of unforeseen ones.  Losses in general, not just human death or injury, are considered. Such losses may include destruction of property, loss of mission, and environmental harm. The primary concern of system safety is the management of hazards: their identification, evaluation, elimination, and control through analysis, design and management procedures.
Here are several fundamental principles of designing safe systems that she discusses:
  • System safety emphasizes building in safety, not adding it on to a completed design.
  • System safety deals with systems as a whole rather than with subsystems or components.
  • System safety takes a larger view of hazards than just failures.
  • System safety emphasizes analysis rather than past experience and standards.
  • System safety emphasizes qualitative rather than quantitative approaches.
  • Recognition of tradeoffs and conflicts.
  • System safety is more than just system engineering.
And here is an important summary observation about the complexity of safe systems:
Safety is an emergent property that arises at the system level when components are operating together. The events leading to an accident may be a complex combination of equipment failure, faulty maintenance, instrumentation and control problems, human actions, and design errors. Reliability analysis considers only the possibility of accidents related to failures; it does not investigate potential damage that could result from successful operation of the individual components.

How do these principles apply to the engineering problem of deep-ocean drilling?  Perhaps the most important implications are these: a safe system needs to be based on careful and comprehensive analysis of the hazards that are inherently involved in the process; it needs to be designed with an eye to handling those hazards safely; and it can't be done in a piecemeal, "fly-test-fly" fashion.

It would appear that deep-ocean drilling is characterized by too little analysis and too much confidence in the ability of engineers to "correct" inadvertent outcomes ("fly-fix-fly").  The accident that occurred in the Gulf last month can be analyzed into several parts. First is the explosion and fire that destroyed the drilling rig and led to the tragic loss of life of 11 rig workers. And the second is the uncalculated harms caused by the uncontrolled venting of perhaps a hundred thousand barrels of crude oil to date into the Gulf of Mexico, now threatening the coasts and ecologies of several states.  Shockingly, there is now no high-reliability method for capping the well at a depth of over 5,000 feet; so the harm can continue to worsen for a very extended period of time.

The safety systems on the platform itself will need to be examined in detail. But the bottom line will probably look something like this: the platform is a complex system vulnerable to explosion and fire, and there was always a calculable (though presumably small) probability of catastrophic fire and loss of the ship. This is pretty analogous to the problem of safety in aircraft and other complex electro-mechanical systems. The loss of life in the incident is terrible but confined.  Planes crash and ships sink.

What elevates this accident to a globally important catastrophe is what happened next: destruction of the pipeline leading from the wellhead 5,000 feet below sea level to containers on the surface; and the failure of the shutoff valve system on the ocean floor. These two failures have resulted in unconstrained release of a massive and uncontrollable flow of crude oil into the Gulf and the likelihood of environmental harms that are likely to be greater than the Exxon Valdez.

Oil wells fail on the surface, and they are difficult to control. But there is a well-developed technology that teams of oil fire specialists like Red Adair employ to cap the flow and end the damage. We don't have anything like this for wells drilled under water at the depth of this incident; this accident is less accessible than objects in space for corrective intervention. So surface well failures conform to a sort of epsilon-delta relationship: an epsilon accident leads to a limited delta harm. This deep-ocean well failure in the Gulf is catastrophically different: the relatively small incident on the surface is resulting in an unbounded and spiraling harm.

So was this a foreseeable hazard? Of course it was. There was always a finite probability of total loss of the platform, leading to destruction of the pipeline. There was also a finite probability of failure of the massive sea-floor emergency shutoff valve. And, critically, it was certainly known that there is no high-reliability fix in the event of failure of the shutoff valve. The effort to use the dome currently being tried by BP is untested and unproven at this great depth. The alternative of drilling a second well to relieve pressure may work; but it will take weeks or months. So essentially, when we reach the end of this failure pathway, we arrive at this conclusion: catastrophic, unbounded failure. If you reach this point in the fault tree, there is almost nothing to be done. And this is a totally irrational outcome to tolerate; how could any engineer or regulatory agency have accepted the circumstances of this activity, given that one possible failure pathway would lead predictably to unbounded harms?

There is one line of thought that might have led to the conclusion that deep ocean drilling is acceptably safe: engineers and policy makers might have optimistically overestimated the reliability of the critical components. If we estimate that the probability of failure of the platform is 1/1000, failure of the pipeline is 1/100, and failure of the emergency shutoff valve is 1/10,000 -- then one might say that the probability of the nightmare scenario is vanishingly small: one in a billion. Perhaps one might reason that we can disregard scenarios with this level of likelihood. Reasoning very much like this was involved in the original safety designs of the shuttle (Safeware: System Safety and Computers). But several things are now clear: this disaster was not virtually impossible. In fact, it actually occurred. And second, it seems likely enough that the estimates of component failure are badly understated.

What does this imply about deep ocean drilling? It seems inescapable that the current state of technology does not permit us to take the risk of this kind of total systems failure. Until there is a reliable and reasonably quick technology for capping a deep-ocean well, the small probability of this kind of failure makes the use of the technology entirely unjustifiable. It makes no sense at all to play Russian roulette when the cost of failure is massive and unconstrained ecological damage.

There is another aspect of this disaster that needs to be called out, and that is the issue of regulation. Just as the nuclear industry requires close, rigorous regulation and inspection, so deep-ocean drilling must be rigorously regulated. The stakes are too high to allow the oil industry to regulate itself. And unfortunately there are clear indications of weak regulation in this industry (link).

(Here are links to a couple of earlier posts on safety and technology failure (link, link).)

System safety engineering and the Deepwater Horizon


The Deepwater Horizon oil rig explosion, fire, and uncontrolled release of oil into the Gulf is a disaster of unprecedented magnitude.  This disaster in the Gulf of Mexico appears to be more serious in objective terms than the Challenger space shuttle disaster in 1986 -- in terms both of immediate loss of life and in terms of overall harm created. And sadly, it appears likely that the accident will reveal equally severe failures of management of enormously hazardous processes, defects in the associated safety engineering analysis, and inadequacies of the regulatory environment within which the activity took place.  The Challenger disaster fundamentally changed the ways that we thought about safety in the aerospace field.  It is likely that this disaster too will force radical new thinking and new procedures concerning how to deal with the inherently dangerous processes associated with deep-ocean drilling.

Nancy Leveson is an important expert in the area of systems safety engineering, and her book, Safeware: System Safety and Computers, is a genuinely important contribution.  Leveson led the investigation of the role that software design might have played in the Challenger disaster (link).  Here is a short, readable white paper of hers on system safety engineering (link) that is highly relevant to the discussions that will need to occur about deep-ocean drilling.  The paper does a great job of laying out how safety has been analyzed in several high-hazard industries, and presents a set of basic principles for systems safety design.  She discusses aviation, the nuclear industry, military aerospace, and the chemical industry; and she points out some important differences across industries when it comes to safety engineering.  Here is an instructive description of the safety situation in military aerospace in the 1950s and 1960s:
Within 18 months after the fleet of 71 Atlas F missiles became operational, four blew up in their silos during operational testing. The missiles also had an extremely low launch success rate.  An Air Force manual describes several of these accidents: 
     An ICBM silo was destroyed because the counterweights, used to balance the silo elevator on the way up and down in the silo, were designed with consideration only to raising a fueled missile to the surface for firing. There was no consideration that, when you were not firing in anger, you had to bring the fueled missile back down to defuel. 
     The first operation with a fueled missile was nearly successful. The drive mechanism held it for all but the last five feet when gravity took over and the missile dropped back. Very suddenly, the 40-foot diameter silo was altered to about 100-foot diameter. 
     During operational tests on another silo, the decision was made to continue a test against the safety engineer’s advice when all indications were that, because of high oxygen concentrations in the silo, a catastrophe was imminent. The resulting fire destroyed a missile and caused extensive silo damage. In another accident, five people were killed when a single-point failure in a hydraulic system caused a 120-ton door to fall. 
     Launch failures were caused by reversed gyros, reversed electrical plugs, bypass of procedural steps, and by management decisions to continue, in spite of contrary indications, because of schedule pressures. (from the Air Force System Safety Handbook for Acquisition Managers, Air Force Space Division, January 1984)
Leveson's illustrations from the history of these industries are fascinating.  But even more valuable are the principles of safety engineering that she recapitulates.  These principles seem to have many implications for deep-ocean drilling and associated technologies and systems.  Here is her definition of systems safety:
System safety uses systems theory and systems engineering approaches to prevent foreseeable accidents and to minimize the result of unforeseen ones.  Losses in general, not just human death or injury, are considered. Such losses may include destruction of property, loss of mission, and environmental harm. The primary concern of system safety is the management of hazards: their identification, evaluation, elimination, and control through analysis, design and management procedures.
Here are several fundamental principles of designing safe systems that she discusses:
  • System safety emphasizes building in safety, not adding it on to a completed design.
  • System safety deals with systems as a whole rather than with subsystems or components.
  • System safety takes a larger view of hazards than just failures.
  • System safety emphasizes analysis rather than past experience and standards.
  • System safety emphasizes qualitative rather than quantitative approaches.
  • Recognition of tradeoffs and conflicts.
  • System safety is more than just system engineering.
And here is an important summary observation about the complexity of safe systems:
Safety is an emergent property that arises at the system level when components are operating together. The events leading to an accident may be a complex combination of equipment failure, faulty maintenance, instrumentation and control problems, human actions, and design errors. Reliability analysis considers only the possibility of accidents related to failures; it does not investigate potential damage that could result from successful operation of the individual components.

How do these principles apply to the engineering problem of deep-ocean drilling?  Perhaps the most important implications are these: a safe system needs to be based on careful and comprehensive analysis of the hazards that are inherently involved in the process; it needs to be designed with an eye to handling those hazards safely; and it can't be done in a piecemeal, "fly-test-fly" fashion.

It would appear that deep-ocean drilling is characterized by too little analysis and too much confidence in the ability of engineers to "correct" inadvertent outcomes ("fly-fix-fly").  The accident that occurred in the Gulf last month can be analyzed into several parts. First is the explosion and fire that destroyed the drilling rig and led to the tragic loss of life of 11 rig workers. And the second is the uncalculated harms caused by the uncontrolled venting of perhaps a hundred thousand barrels of crude oil to date into the Gulf of Mexico, now threatening the coasts and ecologies of several states.  Shockingly, there is now no high-reliability method for capping the well at a depth of over 5,000 feet; so the harm can continue to worsen for a very extended period of time.

The safety systems on the platform itself will need to be examined in detail. But the bottom line will probably look something like this: the platform is a complex system vulnerable to explosion and fire, and there was always a calculable (though presumably small) probability of catastrophic fire and loss of the ship. This is pretty analogous to the problem of safety in aircraft and other complex electro-mechanical systems. The loss of life in the incident is terrible but confined.  Planes crash and ships sink.

What elevates this accident to a globally important catastrophe is what happened next: destruction of the pipeline leading from the wellhead 5,000 feet below sea level to containers on the surface; and the failure of the shutoff valve system on the ocean floor. These two failures have resulted in unconstrained release of a massive and uncontrollable flow of crude oil into the Gulf and the likelihood of environmental harms that are likely to be greater than the Exxon Valdez.

Oil wells fail on the surface, and they are difficult to control. But there is a well-developed technology that teams of oil fire specialists like Red Adair employ to cap the flow and end the damage. We don't have anything like this for wells drilled under water at the depth of this incident; this accident is less accessible than objects in space for corrective intervention. So surface well failures conform to a sort of epsilon-delta relationship: an epsilon accident leads to a limited delta harm. This deep-ocean well failure in the Gulf is catastrophically different: the relatively small incident on the surface is resulting in an unbounded and spiraling harm.

So was this a foreseeable hazard? Of course it was. There was always a finite probability of total loss of the platform, leading to destruction of the pipeline. There was also a finite probability of failure of the massive sea-floor emergency shutoff valve. And, critically, it was certainly known that there is no high-reliability fix in the event of failure of the shutoff valve. The effort to use the dome currently being tried by BP is untested and unproven at this great depth. The alternative of drilling a second well to relieve pressure may work; but it will take weeks or months. So essentially, when we reach the end of this failure pathway, we arrive at this conclusion: catastrophic, unbounded failure. If you reach this point in the fault tree, there is almost nothing to be done. And this is a totally irrational outcome to tolerate; how could any engineer or regulatory agency have accepted the circumstances of this activity, given that one possible failure pathway would lead predictably to unbounded harms?

There is one line of thought that might have led to the conclusion that deep-ocean drilling is acceptably safe: engineers and policy makers may have optimistically overestimated the reliability of the critical components. If we estimate the probability of failure of the platform at 1/1,000, failure of the pipeline at 1/100, and failure of the emergency shutoff valve at 1/10,000 -- and if we assume these failures are independent -- then the probability of the nightmare scenario looks vanishingly small: one in a billion. Perhaps one might reason that scenarios at this level of likelihood can be disregarded. Reasoning very much like this was involved in the original safety analysis of the shuttle (Safeware: System Safety and Computers). But two things are now clear. First, this disaster was not virtually impossible; it actually occurred. Second, it seems likely that the estimates of component failure rates were badly understated.
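The optimistic arithmetic described above is easy to reproduce, and equally easy to undermine. The sketch below uses the purely illustrative failure probabilities from the paragraph (not real engineering estimates) to recover the "one in a billion" figure, and then shows how sharply the result degrades if each component estimate is understated by a factor of ten:

```python
# Illustrative fault-tree arithmetic for the deep-water failure pathway.
# The component probabilities are the hypothetical figures from the text,
# not real engineering estimates. Multiplying them assumes the failures
# are statistically independent -- itself a strong assumption.

def joint_failure_probability(p_platform, p_pipeline, p_shutoff):
    """Probability that all three (assumed independent) failures occur together."""
    return p_platform * p_pipeline * p_shutoff

# The optimistic estimates quoted in the text: one in a billion.
optimistic = joint_failure_probability(1/1000, 1/100, 1/10_000)
print(f"Optimistic joint probability: {optimistic:.0e}")   # 1e-09

# If each component estimate is understated by a factor of ten,
# the joint probability rises by a factor of a thousand.
pessimistic = joint_failure_probability(1/100, 1/10, 1/1_000)
print(f"Pessimistic joint probability: {pessimistic:.0e}")  # 1e-06
```

The point of the second calculation is that the "vanishingly small" conclusion is only as good as the component estimates: modest optimism in each factor compounds multiplicatively in the result.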

What does this imply about deep-ocean drilling? It seems inescapable that the current state of technology does not permit us to take the risk of this kind of total systems failure. Until there is a reliable and reasonably quick technology for capping a deep-ocean well, even a small probability of this kind of failure makes use of the technology unjustifiable. It makes no sense at all to play Russian roulette when the cost of failure is massive and unconstrained ecological damage.

There is another aspect of this disaster that needs to be called out, and that is the issue of regulation. Just as the nuclear industry requires close, rigorous regulation and inspection, so deep-ocean drilling must be rigorously regulated. The stakes are too high to allow the oil industry to regulate itself. And unfortunately there are clear indications of weak regulation in this industry (link).

(Here are links to a couple of earlier posts on safety and technology failure (link, link).)

Saturday, August 22, 2009

Patient safety -- Canada and France


Patient safety is a key issue in managing and assessing a regional or national health system. There are very sizable variations in patient safety statistics across hospitals, with significantly higher rates of infection and mortality in some institutions than others. Why is this? And what can be done in order to improve the safety performance of low-safety institutions, and to improve the overall safety performance of the hospital environment nationally?

Previous posts have made the point that safety is the net effect of a complex system within a hospital or chemical plant, including institutions, rules, practices, training, supervision, and day-to-day behavior by staff and supervisors (post, post). And experts on hospital safety agree that improvements in safety require careful analysis of patient processes in order to redesign processes so as to make infections, falls, improper medications, and unnecessary mortality less likely. Institutional design and workplace culture have to change if safety performance is to improve consistently and sustainably. (Here is a posting providing a bit more discussion of the institutions of a hospital; post.)

But here is an important question: what are the features of the social and legal environment that will make it most likely that hospital administrators will commit themselves to a thorough-going culture and management of safety? What incentives or constraints need to exist to offset the impulses of cost-cutting and status quo management that threaten to undermine patient safety? What will drive the institutional change in a health system that improving patient safety requires?

Several measures seem clear. One is state regulation of hospitals. This exists in every state; but the effectiveness of regulatory regimes varies widely across contexts. So understanding the dynamics of regulation and enforcement is a crucial step to improving hospital quality and patient safety. The oversight of rigorous hospital accreditation agencies is another important factor for improvement. For example, the Joint Commission accredits thousands of hospitals in the United States (web page) through dozens of accreditation and certification programs. Patient safety is the highest priority underlying Joint Commission standards of accreditation. So regulation and the formulation of standards are part of the answer. But a particularly important policy tool for improving safety performance is the mandatory collection and publication of safety statistics, so that potential patients can decide between hospitals on the basis of their safety performance. Publicity and transparency are crucial parts of good management behavior; and secrecy is a refuge of poor performance in areas of public concern such as safety, corruption, or rule-setting. (See an earlier post on the relationship between publicity and corruption.)

But here we have a bit of a conundrum: achieving mandatory publication of safety statistics is politically difficult, because hospitals have a business interest in keeping these data private. So there was a lot of resistance to mandatory reporting of basic patient safety data in the US over the past twenty years. Fortunately, the public interest in having these data readily available has largely prevailed, and hospitals are now required to publish an ever broader range of data on patient safety, including hospital-acquired infection rates, ventilator-associated pneumonias, patient falls, and mortality rates. Here is a useful tool from USA Today that lets patients gather information about their hospital options and compare them with other hospitals regionally and nationally. This is an effective accountability mechanism that inevitably drives hospitals towards better performance.

Canada has been very active in this area. Here is a website published by the Ontario Ministry of Health and Long-Term Care. The province requires hospitals to report a number of factors that are good indicators of patient safety: several kinds of hospital-acquired infections; central-line primary bloodstream infection and ventilator-associated pneumonia; surgical-site infection prevention activity; and the hospital-standardized mortality ratio. The user can explore the site and find that there are in fact wide variations across hospitals in the province. This is likely to change patient choice; but it also serves as an instant guide for regulatory agencies and local hospital administrators as they attempt to focus attention on poor management practices and institutional arrangements. (It would be helpful for the purpose of comparison if the data could be easily downloaded into a spreadsheet.)
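Of the Ontario indicators, the hospital-standardized mortality ratio (HSMR) is the simplest to state: the ratio of observed deaths to the deaths expected from a case-mix-adjusted model, conventionally scaled by 100, so that values above 100 flag worse-than-expected mortality. A minimal sketch (the hospital names and figures below are invented for illustration):

```python
# Hospital-standardized mortality ratio (HSMR): observed deaths divided by
# expected deaths (from a case-mix-adjusted model), scaled by 100.
# All names and numbers here are invented for illustration only.

def hsmr(observed_deaths, expected_deaths):
    """HSMR > 100 means more deaths than the hospital's case mix would predict."""
    return 100.0 * observed_deaths / expected_deaths

hospitals = {
    "Hospital A": (95, 120.0),   # fewer deaths than expected
    "Hospital B": (140, 110.0),  # more deaths than expected
}

for name, (observed, expected) in sorted(hospitals.items()):
    ratio = hsmr(observed, expected)
    flag = "above expected" if ratio > 100 else "at or below expected"
    print(f"{name}: HSMR = {ratio:.0f} ({flag})")
```

Published HSMRs rest on the expected-deaths model, which adjusts for age, diagnosis, and comorbidities; the ratio itself is only as meaningful as that adjustment.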

On first principles, it seems likely that any country that has a hospital system in which the safety performance of each hospital is kept secret will also show a wide distribution of patient safety outcomes across institutions, and will have an overall safety record that is much lower than it could be. This is because secrecy gives hospital administrators the ability to conceal the risks their institutions impose on patients through bad practices. So publicity and regular publication of patient safety information seems to be a necessary precondition to maintaining a high-safety hospital system.

But here is the crucial point: many countries continue to permit secrecy when it comes to hospital safety. In particular, this seems to be true in France. It seems that the French medical and hospital system continues to display a very high degree of secrecy and opacity when it comes to patient safety. In fact, anecdotal information about French hospitals suggests a wide range of levels of hospital-acquired infections in different hospitals. Hospital-acquired infections (infections nosocomiales) are an important and rising cause of patient morbidity and mortality. And there are well-known practices and technologies that substantially reduce the incidence of these infections. But the implementation of these practices requires strong commitment and dedication at the unit level; and this degree of commitment is unlikely to occur in an environment of secrecy.

In fact, I have not been able to find, for hospitals in France, any counterparts to the patient-safety measurement tools now available in North America. But without this regular reporting, there is no mechanism through which institutions with bad safety performance can be "ratcheted" up into better practices and better safety outcomes. The impression given by the French medical system is that the doctors and the medical authorities are sacrosanct; patients are not expected to question their judgment, and the state appears not to require institutions to report and publish fundamental safety information. Patients have very little power, and the media so far seem to have paid little attention to the issues of patient safety in French hospitals. This 2007 article in Le Point seems to be a first for France in that it provides quantitative rankings of a large number of hospitals in their treatment of a number of diseases. But it does not provide the kinds of safety information -- infections, falls, pneumonias -- that are core measures of patient safety.

There is a French state agency, the Office National d'Indemnisation des Accidents Médicaux (ONIAM), that provides compensation to patients who can demonstrate that their injuries are the result of hospital-induced causes, including especially hospital-associated infections. But it appears that this agency is restricted to after-the-fact recognition of hospital errors rather than proactive programs designed to reduce them. And here is a French government web site devoted to the issue of hospital infections. It announces a multi-pronged strategy for controlling the problem of infections nosocomiales, including the establishment of a national program of surveillance of the rates of these infections. So far, however, I have not been able to locate web resources that would provide hospital-level data about infection rates.

So I am offering a hypothesis that I would be very happy to see refuted: that the French medical establishment continues to be bureaucratically administered with very little public exposure of actual performance when it comes to patient safety. And without this system of publicity, it seems very likely that there are wide and tragic variations across French hospitals with regard to patient safety.

Are there French medical sociologists and public health researchers who are working on the issue of patient safety in French hospitals? Can good contemporary French sociologists like Céline Béraud, Baptiste Coulmont, and Philippe Masson offer some guidance on this topic (post)? If readers are aware of databases and patient safety research programs in France that are relevant to these topics, I would be very happy to hear about them.

Update: Baptiste Coulmont (blog) passes on this link to the Réseau d'alerte, d'investigations et de surveillance des infections nosocomiales (RAISIN) within the Institut de veille sanitaire. The site provides research reports and regional assessments of nosocomial infection incidence. It does not appear to provide data at the level of specific hospitals and medical centers. Baptiste refers also to work by Jean Peneff, a French medical sociologist and author of La France malade de ses médecins. Here is a link to a subsequent research report by Peneff. Thanks, Baptiste.

Monday, February 9, 2009

Institutions, procedures, norms


One of the noteworthy aspects of the framing offered by Victor Nee and Mary Brinton of the assumptions of the new institutionalism is the very close connection they postulate between institutions and norms. (See the prior posting on this subject). So what is the connection between institutions and norms?

The idea that an institution is nothing more than a collection of norms, formal and informal, seems incomplete on its face. Institutions also depend on rules, procedures, protocols, sanctions, habits, and practices. These other social-behavioral factors may intersect in various ways with the workings of social norms, but they are not reducible to a set of norms. And this is to say that institutions are not reducible to collections of norms.

Consider for example the institutions that embody the patient safety regime in a hospital. What are the constituents of the institutions through which hospitals provide for patient safety? Certainly there are norms, both formal and informal, that are deliberately inculcated and reinforced and that influence the behavior of nurses, pharmacists, technicians, and doctors. But there are also procedures -- checklists in operating rooms; training programs -- rehearsals of complex crisis activities; routinized behaviors -- "always confirm the patient's birthday before initiating a procedure"; and rules -- "physicians must disclose financial relationships with suppliers". So the institutions defining the management of patient safety are a heterogeneous mix of social factors and processes.

A key feature of an institution, then, is the set of procedures and protocols that it embodies. In fact, we might consider a short-hand way of specifying an institution in terms of the set of procedures it specifies for behavior in stereotyped circumstances of crisis, conflict, cooperation, and mundane interactions with stakeholders. Organizations have usually created specific ways of handling typical situations: handling an intoxicated customer in a restaurant, making sure that no "wrong site" surgeries occur in an operating room, handling the flow of emergency supplies into a region when a large disaster occurs. The idea here is that the performance of the organization, and the individuals within it, will be more effective at achieving the desired goals of the organization if plans and procedures have been developed to coordinate actions in the most effective way possible. This is the purpose of an airline pilot's checklist before takeoff; it forces the pilot to go through a complete procedure that has been developed for the purpose of avoiding mistakes. Spontaneous, improvised action is sometimes unavoidable; but organizations have learned that they are more effective when they thoughtfully develop procedures for handling their high-risk activities.
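The logic of a checklist as described here -- a procedure that forces complete, ordered verification before the activity may proceed -- can be made concrete with a minimal sketch. This is purely illustrative; the checklist items and the confirmation function are invented for the example and do not represent any real aviation procedure.

```python
# Illustrative sketch: a checklist that refuses to let the activity
# proceed until every item has been confirmed, in order.
PRE_TAKEOFF_CHECKLIST = [
    "flaps set for takeoff",
    "flight controls free and correct",
    "transponder on",
]

def run_checklist(items, confirm):
    """Walk through every item in order; succeed only if all are confirmed."""
    for item in items:
        if not confirm(item):
            print(f"Checklist incomplete: '{item}' not confirmed -- do not proceed.")
            return False
    return True

# The confirm function stands in for the pilot's actual verification step.
if run_checklist(PRE_TAKEOFF_CHECKLIST, confirm=lambda item: True):
    print("Checklist complete -- cleared for takeoff.")
```

The design point is that the procedure, not the individual's improvisation, determines whether the activity goes forward: a single unconfirmed item blocks the whole sequence.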

This is the point at which the categories of management oversight and staff training come into play. It is one thing to have designed an effective set of procedures for handling a given complex task; but this achievement is only genuinely effective if agents within the organization in fact follow the procedures and protocols. Training is the umbrella activity that describes the processes through which the organization attempts to achieve a high level of shared knowledge about the organization's procedures. And management oversight is the umbrella activity that describes the processes of supervision and motivation through which the organization attempts to ensure that its agents follow the procedures and protocols.

In fact, one of the central findings in the area of safety research is that the specific content of an organization's procedures is crucially important to its overall safety performance when it engages in high-risk activities. Apparently small differences in procedure can have an important effect on safety. To take a fairly simple example, a stylized vocabulary and syntax for air traffic controllers and pilots increases safety by reducing the possibility of ambiguous communications; so two air traffic systems identical in every respect except their standardized communications protocols would be expected to have different safety records. Another key finding falls more on the "norms and culture" side of the equation: it is frequently observed that high-risk organizations need to embody a culture of safety that permeates the whole organization.

We might postulate that norms come into the story when we ask what motivates a person to conform to the prescribed procedure or rule -- though several other social-behavioral mechanisms work at this level as well (trained habits and well-enforced sanctions, for example). But more fundamentally, the explanatory value of micro-institutional analysis may come at the level of the details of the procedures and rules, in contrast to other possible embodiments -- rather than at the level of the question, what makes these procedures effective in most participants' conduct?

We might say, then, that an institution can be fully specified when we provide information about:
  • the procedures, policies, and protocols it imposes on its participants
  • the training and educational processes the institution relies on for instilling appropriate knowledge about its procedures and rules in its participants
  • the management, supervision, enforcement, and incentive mechanisms it embodies to assure a sufficient level of compliance among its participants
  • the norms of behavior that typical participants have internalized with respect to action within the institution
And the distinctive performance characteristics of the institution may derive from the specific nature of the arrangements that are described at each of these levels.
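The four levels of specification listed above can be summarized as a simple record structure. This is only an illustrative model -- the field names are my own labels for the four levels, not a standard schema, and the patient-safety entries are drawn from the hospital example earlier in the post.

```python
from dataclasses import dataclass, field

@dataclass
class Institution:
    """Illustrative model: the four levels at which an institution is specified."""
    procedures: list = field(default_factory=list)  # procedures, policies, protocols
    training: list = field(default_factory=list)    # knowledge-instilling processes
    oversight: list = field(default_factory=list)   # supervision, enforcement, incentives
    norms: list = field(default_factory=list)       # internalized norms of behavior

# The hospital patient-safety regime, specified at each level.
patient_safety = Institution(
    procedures=["operating-room checklist", "wrong-site surgery protocol"],
    training=["rehearsals of complex crisis activities"],
    oversight=["disclosure rules for physician-supplier relationships"],
    norms=["confirm the patient's birthday before initiating a procedure"],
)
```

Two institutions with the same goals can then differ, and differ in performance, at any one of the four fields.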

System safety is a good example to consider from the point of view of the new institutionalism. Two airlines may have significantly different safety records. And the explanation may be at any of these levels: they may have differences in formalized procedures, they may have differences in training regimes, they may have differences in management oversight effectiveness, or they may have different normative cultures at the rank-and-file level. It is a central insight of the new institutionalism that the first level may be the most important for explaining the overall safety records of the two companies, even though mechanisms may fail at any of the other levels as well. Procedural differences generally lead to significant and measurable differences in the quality of organizational results. (Nancy Leveson's Safeware: System Safety and Computers provides a great discussion of many of these issues.)
