Wednesday, March 16, 2011

Why are we surprised at surprises?

We are all guilty -- people form habits and routines, and those habits and routines help them see or, equally, make them blind.  At the organizational level our blindness is a little different -- there are a lot of systems in place, but we don't spend much time understanding how those systems work or interconnect.  We tend to see only the tip of the iceberg.  It's when you look below the waterline that you begin to see the factors that drive events, the causal explanations behind a surprise.  In other words, as we have seen on the east coast of Japan, we are surprised by events that arguably should be foreseeable.

Engineers may well accomplish a self-consistent design, but its outcome, the artifact, can never be perfect in operation.  Neither the humans working at the front end (e.g., operators and maintenance personnel) nor the humans working at the back end (e.g., administrators and regulators) are perfect.  The system (i.e., the combination of the artifact and the humans with their various responsibilities) therefore cannot be perfect.  It is not only the variability of human performance but also the human propensity for pursuing perfection through prescriptive rules, while ceaselessly trying to change the system for the better, that makes the system incomplete and imperfect as time passes.

The accident at the Japanese nuclear reactors demonstrates that safety is not a system property.  By this I mean that safety is something a system or an organization does, rather than something it has.  When you look at the pictures on CNN, remember that safety is not a property that, once put in place, will remain; it is a characteristic of how a system performs.  This creates a dilemma: safety is shown more by the absence of certain events -- namely accidents -- than by the presence of something.  Indeed, the occurrence of an unwanted event need not mean that safety as such failed; it may equally well reflect the fact that safety is never complete or absolute.

A system is in control if it is able to minimize or eliminate unwanted variability, whether in its own performance, in its environment, or in both.  The link between loss of control and the occurrence of unexpected events is so tight that a preponderance of the latter is, in practice, a signature of the former.  Loss of control is nevertheless not a necessary condition for unexpected events; they may also stem from causes and developments outside the boundaries of the system.
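To make "control as the damping of unwanted variability" concrete, here is a toy sketch (my own illustration; the process model, gains, and names are all invented for this example) of a proportional controller holding a noisy process near a setpoint:

```python
import random

def rms_deviation(gain, steps=500, setpoint=100.0, seed=42):
    """Simulate a process hit by a random disturbance each step.

    gain=0.0 models an uncontrolled system; a positive gain models a
    controller that corrects part of each observed deviation.
    """
    rng = random.Random(seed)
    level = setpoint
    total = 0.0
    for _ in range(steps):
        level += rng.gauss(0.0, 1.0)        # unwanted variability
        level -= gain * (level - setpoint)  # proportional correction
        total += (level - setpoint) ** 2
    return (total / steps) ** 0.5           # RMS deviation from setpoint

for gain in (0.0, 0.2, 0.8):
    print(f"gain={gain:.1f}  RMS deviation={rms_deviation(gain):.2f}")
```

With no correction the process drifts without bound, while even a modest gain keeps deviations bounded.  In this toy picture, losing control shows up exactly as described above: a growing stream of unexpected excursions.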

It is a universal experience that sooner or later things will go wrong, and fields such as risk analysis and human reliability assessment have developed a plethora of methods to help us predict when and how that may happen.  When conditions are stable, it is easy to live with momentum and the projections we normally use.  With the potential for extreme weather events driven by climate change, and with the interconnectedness of our critical infrastructure systems, the certainties of our condition stand to become more uncertain.  We need to start thinking about the unthinkable scenarios -- the "black swan" events and "flying cow" problems.  The challenge is that when things are very uncertain we need to think differently, because what we project from current momentum may be the least likely outcome.  Both scenario planning and resilience engineering help point the way ahead.
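As a back-of-the-envelope illustration of why momentum-based projections can mislead (again my own toy example, not a risk-analysis standard), compare how often a "four-sigma" surprise occurs under the thin-tailed normal model we tend to project from versus a heavy-tailed model of the same variance:

```python
import math
import random

def exceedance_rate(draw, threshold=4.0, trials=200_000, seed=7):
    """Fraction of draws whose magnitude exceeds the threshold."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if abs(draw(rng)) > threshold)
    return hits / trials

def normal(rng):
    return rng.gauss(0.0, 1.0)

def heavy_tailed(rng, df=3):
    """Student-t(3), rescaled to unit variance so the comparison is fair."""
    z = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))  # chi-squared(df)
    t = z / math.sqrt(v / df)
    return t / math.sqrt(df / (df - 2))  # divide by the t's own std. dev.

print(f"normal, |x| > 4 sigma:       {exceedance_rate(normal):.1e}")
print(f"heavy-tailed, |x| > 4 sigma: {exceedance_rate(heavy_tailed):.1e}")
```

Under the normal model the four-sigma event is a once-in-a-career rarity; under even modestly heavy tails it is roughly a hundred times more likely.  "Thinking the unthinkable" largely amounts to asking which of these two worlds we are actually in.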

Resilience requires a constant sense of unease that prevents engineering and organizational complacency.  It requires a realistic sense of our abilities, of "what we are".  It requires knowledge of what has happened, what is happening, and what will happen, as well as what to do about it.  A resilient system must be proactive, flexible, adaptive, and prepared.  It must be aware of the impact of its actions, as well as of the consequences of failing to act.
