CREATING A RISK CULTURE IN AN IT ENVIRONMENT
by Carl Pritchard, Senior Consultant, Cutter Consortium
INTRODUCTION
Despite the negative cachet of instituting an enterprise risk management and governance (ERM&G) strategy, risk practices have become increasingly important for IT organizations. Because risk management involves informing others of problems, there is a tendency to kill the messenger (that is, the risk manager) and, moreover, to undervalue the importance of a preemptive risk approach. Given mandatory governance legislation such as the US Sarbanes-Oxley Act, as well as the emergence of myriad new security threats, organizations cannot view risk management as optional. But introducing a risk management culture can be disruptive to prevailing IT practices, so risk managers and others tasked with implementing a company's risk strategy must challenge cultural norms and sometimes overcome institutional resistance. In order to establish a healthy organizational risk culture, the language and components of the culture must be defined and implemented with consistency, and risk management practices must be integrated into the day-to-day environment. This Executive Report discusses the essential changes in organizational culture that must take place to ensure a successful ERM&G strategy.
UNDERSTANDING GROUP CULTURE
Every community has a culture. Whenever groups form, in the aggregate they create a culture. While we tend to think of culture as something developed on a regional or national basis, most cultures actually exist in a more microcosmic environment. Companies and departments have a culture, as do those in a given profession. In the medical community, for example, norms have shifted over the past few decades from house calls and the family doctor to group practices and office visits. Physicians who violate those norms run the risk of being ostracized (or viewed as a bit odd). Similarly, IT organizations that exist within larger corporations tend to have their own culture. The stereotypes associated with the IT environment are often rooted in these cultures rather than in the individuals themselves. And because organizations as a whole tend toward insularity [7], departments that make up those organizations exhibit similar behavior.
All too often within IT organizations and cultures, certain management functions exist on the fringes of the culture. The management function is viewed as an oversight or controlling responsibility. When charged with the governance of the creative, highly technical environment of IT, the role of management is perceived as divorced from the prevailing culture. As a result, the Dilbert comic strip stereotype of pointy-haired bosses who are completely out of touch flourishes. This perception compounds the problems associated with implementing any management initiative, particularly those that are perceived as administrative or punitive. Unfortunately, risk management is at or near the top of such a list.
Because risk strategies represent the effort to identify and contend with the potential for bad events, the outlook of risk management tends to be pessimistic and depressing. In addition, risk management can fly in the face of IT organizations' many reward-and-recognition structures. Organizations frequently acknowledge the performance of their employee "stars" who work miracles burning the midnight oil to preserve a client relationship or repair a sinking program. By contrast, individuals who simply prevent negative events from occurring (or who preplan to such an extent that management intervention isn't necessary) rarely receive acknowledgment. Firefighters get awards; fire prevention experts are largely ignored.
In order to establish a healthy organizational risk culture, the language and components of the culture must be defined and established, and risk management practices must become part of the day-to-day environment. No matter the efficacy of general management practices, risk practices will not evolve independently. They must be seeded, nurtured, and maintained.
DEFINING AN ORGANIZATION'S RISK
Before embarking on the creation of a risk culture, an organization must evaluate its existing risk environment. The questions to ask are relatively simple:
- When a team member identifies a significant potential problem, does he or she know what to do next?
- When senior management identifies a significant potential problem, how is it dealt with?
- Who determines what constitutes a "significant" event?
- Who determines which resolution approach will be applied?
- How are these approaches implemented?
For organizations without a risk culture, the answers to all these questions are highly situational. If the response to three or more of these questions is "It must be determined on a case-by-case basis," the organization has no existing risk culture. This is not to say that the organization doesn't apply risk management practices, but it does indicate that it applies its practices on an ad hoc basis without serious consideration for the inconsistency that arises from doing so. And if the answer to all these questions rests on the shoulders of only one person, the risk culture is defined solely by that individual; should anything happen to him or her, the risk infrastructure will ultimately collapse.
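The rule of thumb above can be sketched in a few lines of Python. This is an illustration only: the `Answer` record, the `CASE_BY_CASE` label, and the function name `assess_risk_culture` are assumptions made for this example and do not come from any established assessment tool.

```python
from dataclasses import dataclass

# Hypothetical label for the answer "It must be determined on a case-by-case basis."
CASE_BY_CASE = "case-by-case"


@dataclass
class Answer:
    question: str  # one of the five diagnostic questions
    response: str  # CASE_BY_CASE, or a short description of the defined process
    owner: str     # the person the respondent names as responsible


def assess_risk_culture(answers):
    """Apply the report's rule of thumb to one set of answers."""
    ad_hoc = sum(1 for a in answers if a.response == CASE_BY_CASE)
    owners = {a.owner for a in answers}
    return {
        # Three or more case-by-case answers: no existing risk culture.
        "has_risk_culture": ad_hoc < 3,
        # Every answer rests on one person: the culture depends on that individual.
        "single_point_of_failure": len(answers) > 0 and len(owners) == 1,
        "ad_hoc_answers": ad_hoc,
    }


# Made-up answers for illustration.
answers = [
    Answer("What does a team member do next?", CASE_BY_CASE, "CIO"),
    Answer("How does senior management deal with it?", CASE_BY_CASE, "CIO"),
    Answer("Who decides what is significant?", "CIO decides", "CIO"),
    Answer("Who picks the resolution approach?", CASE_BY_CASE, "CIO"),
    Answer("How are approaches implemented?", "Change board", "CIO"),
]
result = assess_risk_culture(answers)
print(result)
```

In this made-up example, three answers are situational and every answer names the same owner, so the sketch flags both the absence of a risk culture and the dependence on a single individual.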
To create a culture that is truly prepared for risk management, the organization should assess its existing culture in terms of the elements that are in place and those that must be added or augmented to ensure consistency in the assessment of, and response to, risk. The simplest way to assess the existing risk culture is by questioning IT personnel at various levels on how they would handle or manage a specific risk event. Identify a handful of representative scenarios and ask IT staff members these questions consistently. Consider the following:
- You just learned that a vendor might not be able to deliver (which would put your company in default with a client). What are your next steps?
- You just discovered there's a chance that the Einstein of your group might leave for a better position. What do you do?
- The customer just called and hinted that it is considering purchasing an upgrade product from the competition and wants your professional opinion on the interface. What is the next step?
- You're in the process of an internal audit for International Organization for Standardization (ISO) certification and you learn that the auditor hates to hear complaints about the time that audits take. What do you do?
- You just learned that the Securities and Exchange Commission (SEC) is beginning a series of audits on firms just like yours. What do you do next?
Answers to questions like these reveal characteristics of the risk culture because they indicate whether the organization is effectively prepared to manage risk in a variety of areas. They also reveal whether different individuals manage the same risks in the same way, which indicates the level of consistency across the organization. Finally, they provide a framework or context that is far more powerful and in-depth than simply asking, "How will you handle risk processes?"
ESSENTIAL ELEMENTS OF A RISK CULTURE
To know what's missing from an organization's risk culture, one must first determine which elements are required. If, for example, an organization has the financial wherewithal to withstand virtually any risk, then setting up cost thresholds need not be considered a missing element. Next, one must identify which of the required elements are already in place. Most organizations' risk cultures are made up of consistent terminology, thresholds, triggers, mandatory practices, and controls. Within each of these groups, a subset of practices determines how the elements will be implemented.
Consistent Terminology and Risk Language
To identify whether a risk language exists within an organization, the test would involve terminology. Can staff members define "risk" consistently? Can they explain the organization's risk strategies? Can they define risk processes in consistent terms? Can they tell you the difference between the likelihood and impact of these risks? If not, a common risk language may not exist. This doesn't mean that staff members don't know how to discuss organizational risk, but rather that they are unable to use consistent terminology and are unable to consistently document the risks the organization faces as well as those risks that are truly significant.
Tolerances, Thresholds, and Triggers
Tolerances are the limits of organizational behavior. They are the bridges an organization collectively will not cross.
Thresholds are perhaps the most common and easily established component of a risk culture. They answer the question "How far will the organization go?" Most organizations have limits on costs, schedules, employee behavior, cultural awareness, community involvement, and a host of other considerations. Oddly enough, these limits are rarely publicized and are instead passed from employee to employee as part of an organization's oral tradition. New hires learn the risk culture by virtue of negative experiences. By encountering a problem and being chastised for it, new employees are inducted into the culture. The lack of documented thresholds raises two concerns: first, thresholds may not be communicated throughout the organization; and second, thresholds, expressed as a consistent set of values, may not exist.
To determine whether staff members are aware of and understand their organization's thresholds, ask various team members about the organization's limits. Which behaviors would and would not be accepted? What extent of a behavior is required for an employee to gain management attention or to be fired? What levels of expense are required to expedite an issue through the management hierarchy or to prompt management reporting? If staff members cannot provide an answer, the thresholds do not formally exist. If employees provide inconsistent answers, thresholds are not effectively communicated enterprise-wide.
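The two diagnostic rules in the preceding paragraph can be sketched as a small check. The Python below is a minimal illustration under stated assumptions: the function name `diagnose_thresholds` and the sample dollar figures are hypothetical, and the handling of the mixed case (some staff answer, others cannot) is an assumption that the text does not address.

```python
def diagnose_thresholds(responses):
    """Classify answers from several staff members to the same threshold question.

    Each entry is the limit the person stated, or None if he or she
    could not provide an answer.
    """
    given = [r for r in responses if r is not None]
    if not given:
        # Nobody can answer: the threshold does not formally exist.
        return "thresholds do not formally exist"
    if len(given) < len(responses) or len(set(given)) > 1:
        # Inconsistent answers (or, by assumption, partial answers):
        # the threshold is not communicated enterprise-wide.
        return "thresholds are not effectively communicated"
    return "threshold is known and consistent"


# Hypothetical question: "What expense level must be reported to management?"
print(diagnose_thresholds([None, None, None]))
print(diagnose_thresholds([5000, 10000, 5000]))
print(diagnose_thresholds([5000, 5000, 5000]))
```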
Triggers are the specific warning signs that indicate a threshold is about to be breached.
Mandatory Processes
Mandatory process is the cultural element easiest to identify as missing. Just as time sheets or time reports must be turned in on a given day, certain processes are required; they are the essentials of doing business. Mandatory processes are important because few people have the authority to circumvent them. When it comes to risk, these steps are the "ounce of prevention" in place to ensure that certain risks are dealt with in certain ways. They also ensure communication among the involved parties when it comes to risk.
Risk Reporting and Controls
Everyone in an organization must follow certain behaviors. Governmentally mandated behaviors are normally universal, as are particular administrative practices dictated by influences external to an organization (such as health insurers and certain large-scale clients). Note that consistent behaviors tend to stem from external rather than internal forces and demands. There is more internal structure in some organizational cultures than in others. And in best practice risk organizations, internal risk management practice involves a measure of structure. If these practices are in place, consistent behavior will prevail, such as notifying management about potential problems and documenting information about those concerns. If these practices are not in place, ad hoc determinations will define when risks have (or have not) escalated.
Risk controls are the "thermostats" of the risk management process. A thermostat is put in place to ensure that the temperature of a facility is relatively consistent without requiring much individual attention to the controls -- a "set it and forget it" attitude. In a perfect corporate world, the same would apply to risk controls. Once these controls are in place, individuals can simply wait until a risk potential becomes too great and the systems activate to address the problem at the appropriate level. As discussed previously, in order for such a failsafe process to function, thresholds must be in place, and the response to those thresholds should be consistent. Moreover, an organization can have risk thresholds without having effective risk controls, but not the reverse. Without risk thresholds, controls cannot exist. If there are thresholds but inconsistent response to them, then the controls do not exist. Controls exist only in environments with clear risk thresholds to which everyone responds in a consistent fashion.
In a world where risk controls are operating effectively, staff members can define risk terms consistently, identify common organizational risk strategies, and identify the amount of risk the organization can tolerate. In addition, staff can consistently follow the same processes and predict organizational risk response behaviors.
RISK ELEMENT INCULCATION AND IMPLEMENTATION
The Language of Risk and Consistent Terminology
As established previously, a significant component of any culture is language. The language of risk is particularly challenging, since most business cultures are risk-averse and therefore reticent when it comes to taking on risk. To acknowledge risks, some contend, is to invite them and to create a self-fulfilling prophecy. But like Webster's Dictionary -- the first American dictionary of significance, which was largely responsible for creating a shift in American culture away from British culture -- an organization must create a language that is distinct from existing terminology in order to develop a new culture [11].
Whenever a risk culture becomes an organizational imperative, establishing a common language is one of the first priorities so that those who serve the organization can talk about it intelligently. Classic examples can be found in US Government Accountability Office (GAO) reports on various types of risk. In page after page of its 2001 report to Congress on Chemical Risk Assessment, GAO defines "risk-related tasks" within different agencies and clarifies cultural assumptions from one agency to the next [5]. Further, GAO clarifies day-to-day terms such as "assessment" and "hazardous" that can take on different dimensions when risk is at the heart of the discussion.
In order to explore the organization's capacity to speak the same risk language, ask different staff members the same set of questions:
- What risk does this organization regularly confront?
- Is there a way to avoid this risk? If so, how?
- Who is responsible for the presence of this risk?
Some keywords in these questions drive different responses. Those words that will be interpreted in different ways by different people are "risk," "avoid," and "responsible." They are classic examples of basic terminology that should be organizationally consistent but instead is often subject to much individual interpretation.
In addition, simple terms often carry multiple meanings. For example, the term "risk" means different things to different people. For some, it refers to specific events that may happen to the detriment of an organization. It is a future phenomenon that has not yet occurred, as in, "We face the risk of being hit with another civil lawsuit, bringing down our stock price." In this case, a better term might be "risk event." For others, it refers to any prevailing issue that causes pain and anguish for the organization, as in, "We face risk from our understaffing." This usage reflects a common misconception about risk. Concerns that already cause pain and anguish are not risks at all; they are issues. They exist, and we have no need to wait for the future to see whether they occur. They are in play. For some, risk refers strictly to the financial. For others, it's only the dark unknown.
Consider the remarks of US Defense Secretary Donald H. Rumsfeld at a 2002 Department of Defense news briefing:
Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know. [4]
The media lambasted Rumsfeld for what was perceived as unintelligible speech. But in some risk cultures, particularly that of the Defense Department, Rumsfeld's assessment was on the mark. In its parlance, the known knowns are the risks for which there is a degree of awareness sufficient to establish probability and impact. These risks can be dealt with through a clear strategy. The known unknowns often carry a high degree of uncertainty in that there is limited information available about the likelihood of occurrence but a clear awareness that they're there. The unknown unknowns are risks for which there is no awareness or information. They are the proverbial "bolts from the blue." While the media had difficulty interpreting Rumsfeld's remarks, his peers in the military should have known precisely what he was talking about.
The military has a language for risk and understands that when staff members identify a "risk," they zero in on a future phenomenon for which there is a degree of impact and a level of probability or uncertainty. The term has meaning. It allows for more easily facilitated internal communication and opens the door for a better understanding of how risks will ultimately be managed.
Creating a risk language starts with simple clarification. When terms are used at meetings and briefings, an organizational dictionary can be generated. By asking, "Is that a risk or an issue?" the discussion can begin on the distinctions between the two and whether the organization is trying to manage risk or address existing environmental concerns. If these distinctions are made frequently enough, the language becomes integral to the organization. To expedite this integration, one should apply the classic language practice of glossary and acronym tables. Further, it should be standard for risk-related reports to include a simple glossary of terms as an appendix, which opens the door for unified understanding. Like Noah Webster, the creator of Webster's Dictionary, the author of the appendix has an amazing opportunity to influence organizational culture for years to come.
Tolerances, Thresholds, and Triggers
Risk thresholds present a specific area where risk language can vary wildly even within organizations. The word "threshold" is often used loosely to cover two distinct concepts: tolerances and thresholds proper. Tolerances are the limits of organizational behavior. They are the bridges an organization collectively will not cross. Thresholds, by contrast, are the points at which certain behaviors should occur in order to address risk appropriately within the organization.
The absence of tolerances and thresholds is evident whenever a staff member gets into trouble for failing to notify management of a particular concern. "I wish you had told me about this" is a common office complaint. And the response "I didn't think it was important" is clear evidence of a failure to establish tolerances and thresholds that provide both guidance and a measure of autonomy at a variety of levels.
To establish risk tolerances, the operative question is, "What can the organization withstand in terms of the following factors?"
- The organization's self-image and public relations strategy
- The organization's safety and security
- The organization's environment
- Public health concerns
- Spending concerns
- Schedule issues and delays
- Contractual relationships
- Legal action
- Any other area of sensitivity for the organization
Setting risk tolerances is frequently no more involved than simply referring back to the last time organizational practices ground to a halt while major recovery and reviews were conducted. What forced the organization into this mode? How could staff members have identified that this situation was coming?
The answers to these two questions create clear risk tolerances. Some organizations have instituted policies to address these issues. For example, some organizations will not tolerate negative media exposure and, as a result, have a low tolerance for media communications. They instruct staff not to talk with the press or to engage in any public activities without the explicit consent of the communications office. For individuals who have worked in such an environment for some time, this policy just seems like common sense. Having a wary attitude toward media and other public sources seems so normal that these employees would never answer a press question without clearing it with the communications office. For a newcomer, however, such an environment may be totally alien. He or she may be accustomed to frequent, comfortable contact with peers in the media. If the risk tolerance is well publicized within the organization, a newcomer will know to change his or her behavior; if the tolerance is learned only through experience, however, he or she may have to make a significant mistake before understanding the organization's risk tolerance. While learning through discovery can make a long-lasting impact, so can a violation of an organization's policies.
Thresholds are a close cousin of tolerances in that tolerances help establish the thresholds beyond which an organization will rarely proceed. Thresholds are the warning track of risk management and the point at which specific behaviors are applied to ensure that the risk culture of the organization is not violated. If the organization has no tolerance for media exposure, thresholds may be established to ensure certain conduct by employees when dealing with the press. For example, the threshold may be press inquiries. There is no way to bar press inquiries, but they can serve as a threshold for managerial influence, which would translate into a directive such as, "Whenever someone from the press calls to ask a question, your manager and the communications office should be notified immediately. No answers should be provided without their prior consent." This threshold clearly indicates to personnel the limits of behavior and the appropriate response when the threshold is breached.
Many individuals would argue that common sense is not very common, and thresholds are a response to that contention. Thresholds ensure that everyone has a clear understanding of which behaviors and responses are and are not appropriate. They preclude the action of organizational "cowboys" whose rash behavior isn't in keeping with prevailing practice and culture. They also serve the significant purpose of establishing risk culture within an organization or even within a specific organizational practice.
As organizations try new approaches and procedures, they sometimes put personnel in a confusing position in which employees don't know how to respond. IT professionals with many years of experience who know the organization well may nonetheless have great difficulty in their first encounter working a trade show booth. The environment, the people they deal with, and the communication skills required are all radically different. In this situation, it's not uncommon for marketing personnel to say, "Questions about the pricing structure and delivery should come to me. Don't try to answer them for me." That's a threshold. It represents marketing personnel clarifying the division of labor by indicating which responsibilities do not belong to technical staff. It's easy to understand that if too many of these thresholds are established, they can become lost in a sea of information. And if no one tells a new staff member about the thresholds, that staff member can quickly land in trouble for violating organizational mores.
With tolerances and thresholds in place, it becomes possible to establish risk triggers. Triggers are the indicators that a threshold or tolerance is about to be breached. They are the early warning signals that the organization is drawing precariously close to a specific level of concern and, in many cases, that some action is appropriate. When a person's cholesterol level exceeds 200, most doctors now notify the patient that he or she exhibits a risk factor. If his or her cholesterol level exceeds 250, doctors insist on medication to reduce the cholesterol level to avoid the risk of a heart attack. These levels (which have varied over time) are the triggers at which physicians recommend action to keep a heart attack "tolerance" from being breached.
Similarly, organizations can establish triggers that recommend action in the event of an impending tolerance or threshold violation, thereby empowering personnel to take action in response. The key is that triggers should reflect the sense of urgency within the organization and the degree of concern that's associated with the trigger, if not the threshold as well.
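The cholesterol example shows the general shape of a trigger: a measurable level paired with the action expected once that level is exceeded. A minimal Python sketch follows. The `Trigger` record and the `fired_actions` function are hypothetical names introduced for illustration; the two levels come from the example in the text and are not medical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    level: float  # reading above which the trigger fires
    action: str   # response expected once it fires


# Levels from the cholesterol example; as the text notes, they have varied over time.
CHOLESTEROL_TRIGGERS = [
    Trigger(200, "notify the patient of a risk factor"),
    Trigger(250, "prescribe medication to reduce cholesterol"),
]


def fired_actions(reading, triggers):
    """Return the action for every trigger the reading exceeds, lowest level first."""
    ordered = sorted(triggers, key=lambda t: t.level)
    return [t.action for t in ordered if reading > t.level]


print(fired_actions(180, CHOLESTEROL_TRIGGERS))
print(fired_actions(230, CHOLESTEROL_TRIGGERS))
print(fired_actions(260, CHOLESTEROL_TRIGGERS))
```

An organizational trigger list could take the same form, with each level tied to a documented response so that personnel know what action they are empowered to take.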
Mandatory Practices: Preparing Personnel
The term "mandatory" is often seen as a negative in today's business climate. It suggests a lack of empowerment and the limiting of freedom of those who might take alternative approaches, but nothing could be further from the truth. In fact, mandatory processes and procedures open the door for personnel to make intelligent decisions based on what management can and cannot handle. Mandates afford managers and leaders the ability to put personnel on remote control. If mandates are in place and enforced, and if they support the tolerances and thresholds of the organization, their clarity should prevent personnel from balking at their implementation.
In his book True Professionalism, David Maister examines how people can respond effectively to a risk warning [8]. According to Maister, in the face of risk, some steel themselves for potential problems; others disintegrate. Maister subscribes to the business principle that employees perform most effectively when they know their objectives and know how they fit into the big picture, when they receive the necessary information to do the job right, when they get feedback on current progress, and when they feel like they're part of a team.
The same principles apply to risk management. Team members must know their risk objectives. They must have a clear sense of how much the organization will tolerate in terms of risk and when it's time to pull back. According to management expert Peter Drucker, objectives must be unambiguous, realistic, and collectively crafted and understood between personnel and management [3]. They must also be clearly stated. For example, the directive "Avoid safety risks at all costs" is relatively weak and unclear. It raises questions such as "Which safety risks?" and "Does this really mean at all costs?" A clear risk objective statement is carefully worded, such as, "When a safety risk becomes evident that could waste time, management must be notified within 24 hours." This language tailors the risk, clarifies the course of action, and validates the relative magnitude of the potential problem.
Given a specific objective, employees know readily if they have succeeded or failed in serving the objective. Such awareness is crucial to success in creating an inclusive risk culture where personnel feel like they are part of the process. As Maister suggests, knowing one's role in the larger organizational scheme is important. But creating clear roles is a greater challenge than it might appear on the surface. Specifically, some organizations' cultures encourage intense individualism. This "cowboy mentality" flies in the face of effective risk management because it encourages individuals to make risk decisions without structure or advance approval [9]. If such rampant individualism extends to an organization's risk management, no individual can understand his or her role in the organization's approach to risk because the organization lacks a risk scheme.
During the 9/11 attacks, a company occupant of the World Trade Center towers, Morgan Stanley Dean Witter, provided a classic example of steadfast adherence to clarity and consistency as risk management strategy. In an article for the Washington Post, journalist Michael Grunwald recounts the risk management practices of the late Rick Rescorla, Morgan Stanley's VP for corporate security. A zealot for following process, Rescorla had drilled his personnel on following procedure. When the World Trade Center was struck, "he immediately ordered an evacuation of all 2,700 employees in Building Two, as well as 1,000 Morgan Stanley workers in Building Five across the plaza. They walked down two stairways, two abreast, just as they had practiced" [6]. According to Grunwald, Morgan Stanley had a high survival rate relative to other companies at the World Trade Center because Rescorla developed "an automatic flight response ..., [burning] it into the company's DNA." Brokers hated the fire drills and staff would try to stay behind, but risk practices had become part of the culture and of employees' day-to-day routines. They lived with it, and they survived because of it.
In order to lay the groundwork for personnel to feel included in the process, an environment of proactive risk management (rather than fire fighting) must prevail. In a fire-fighting environment, individualists take the reins whenever a problem or serious concern surfaces. In the worst case for organizations striving for consistency, firefighters are frequently rewarded and praised, while those who work proactively are relegated to the background. In such an environment, individualism proliferates and generates even more individualistic behavior. As one manager put it, "Anytime an organization has more than a couple of firefighters, at least one of them is an arsonist." Establishing the risk structure requires creating an environment in which fire prevention, not just heroism, earns rewards.
But doing so means that people are rewarded simply for following process and protocol. It means that those who "come to the rescue" must document their activities and provide rationale so that others may repeat them. It means that when reviewing individual performance, human resources departments must consider those individuals who simply follow the rules in a more laudatory light than they currently do. The greatest challenge is avoiding the mountains of praise and reward generally associated with a heroic effort and instead directing such praise at those whose work doesn't require heroism but steadfast adherence to procedure.
Furthermore, effective risk strategy is based on the understanding that managing risk is not the exclusive province of the executive suite. For ownership of a risk culture to evolve, practices must become built in through staff input; staff must have a stake in the game. For example, the first passengers to realize that the Titanic was in trouble were those below the water line. Personnel who deal with risk on a daily basis are best able to discover an organization's propensity for risk because they can identify where processes are most likely to fail and where the organization is most vulnerable. These employees understand the organization's processes, capabilities, and limitations at a grassroots level. By encouraging these people to catalog risks and by affirming their perspective, it's more likely that they will accept the input and insight of others who have engaged in the same process.
Implementing such an effort in an IT environment poses certain challenges. Specifically, the widely varied skill levels at the grassroots create a disparity in perceptions of such an effort. Because of the creative nature of much IT work, some may balk at the imposition of structure. Because of this potential resistance, it's important to inculcate risk practice as integral to daily work habits rather than as a separate administrative process.
Preparing personnel for new structures is analogous to adjusting to new habits when you buy a house in a new town: you quickly learn which behaviors are acceptable and which are not. Untrimmed hedges may be all right in one part of town; while in another, neighbors will not take kindly to a forsythia gone wild. Police may tolerate a 10-mile-per-hour violation on one street; while on another, it's akin to a capital offense. The locals who have lived there for years know and understand the rules; new residents do not. In this situation, the best friends of new residents are those who take a few minutes to outline where they can avoid headaches and how they can best fit into the new community. While explaining such "rules" can be awkward, the conversation also proves invaluable for expediting the acclimatization process.
Using Triggers and Thresholds to Establish Risk Strategy
Nowhere is this principle more evident than in how and when an organization accepts or avoids risks. Acceptance, the most commonly applied risk strategy, deems some risks acceptable and not requiring further action before they may occur. Prior to the occurrence of these risks, there's no real need for action or investment. Daily, most people accept the risk that they could be killed behind the wheel of their car, despite the fact that car accidents are a leading cause of death among adults in the US. Most people accept the risk because they believe they have a measure of control over this assumed risk and that medical technology would come to their rescue after the fact. They also accept this risk because it is culturally appropriate to accept it; most drivers accept it. Yet most drivers will not accept the risk of driving while intoxicated. When they know they've had too much to drink, they hand someone else the keys or call a cab.
Drunk driving laws clearly establish thresholds (on a state-by-state basis): 0.08% blood alcohol content (BAC) is the legal threshold for drunk driving throughout most of the US. That threshold has been moving steadily downward since 1982, when most states set the limit at 0.10%. With such a threshold, it is relatively easy to tell when someone is inching toward a higher BAC: he or she is consuming more alcohol. Even without a meter, it's relatively easy to discern when someone has potentially had too much to drink: there are multiple (i.e., more than two) beer cans at his or her side. Note that now there is not only a threshold but also a trigger to indicate that the threshold may be violated.
In the business world, and particularly in the IT world, projects sometimes move forward inexorably, consuming resources and time. Customers become frustrated to the point of exasperation. By the time they hit the organizational tolerance of "We don't do things that will cost us customers," it may be too late. Setting a threshold (and identifying triggers to indicate that the threshold is about to be breached) gives personnel the ability to act independently. Staff can know it is time to take action, and they can take action more independently. They don't need advisories on when or how to act. It's become part of the culture.
Since we don't want to unleash a plethora of thresholds and tolerances that overwhelm our personnel, IT executives should select those occurrences that they absolutely cannot tolerate; the list should comprise the elements of organizational culture that put the strategies, protocols, and practices of the organization (if not the organization itself) in jeopardy if they are violated.
By way of example, the Mohave Generating Station at Southern California Edison (SCE) has a very low tolerance for injury during its annual boiler outages. The company takes extraordinary measures to ensure that no one is injured during these multiweek events involving hundreds upon hundreds of workers. The tolerance for unsafe behavior is zero. Thus, if any lost-time accidents occur, even if they are extremely minor in nature, a threshold has been crossed. Managers are notified as soon as these events happen. Investigations are conducted. Specific actions are taken. Why? Because the outcome of anything more serious is far beyond the organizational level of acceptability. The key is that resident personnel, vendors, consultants, and contractors all know where the lines are drawn. The Edison procurement staff incorporates language to protect Mohave in its contracts. Personnel undergo safety training time and again. Posters throughout the facility depict critical safety issues. Is this an environment in which the line of acceptability is crossed? Yes, albeit accidentally. Does anyone in the organization believe that management willingly accepts such behavior? No. Why? Because ensuring personnel safety has become part of SCE's culture. It's part of the organization's environment. It's the way in which it does business. Are there many such thresholds in place at the generating station? No, because it would undercut the emphasis and impact of those that are most important.
Determining the appropriate areas of risk concern should be an investigative process that takes one no further than the last annual report or the last major internal investigation. What issues have cost the organization dearly in the past? What concerns have surfaced and captured the time and attention of executive boards? What issues dominate the reviews, analyses, and reports of senior management? The list is generally not a long one. In a given year, typically only a handful of problems resurface with sufficient frequency to be considered part of the culture.
After 9/11, physical plant security and technical security across the US and around the world became major issues. Prior to the attacks, minor virus threats might not have escalated beyond the walls of a network administrator's office. In the days following 9/11, however, each security breach became a major ordeal for many organizations. No matter the scale or scope, concerns regarding security were escalated. Thresholds were tightened dramatically. Over time, different concerns drive different behaviors. For some organizations, the three years that have passed since the 9/11 tragedies have served to dull their sensitivity to security issues, while for others the focus remains. The key is to communicate an understanding of the organization's priorities and how to detect threats to those priorities.
In order to establish detection methodologies, consider what might serve as an accurate early-warning system. For some major IT organizations, Internet volume might provide the metric. For others, help-desk call volume may be the indicator. In still others, calls or visits from major clients might be the key. The critical element is to identify triggers that the average staffer can readily detect and triggers that don't instantly point an accusatory finger at either the person reporting the episode or at his or her peers (see Table 1).
Table 1 -- Sample areas of concern, with the threshold and a sample trigger for each.
| Area of Concern | Threshold | Sample Trigger |
| Environmental pollution | We won't do anything that causes permanent harm to the ecosphere. | Any spills of nonbiodegradable material will be escalated to the supervisor level. |
| Negative press | We won't take actions that the public might perceive as unethical or politically incorrect. | Contact from regulators, media representatives, concerned citizens, or lawyers will be escalated to the VP level or to the communications office. |
| Physical attack on the facility | We will not tolerate physical threats to personnel in our facilities. | Any suspicious material identified by the metal detector and confiscated will be logged and reported to the security director. |
Note that triggers have different types of escalation. While materials confiscated at the entryway to a physical plant may simply be logged in a report for later review, spills and media contact are escalated far more rapidly through the chain of command. Some triggers require a higher level of sensitivity; some do not. The key is that triggers should reflect the sense of urgency within the organization and the degree of concern that's associated with the trigger, if not the threshold as well.
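The relationship between an area of concern, its threshold, and its trigger can be sketched as a small lookup structure. The following is only an illustration, assuming Python; the rows come from Table 1, while the event keys, the `Urgency` values, and the `route` function are hypothetical names introduced for this example rather than part of any established tool.

```python
from dataclasses import dataclass
from enum import Enum


class Urgency(Enum):
    # Hypothetical urgency levels reflecting the two escalation styles in the text.
    LOG_FOR_REVIEW = "log for later review"
    ESCALATE_NOW = "escalate immediately"


@dataclass(frozen=True)
class Trigger:
    area: str         # area of concern (Table 1)
    threshold: str    # what the organization will not tolerate
    notify: str       # who is told when the trigger fires
    urgency: Urgency  # how quickly the report moves up the chain


# Keys are hypothetical event names; the contents paraphrase Table 1.
TRIGGERS = {
    "spill": Trigger(
        "Environmental pollution",
        "No permanent harm to the ecosphere",
        "supervisor",
        Urgency.ESCALATE_NOW,
    ),
    "media_contact": Trigger(
        "Negative press",
        "No actions the public might perceive as unethical",
        "VP or communications office",
        Urgency.ESCALATE_NOW,
    ),
    "confiscated_material": Trigger(
        "Physical attack on the facility",
        "No physical threats to personnel",
        "security director",
        Urgency.LOG_FOR_REVIEW,
    ),
}


def route(event: str) -> str:
    """Tell a staff member what to do when they observe an event."""
    trigger = TRIGGERS.get(event)
    if trigger is None:
        # No threshold is at stake, so the staffer handles it locally.
        return "no trigger defined: handle locally"
    return f"{trigger.urgency.value}: notify {trigger.notify} ({trigger.area})"
```

Calling `route("spill")` returns the instruction to escalate to the supervisor, while an unlisted event falls through to local handling. Keeping the table this short mirrors the point that only a few thresholds should carry this weight.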
The emphasis here is that some triggers must be in place and some thresholds must be clarified. The organization's areas of significant concern can be addressed only if personnel have a clear vision of their role in addressing them and in participating in the risk management process.
Risk Reporting and Controls
If risk has a bad reputation for being depressing and pedantic, risk reporting has this reputation in spades. The only task worse than trying to predict which bad things may happen to the organization and alerting others to them is reporting on those predictions. It takes a process that involves some potentially negative information and then forces consistent cataloging and documentation of that information. It sounds like a process doomed to frustration.
But in fact, there's a bright side to risk reporting that's often overlooked: we can catalog what was done correctly and which steps taken were proactive in dealing with potential problems. In this sense, one should view risk reporting from a public relations perspective. Risk reporters have the opportunity to achieve efficacy and then to inform others about what was achieved. Much of the success or failure of risk reporting comes in the presentation rather than the substance of the message.
Specifically, personnel should be advised that we expect to know when they have accomplished great deeds. Just as management should clarify when risk strategies such as acceptance and avoidance are the norms, the message should also go out when it is time to report back on such activities. Policy on risk reporting should clearly state when reporting should occur and under what circumstances. For example, when the space shuttle Challenger exploded, NASA went into high gear in terms of risk reporting. The agency put its performance under a microscope from both a human and a mechanical perspective. The actions of employees were called into question. The performance elements of the vehicle were intensely scrutinized. Virtually no stone was left unturned [12].
NASA began its extensive period of self-scrutiny because the risk environment had changed: a vehicle had exploded. The earlier rules of risk management were radically altered because of the magnitude of the event. Environmental change goes beyond simple tolerances. It's a moment when the foundational assumptions of risk practice may have changed. Environmental change is a main reason for conducting a risk reassessment and review. This is the time when staff should be expected to report on their approaches and how well these approaches function. The question is, "What constitutes environmental change?" Is it when something explodes? When something begins to act anomalously? When a system shuts down? Or when a potential problem has been reported for the second or third time? These various understandings of environmental change are significantly different, and if management does not determine the thresholds for reporting, individuals must take it upon themselves to make the assessment. When staff members act independently to determine that it's time for reporting or reassessment, it can lead to inconsistent, sometimes hazardous, outcomes. Some staff will respond with the belief that they should report virtually everything to management, while others will take the perspective that no report is a good report.
Change is a good indicator that there's a need for risk reporting, but we should also expect reports on a regular cycle. In some industries, different times of year carry different burdens. Risk assessment in retail-driven industries might occur in late September as these businesses gear up for the fourth-quarter holidays. For those in the federal sector, risk assessment might occur in May or June in preparation for government agencies' end-of-year spending spree. The key is to establish a period prior to critical moments in the organization's business calendar to conduct an assessment of what should be happening and what has happened risk-wise.
Risk evaluations and assessments should look at both successes and failures. If dozens and dozens of specific, nonfinancial business risks were identified and none came to pass, that information should be logged. It helps establish probability metrics for the future and clarifies why the organization might wish to invest less time and energy in these efforts in the future. If specific negative events occurred, and we took no action to preclude them from happening, this too should be logged, providing insight on where we should invest time and energy in the future. If risks came to pass and we were ready for them, or if there is clear evidence that a deployed strategy was effective, it should be logged. Such achievements clarify how and when we came up with the best ideas.
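The three kinds of log entries described above can be illustrated with a minimal sketch, assuming Python. The `RiskEntry` fields, the category labels, and the sample risks are hypothetical, and the observed occurrence rate is only a crude stand-in for the "probability metrics" the text mentions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEntry:
    name: str
    occurred: bool  # did the risk event come to pass?
    prepared: bool  # was a strategy in place before it happened?


def classify(entry: RiskEntry) -> str:
    """Place a logged risk into one of the three outcome categories."""
    if not entry.occurred:
        return "identified, did not occur"
    if entry.prepared:
        return "occurred, strategy in place"
    return "occurred, no action taken"


def summarize(log):
    """Count entries per category and compute the share of risks that occurred."""
    counts = Counter(classify(entry) for entry in log)
    occurred = sum(1 for entry in log if entry.occurred)
    rate = occurred / len(log) if log else 0.0
    return counts, rate


# Hypothetical sample log for illustration only.
sample_log = [
    RiskEntry("Key vendor misses delivery", occurred=False, prepared=True),
    RiskEntry("Data center power loss", occurred=False, prepared=True),
    RiskEntry("Virus outbreak on file server", occurred=True, prepared=True),
    RiskEntry("Lead developer resigns", occurred=True, prepared=False),
]
counts, rate = summarize(sample_log)
```

Logging all three categories, not just the failures, is what keeps the record from becoming a sterile list of negative events.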
For many organizations, the tragedy is that personnel will often keep those risks in the last category to themselves. Some individuals perceive their ability to serve as the resident miracle worker more valuable (and valued) than their ability to communicate how others can prevent similar incidents from occurring. In such environments, the risk log will be a sterile document that simply lists negative events that occurred. As discussed previously, laying the groundwork with personnel is the only means by which such an environment can be avoided.
Categorization and Classification
The beginning of established, ingrained risk practice is to start the conversation and to build the glossary. Terms and terminology are everything to a nascent risk practice. Just establishing the difference between a risk (a future phenomenon that may be a detriment to the organization) and an issue (a risk whose time has come) can help clarify why some staff members act the way they do. Language can be passed down orally or in text, but it needs some measure of formalization.
The French have taken this to the extreme. The Académie française, a centuries-old institution devoted to the preservation of the French language, ensures that words are used only in approved ways in "true" French language. While such an approach may seem draconian, it ensures the maintenance of a culture. In the organizational environment, we may not have a similar language police, but we can use daily interaction, memoranda, e-mail, and other forms of communication to reinforce the appropriate use of language. When someone sends an e-mail inquiring about an issue, there's nothing wrong with querying his or her intent, as in, "Do you want to address this as if it has already happened? Or is it a 'risk,' still out there, waiting to happen?" Small clarifications are first steps down the road to shared understanding and culture.
If language is not the ideal starting point in your organization, consider the elements that are in play. Have tolerances been established? Do personnel know which actions and activities would be considered unacceptable? By simply identifying those components throughout the organization, staff members discover what is and isn't important in terms of tolerances, capabilities, and management action. Simply laying out thresholds or triggers enables team members to take risk actions independently, paving the way for consistency in risk practice.
The Software Engineering Institute (SEI) has already taken a long hard look at IT organizations. In its Taxonomy-Based Risk Identification, SEI categorizes software organization risks into three groups: product engineering, development environment, and program constraints [1]. Within each area, categories and subcategories further highlight specific areas of potential concern. But even the 65 subcategories provide no indication of the specific propensities of a given organization. The SEI model, for example, would not highlight Microsoft's history of challenges regarding security in its Windows interface. The model does not give weight to the categories; it only identifies which categories exist.
Checklists, Surveys, and Other Strategies
Thus it falls to the organization to emphasize specific risk areas by defining those that are most common. What's the best resource for identifying these risks? History. Where has the organization slipped in the past? Where does the organization have a history of pain and suffering? Through simple survey techniques or one-on-one interviews, such information can be drawn out of personnel at all levels. The most honest results can be achieved by posing questions that are open-ended and subject to a modest amount of interpretation. Personnel will respond more positively to the process if the questions are couched in a positive fashion. Note that the questions provided here emphasize prevention rather than history. It's also important for the questions to place the lion's share of the burden for such preventive action on the shoulders of management rather than on line personnel. Team members do not want to be perceived as negative, but they do want the ability to raise warning signals when those warnings are appropriate and in the best interests of the organization. Here are some sample survey questions:
- If you could name three things that you believe hinder the organization from achieving excellence on a regular basis, what would they be?
- For management to act proactively, about what specific concern should it be vigilant?
- In the next year, what's one problem that organizational management should watch for?
While open-ended survey questions are one approach to drawing out a basic understanding of an organization's risk, they can create such disparate responses that it becomes impossible to reconcile the list down to a reasonable few categories of concern. If that's the case, it may be more appropriate to assemble a larger laundry list of risks (such as the SEI taxonomy) and simply ask key personnel to select the 10, 15, or 20 that are most germane to the organization. The advantage of this approach is that it takes far less time to complete than a more open-ended survey. For the same reasons, however, it may generate less thoughtful responses than the longer-form approach.
The checklist approach also creates the possible perception that management has stacked the deck to ensure that its areas of concern are highlighted while additional important or specific risks and risk areas are relegated to the "other" category. Still, the approach at least provides a sense of focus as to which categories drive the organization's risk culture. Here is a sample set of categories that could evolve into a checklist for an organization's areas of greatest concern:
- Requirements
- Design
- Testing
- Security
- Configuration management
- Management support
- Procurement
- Working conditions
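Tallying checklist responses is straightforward, as the following sketch suggests. It assumes Python and uses the sample categories listed above; the `rank_concerns` function and the sample responses are hypothetical. Write-in answers are kept separately rather than discarded, as one possible way to address the concern that a checklist sidelines risks management did not think to list.

```python
from collections import Counter

# The sample checklist categories from the list above.
CATEGORIES = [
    "Requirements",
    "Design",
    "Testing",
    "Security",
    "Configuration management",
    "Management support",
    "Procurement",
    "Working conditions",
]


def rank_concerns(responses, top_n=3):
    """Rank checklist categories by how many respondents selected them.

    `responses` is a list of lists, one per respondent. Returns the top_n
    (category, count) pairs plus a sorted list of write-in items that are
    not on the checklist.
    """
    known = set(CATEGORIES)
    tally = Counter()
    write_ins = set()
    for selected in responses:
        chosen = set(selected)        # count each category once per respondent
        tally.update(chosen & known)
        write_ins.update(chosen - known)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(tally.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n], sorted(write_ins)


# Hypothetical responses from three staff members.
responses = [
    ["Security", "Testing"],
    ["Security", "Requirements"],
    ["Security", "Testing", "Offshore handoffs"],
]
top, extras = rank_concerns(responses, top_n=2)
```

Here `top` lists Security first and Testing second, and `extras` preserves the one write-in for follow-up.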
No matter the approach, it's important not to discount the output. Sometimes it's tempting to believe that an organization has become adept at dealing with certain risk areas or has become inoculated to the effect of these risks, but this is rarely the case. If such an attitude prevails, any hope of resolving or eliminating these risks over the long term is lost.
The US Army found a creative alternative to looking into a crystal ball through its development of the Future Years Defense Program (FYDP) [10]. Rather than have staff look at the existing organization and ask, "What should we worry about?" the Army raised the question in a different fashion. Dubbed "Force 21," the project created a hypothetical army that was capable, effective, and lean and met all the challenges of the future battlefield. As former Army Chief of Staff Gordon Sullivan put it, "By disassociating Force 21 from the existing processes, it was possible to begin the journey." IT organizations can learn from the Force 21 approach by allowing personnel to envision a future culture of success rather than focusing exclusively on existing concerns. By understanding areas of concern, an organization can narrow the focus of its efforts to the most frequent and damaging events. With those areas clearly identified, members of the organization can begin speaking the same risk language.
Using History to Understand Future Risks
The applicability and utility of these efforts to build a risk infrastructure ultimately rely on the development of a risk history. Organizations cannot ensure consistency of practice without awareness of what happened in the past, and for some, a component of the organization's culture is stories -- that is, its history. Some organizations can cite past acts of heroism and survival. Others can tell tales of woe and despair. These stories convey a sense of the history and risk culture (what is and is not tolerable, what is and isn't worth worrying about), but they don't convey this information consistently. Consistency requires documentation that is both accessible and accessed. Only a handful of organizations have their risk history clearly documented and publicly available. Books such as Showstopper!, which details the history of Microsoft's efforts to create Windows NT, provide in-depth details of corporate operations, limitations, and actions in the face of adversity [13]. But few organizations have their histories available on Amazon.com. For most, garnering that information involves reliance on oral histories and extracting snapshots of the past from personnel with the longest tenures.
Organizations can capture their histories through various media and with a variety of existing tools. Corporate newsletters can be powerful vehicles to promote and capture well-managed risks. A few lines of text in the corporate record can preserve the essence of how the thresholds, tolerances, and generalized strategies have served the organization well.
Computerized archives are also swelling in popularity. Some organizations have opted for tools such as Integrated Computer Engineering's Risk Radar (see note 1) to capture their experiences. These tools encourage personnel to log information in a corporate database to render it searchable by category. The downside of such repositories is that they require some level of administration and are best maintained by those who have experience with library classification systems. They are often constructed with the best of intentions, but when resources run thin, organizations may allow them to lie fallow. The only risk databases that will survive over the long term are those that are maintained according to a maintenance plan, as discussed below.
Myriad risk tools are at an organization's disposal. Decision tree software such as Palisade's PrecisionTree (see note 2) and Monte Carlo tools such as C/S Solutions' Risk+ (see note 3) are extraordinarily tempting as a means to analyze risks and express them "scientifically" through a sea of mathematical data. Some organizations seek to integrate their organizational risk with financial tools to determine the magic equation that will result in consistently "safe" business outcomes. But if the organization lacks an entrenched risk culture, such initiatives put the proverbial cart before the horse. Without consistent language on the nature of risk and risk events, attempts to capture this information using quantitative tools ultimately yield a confusing blend of apples and oranges. If the organization cannot consistently determine which risks will be accepted or avoided, then mathematical consistency of outcomes is impossible.
The temptation, of course, is that many of the more advanced risk tools are readily available, require little organizational acculturation, and provide attractive outputs. The downside is that these outputs have limited value, particularly for organizations that cannot honestly identify their tolerances in terms of corporate, staff, vendor, or client behavior. While mathematical models may appear to offer a high level of confidence that a particular strategy will succeed, if these models' underlying assumptions don't stand the test of consistency, neither will the models themselves.
The best initial steps forward take the organization down the road of clearly understanding what risk terminology means, what tolerances and thresholds exist, and how the risk conversation will be conducted.
Risk Management Maintenance
Over time, risk practices either flourish or lie fallow. The key to ensuring a vibrant, thriving risk culture is constant reiteration and reinforcement. The attitudes and perspectives that management strives to inculcate into the organization must become part of the daily routine. The ability to refresh the culture is directly correlated to the ability to capture an organization's risk history. If risk histories are well established, it's easier to tell what's no longer a high priority on the list of risks and what's new.
A classroom setting in Denver, Colorado, USA, provides a useful example of how reinforcement aids risk objectives. In a Defense Finance and Accounting Service (DFAS) class I taught, I asked participants whether they could identify their professional mission or vision statement or aspects of these objectives. Immediately, each individual in the room reached into his or her pocket and pulled out a small blue card. In unison, they began almost to chant: "To provide responsive, professional finance and accounting services for the people who defend America."
Then they put the small cards back in their pockets. It was an amazing professional moment. Every person in the room knew the mission. They knew what management cared about. They had the words at their fingertips. There was no doubt, no hesitation about what had been reinforced. Management had found a way to get consistency from its personnel and to get everyone, at the very least, to recite the same gospel.
In risk cultures, the objective may not be to get personnel to carry around small cards listing management thresholds and triggers, but the most desirable outcome is behavior similar to that of the DFAS students. When asked, "What will your management not tolerate?" all personnel should have the same answer. When asked, "When will you alert your management that a risk is on the horizon?" staff should share a common vision as to the severity of the risk. When someone identifies a risk as high, the company's definitions of "high," "medium," and "low" should resonate for every employee in a similar fashion.
Part of establishing an effective risk culture is to ensure openness and candor in discussions. Organizations often quash any hopes of an effective risk program by acting punitively when risks are brought to the fore. The classic syndrome of killing the messenger can wipe out the risk discussion. By setting effective tolerances and triggers, and reacting to them consistently, management sends a clear message about how and when the risk discussion can take place. It becomes possible to tell staff which discussions do and do not need to be escalated. In an organization with a healthy risk culture, field personnel know they are empowered to handle the "small" risks, while risks of greater significance must be escalated to higher levels in the organizational hierarchy. Staff isn't confused about whether or not escalation is appropriate; it's just a matter of identifying the risk and knowing the culture.
In such an environment, management's response largely determines whether or not the culture survives. If staff members are confident that management will provide advice, guidance, or authorization when concerns are raised (at the appropriate level), these concerns will be raised. By contrast, if staff expects management to respond with a "Why didn't you handle this?" attitude, the culture will die on the vine.
Refreshing a risk culture is equally important. Just five years ago, terrorist threat risks were extremely low on most corporate radars. Scant attention was paid to these risks. Today, a risk plan or disaster recovery plan that doesn't consider this possibility is considered sorely remiss.
Most of the shifts over time are organic in nature. As with 9/11, some risks become obviously significant. Others make themselves evident through sheer repetition. Management (or staff) sees the same incidents repeatedly and thus can recognize a trend. Such evolution can happen only if the history is visible and reviewed on a regular basis. Whether that basis is weekly, monthly, quarterly, or even semiannually, someone must have the responsibility for ensuring that current tolerances remain valid and appropriate and that the general body of risks hasn't shifted. That "someone" can be almost anyone at the management level, but he or she must be aware of corporate acceptance (or lack thereof) and tolerance for risk. And when he or she sees those levels shifting, the red flag must be raised.
CONCLUSION
Red flags can be raised only when there is a consistent understanding of how and why they've been raised. When organizations put triggers, thresholds, and controls in place and apply them consistently, they increase the probability that red flags won't need to be raised at all.
In the fall of 2004, the Washington, DC, Metrorail suffered its worst accident in decades when two train cars on the Red Line collided. As one train drifted backward on the tracks, another moved forward. Following the accident, the National Transportation Safety Board (NTSB) determined that the collision resulted from the assumption by the train operator in the drifting train that safety systems would stop the trains from rolling back too far before anyone was at risk [2]. The operator believed he understood the system. But failsafe mechanisms weren't there. And the operator did not recognize when the problem had gotten out of hand.
The NTSB urgently recommended increased driver training to ensure that operators understand when the systems take care of risks by default and when the operators take over responsibility for safety. In this small way, the event prompted the Washington Metropolitan Area Transit Authority to create a risk culture. It is creating a language, standard practices, and a consistent understanding of what risks will and will not be tolerated.
To inculcate a practice is to make it second nature. If 20 years ago you had predicted that businesspeople would have to take computers and other work-related devices virtually everywhere, they would have scoffed. If you had said the professional norms would be to log in to the home office daily and respond to queries in less than 48 hours, they might have thought you had misspoken. But today, an interconnected, fast-paced economy is part and parcel of getting business done.
So it can (and should) be with risk management. Making risk management a part of the culture, however, means making it part of the routine. When Morgan Stanley's Rescorla made the staff march down the stairs for fire drills, at first it was a nuisance, but as time wore on, it became part of the rigors of the office. Ultimately, the behavior was ingrained. So it should be with identifying risk, understanding tolerance levels, and documenting responses and behaviors.
The future bodes well for organizations that understand and can work with their risks consistently. No organization can create a risk-free environment. But over time, the capacity to make risk a key component of the day-to-day practices, procedures, and conversations of the company opens the door for proactive, effective understanding and management of the vagaries that all organizations must face.
NOTES
1. For more information, see www.iceincusa.com/products_tools.htm.
2. For more information, see www.palisade.com/html/ptree.asp.
3. For more information, see www.cs-solutions.com/products/?Product=Risk%20Plus.
REFERENCES
1. Carr, Michael et al. Taxonomy-Based Risk Identification. Software Engineering Institute, CMU/SEI-93-TR-006, 1993.
2. Conners, Ellen Engleman. "Urgent Safety Recommendation R-04-09." National Transportation Safety Board, 22 November 2004.
3. Drucker, Peter F. Management: Tasks, Responsibilities, Practices. Harper Perennial, 1973.
4. Ezard, John. "Rumsfeld's Unknown Unknowns Take Prize." Guardian, 2 December 2003.
5. GAO. Chemical Risk Assessment: Selected Federal Agencies' Procedures, Assumptions, and Policies, GAO-01-810. GAO, August 2001.
6. Grunwald, Michael. "A Tower of Courage." Washington Post Magazine, 28 October 2001.
7. Kutnick, Dale. "The Externalization Imperative." CIO, 1 January 1999.
8. Maister, David H. True Professionalism: The Courage to Care About Your People, Your Clients, and Your Career. Free Press, 1997.
9. Room, Adrian. Brewer's Dictionary of Phrase & Fable. HarperResource, 2000.
10. Sullivan, Gordon, and Michael Harper. Hope Is Not a Method. Broadway, 1997.
11. Unger, Harlow Giles. Noah Webster: The Life and Times of an American Patriot. John Wiley & Sons, 1998.
12. Vaughan, Diane. The Challenger Launch Decision: Risk, Technology, Culture, and Deviance at NASA. University of Chicago Press, 1996.
13. Zachary, G. Pascal. Showstopper!: The Breakneck Race to Create Windows NT and the Next Generation at Microsoft. Free Press, 1994.