Comparative Risk Assessment Form
FAA System Safety Handbook, Appendix B: Comparative Risk Assessment (CRA) Form
December 30, 2000
Appendix B
Comparative Risk Assessment Form
SEC TRACKING No: The number assigned to the CRA by the FAA System Engineering Council (SEC)
CRA Title: Title as assigned by the FAA SEC
SYSTEM: The system affected by the change, e.g., National Airspace System
Initial Date: Date initiated
SEC Date: Date first reviewed by the SEC
REFERENCES: A short list of references. A long list can be continued on a separate page.
SSE INFORMATION
SSE Name/Title: Name and title of person who performed or led team
Location: Address and office symbol of SSE
Telephone No.:
SUMMARY OF HAZARD CLASSIFICATION:
(worst credible case; see List of Hazards below for individual risk assessments)
Option A (Baseline): Place the highest risk assessment code for the baseline here.
Proposed Change Option(s) B-X: Place the highest risk assessment code for the alternatives here.
DESCRIPTION OF (Option A) BASELINE AND PROPOSED CHANGE(s)
Option A: Describe the system under study here in terms of the 5M Model discussed in Chapter 2.
Describe the baseline (or no change) system and each alternative. This section can be continued in an
appendix if it does not fit into this area. Avoid excessive detail, but include enough for the
decision-maker to understand the risk associated with each alternative.
SEVERITY:
1 CATASTROPHIC - Death, system or aircraft loss, permanent total disability
2 HAZARDOUS - Severe injury or major aircraft or system damage
3 MAJOR - Minor injury or minor aircraft or system damage
4 MINOR - Less than minor injury or aircraft or system damage
5 NO SAFETY EFFECT
PROBABILITY:
A PROBABLE - Likely to occur in lifetime of each system (> 1E-5)
B REMOTE - Possible for each item, several for system (< 1E-5)
C EXTREMELY REMOTE - Unlikely for item, may occur few in system (< 1E-7)
D EXTREMELY IMPROBABLE - So unlikely, not expected in system (< 1E-9)
SEVERITY / PROBABILITY matrix (rows: SEVERITY; columns: PROBABILITY):
     A    B    C    D
1
2
3
4
5    No risk
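As a rough illustration of how the two scales combine into a risk assessment code (RAC) such as 1D, the following Python sketch maps a numeric probability onto the form's letters and joins the letter to a severity level. The function names are hypothetical, and the handling of a value that falls exactly on a threshold is an assumption of the sketch; the form gives only the inequalities listed above.

```python
# Illustrative sketch only: names and boundary handling are assumptions,
# not part of the CRA form.

SEVERITY = {
    1: "CATASTROPHIC",
    2: "HAZARDOUS",
    3: "MAJOR",
    4: "MINOR",
    5: "NO SAFETY EFFECT",
}


def probability_letter(p: float) -> str:
    """Map a probability onto the form's A-D scale.

    Thresholds (1E-5, 1E-7, 1E-9) come from the form. Assigning a value
    that sits exactly on a threshold to the more probable category is an
    assumption made here.
    """
    if p > 1e-5:
        return "A"  # PROBABLE
    if p >= 1e-7:
        return "B"  # REMOTE
    if p >= 1e-9:
        return "C"  # EXTREMELY REMOTE
    return "D"      # EXTREMELY IMPROBABLE


def rac(severity: int, p: float) -> str:
    """Build a RAC string such as '1D' from a severity level and a probability."""
    if severity not in SEVERITY:
        raise ValueError(f"severity must be one of {sorted(SEVERITY)}")
    return f"{severity}{probability_letter(p)}"


print(rac(1, 1e-10))  # catastrophic severity, extremely improbable
```

A qualitative assessment would skip the numeric step and pick the letter directly from the word descriptions.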
HAZARD LIST
List the hazard conditions here. Enter the risk assessment code (RAC) for each hazard-alternative pair to the right.
No. | Hazard Condition | Baseline (Option A) | Option B | Option C | Option D | Option E
1 | Loss of communication between air traffic controllers and aircraft (flight essential) | 1D | 1D | 1C | 1C | 1B
2 | Loss of communication between air traffic controllers in different domains (ARTCC to ARTCC, ARTCC to TRACON, etc.) | 1D | | | |
3 | Loss of communication between air traffic controllers and flight service (flight plans, etc.) | | | | |
4 | Loss of communication between air traffic & ground controllers and vehicles in the airport movement area | | | | |
5 | Loss of the means for operator and flight service to communicate information relative to planned flight | | | | |
6 | Loss of the capability to detect, classify, locate, and communicate adverse weather such as: thunderstorms, rain and snow showers, lightning, windshear, tornadoes, icing, low visibility or ceilings, turbulence, hail, fog, etc. | | | | |
7 | Loss of navigation functions providing aircrew with independently determined 3D present position of the aircraft, defined routes, destination(s), and navigation solution (course, distance) to destination. | | | | |
8 | Loss of Air traffic control determination of 3D location, velocity vector, and identity of each aircraft operating in a domain. | | | | |
9 | Loss of Air traffic control determination of location, identity, and velocity vector of each participating vehicle operating in the airport movement area domain. | | | | |
10 | Loss of approach guidance to runway. Precision - horizontal and vertical guidance; Nonprecision - horizontal guidance, vertical procedures. | | | | |
11 | Loss of ground vehicle or aircraft operator independent determination of present position, destination(s), and navigation solution on the airport movement area. | | | | |
12 | Hazardous runway surface precludes safe takeoff or touchdown and rollout. | | | | |
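The summary block near the top of the form asks for the highest RAC per option. A minimal Python sketch of that roll-up, using only the codes filled in for hazards 1 and 2 in the example list, might look as follows. The ranking rule is an assumption of the sketch (a lower severity number is treated as worse and, for equal severity, an earlier probability letter is treated as worse); in practice the form's risk matrix would determine the ordering. The function and variable names are hypothetical, and the single code shown for hazard 2 is assumed to belong to the baseline column.

```python
# Illustrative sketch only. The ranking rule is an assumption:
# lower severity number = worse; for equal severity, an earlier
# probability letter (A before D) = worse.

# RACs entered in the example hazard list (blank cells omitted).
hazard_racs = {
    1: {"A": "1D", "B": "1D", "C": "1C", "D": "1C", "E": "1B"},
    2: {"A": "1D"},
}


def rac_rank(rac: str) -> tuple:
    """Sort key: smaller tuples mean higher risk under the assumed rule."""
    return (int(rac[0]), rac[1])


def worst_rac_per_option(racs_by_hazard: dict) -> dict:
    """Return the highest-risk RAC found for each option."""
    worst = {}
    for racs in racs_by_hazard.values():
        for option, rac in racs.items():
            if option not in worst or rac_rank(rac) < rac_rank(worst[option]):
                worst[option] = rac
    return worst


print(worst_rac_per_option(hazard_racs))
# {'A': '1D', 'B': '1D', 'C': '1C', 'D': '1C', 'E': '1B'}
```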
SAFETY ASSESSMENT SUMMARY
(Conclusions/Recommendations)
Summarize your conclusions. State which option is best (and 2nd, 3rd, etc.) and why. Include enough detail to
communicate appropriately with the audience.
Recommendations: Provide additional controls to further mitigate or eliminate the risks. Follow the safety order
of precedence, i.e., (1) eliminate/mitigate by design, (2) incorporate safety features, (3) provide warnings, and
(4) procedures/training. (See Chapter 4 for further elaboration of the Safety Order of Precedence.) Define SSE
requirements for reducing the risk of the design/option(s).
HAZARD CLASSIFICATION RATIONALE
(Complete one of these sheets for each hazard.)
1 Hazard: Loss of communication between air traffic controllers and aircraft
Summarization
Summarize the risk assessments for hazard No. 1 for each alternative that was examined.
Baseline (Option A)
Severity: 1-Catastrophic Probability: D-Extremely Improbable Assessment: Medium Risk
Option B
Severity: NA Probability: NA Assessment: NA
Severity
Rationale for Severity:
In this section explain how you came up with the hazard severity. This is where you will convince the
skeptics that you were logical and objective.
The hazard is a component of the hazardous conditions required for NMAC, CFIT, WXHZ, NLA, and
RIAs. For the baseline NAS system, the severity of the "loss of communication" hazard is highly dependent
upon the environmental conditions surrounding the event, and communication is therefore categorized as a
flight essential function of the NAS. In a "day, VFR, low density" environment the severity is very low,
resulting in minor effects. In a night/IFR high-density environment the occurrence of this hazard has a good
chance of becoming catastrophic. The reason is that the purpose of this communication system is to provide
aircraft in a region of airspace with direction, clearance, and other services provided by Air Traffic Control
(ATC). In an environment of low outside visibility and many aircraft, this function becomes critically
important to air vehicle separation. The following points highlight the severity:
- Air Traffic Controllers (ATCs) are able to observe wide volumes of space using airspace surveillance
systems. These systems enable the ATCs to observe the location, velocity, and sometimes the identity of the
aircraft detected by their systems. The ATCs are trained to direct the flow of traffic safely to prevent midair
collisions and to provide flight following, approach clearances, and emergency assistance.
- Loss of the entire communication system would result in the rapid onset of chaos as approaching aircraft
attempt to land and enroute aircraft converge on navigation waypoints and facilities. The risk of a midair
collision is high in these conditions.
- In the event that a loss of communication occurs, complex emergency procedures are established for
IFR and VFR aircraft. The procedures are necessarily complex and, if followed, should result in a safe
landing, but once initiated they can be difficult to follow, especially for a single pilot in IFR.
- The AIM states: "Radio communications are a critical link in the ATC system. The link can be a strong bond
between pilot and controller or it can be broken with surprising speed and disastrous results." [1]
Probability
Rationale for Probability:
Use this section to explain how you derived the probability. This may be quantitative or qualitative. In
general, the higher risk items will require more quantitative analysis than low or medium risk hazards. The
example below is qualitative.
Many controls exist to preclude this hazard from occurring:
- Multiple radios, both in the aircraft and in the ATC facility, provide redundant communication channels from
aircraft to ATC.
- In the event of failure, multiple facilities can be used, including FSS, other ARTCC, TRACON, or ATCC,
and even airborne telephones.
- Planning systems assist in keeping aircraft at different altitudes or routes. Emergency procedures exist to
ensure an aircraft in "lost communication" will not converge on another aircraft's flight path.
[1] Federal Aviation Administration. (1995). Airman's Information Manual. Para. 4-2-1.
Severity Definitions
Catastrophic: Results in multiple fatalities and/or loss of the system.
Hazardous: Reduces the capability of the system or the operator's ability to cope with adverse conditions to the extent that there would be:
- Large reduction in safety margin or functional capability
- Crew physical distress/excessive workload such that operators cannot be relied upon to perform required tasks accurately or completely
- Serious or fatal injury to small number of occupants of aircraft (except operators)
- Fatal injury to ground personnel and/or general public
Major: Reduces the capability of the system or the operators to cope with adverse operating conditions to the extent that there would be:
- Significant reduction in safety margin or functional capability
- Significant increase in operator workload
- Conditions impairing operator efficiency or creating significant discomfort
- Physical distress to occupants of aircraft (except operator), including injuries
- Major occupant illness and/or major environmental damage, and/or major property damage
Minor: Does not significantly reduce system safety. Actions required by operators are well within their capabilities. Includes:
- Slight reduction in safety margin or functional capabilities
- Slight increase in workload such as routine flight plan changes
- Some physical discomfort to occupants of aircraft (except operators)
No Safety Effect: Has no effect on safety.
FAA System Safety Handbook, Appendix C: Related Readings in Aviation System Safety
December 30, 2000
Appendix C
REFERENCES
GOVERNMENT REFERENCES
FAA Order 1810, Acquisition Policy
FAA Order 8040.4, FAA Safety Risk Management
FAA Advisory Circular 25.1309 (Draft), System Design and Analysis, January 28, 1998
RTCA-DO 178B, Software Considerations In Airborne Systems And Equipment Certification, December 1, 1992
COMDTINST M411502D, System Acquisition Manual, December 27, 1994
DODD 5000.1, Defense Acquisition, March 15, 1996
DOD 5000.2R, Mandatory Procedures for Major Defense Acquisition Programs and Major Automated Information Systems, March 15, 1996
DOD-STD 2167A, Military Standard Defense System Software Development, February 29, 1988
MIL-STD 882D, System Safety Program Requirements, February 10, 2000
MIL-STD 498, Software Development and Documentation, December 5, 1994
MIL-HDBK-217A, "Reliability Prediction of Electronic Equipment," 1982.
MIL-STD-1629A, "Procedures for Performing a Failure Mode, Effects and Criticality Analysis," November 1980.
MIL-STD-1472D, "Human Engineering Design Criteria for Military Systems, Equipment and Facilities," 14 March 1989.
NSS 1740.13, Interim Software Safety Standard, June 1994
29 CFR 1910.119, Process Safety Management, U.S. Government Printing Office, July 1992.
Department of the Air Force, Software Technology Support Center, Guidelines for Successful Acquisition and Management of Software-Intensive Systems: Weapon Systems, Command and Control Systems, Management Information Systems, Version-2, June 1996, Volumes 1 and 2
AFISC SSH 1-1, Software System Safety Handbook, September 5, 1985
Department of Defense, AF Inspections and Safety Center (now the AF Safety Agency), AFISC SSH 1-1, "Software System Safety," September 1985.
Department of Labor, 29 CFR 1910, "OSHA Regulations for General Industry," July 1992.
Department of Labor, 29 CFR 1910.119, "Process Safety Management of Highly Hazardous Chemicals," Federal Register, 24 February 1992.
Department of Labor, 29 CFR 1926, "OSHA Regulations for Construction Industry," July 1992.
Department of Labor, OSHA 3133, "Process Safety Management Guidelines for Compliance," 1992.
Department of Labor, OSHA Instructions CPL 2-2.45A, Compliance Guidelines and Enforcement Procedures, September 1992.
Department of Transportation, DOT P 5800.5, "Emergency Response Guidebook," 1990.
Environmental Protection Agency, 1989d, Exposure Factors Handbook, EPA/600/8-89/043, Office of Health and Environmental Assessment, Washington, DC 1989.
Environmental Protection Agency, 1990a, Guidance for Data Usability in Risk Assessment, EPA/540/G-90/008, Office of Emergency and Remedial Response, Washington, DC 1990.
COMMERCIAL REFERENCES
ACGIH, "Guide for Control of Laser Hazards," American Conference of Government Industrial Hygienists, 1990.
American Society for Testing and Materials (ASTM), 1916 Race Street, Philadelphia, PA 19103
ASTM STP762, "Fire Risk Assessment," American Society for Testing Materials, 1980.
EIA-6B, G-48, Electronic Industries Association, System Safety Engineering In Software Development, 1990
IEC 61508: International Electrotechnical Commission, Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems, December 1997
IEC 1508 (Draft), International Electrotechnical Commission, Functional Safety; Safety-Related System, June 1995
IEEE STD 1228, Institute of Electrical and Electronics Engineers, Inc., Standard For Software Safety Plans, 1994
IEEE STD 829, Institute of Electrical and Electronics Engineers, Inc., Standard for Software Test Documentation, 1983
IEEE STD 830, Institute of Electrical and Electronics Engineers, Inc., Guide to Software Requirements Specification, 1984
IEEE STD 1012, Institute of Electrical and Electronics Engineers, Inc., Standard for Software Verification and Validation Plans, 1987
ISO 12207-1, International Standards Organization, Information Technology-Software, 1994
Joint Software System Safety Committee, "Software System Safety Handbook", December 1999
NASA NSTS 22254, "Methodology for Conduct of NSTS Hazard Analyses," May 1987.
National Fire Protection Association, "Flammable and Combustible Liquids Code."
National Fire Protection Association, "Hazardous Chemical Handbook."
National Fire Protection Association, "Properties of Flammable Liquids, Gases and Solids."
National Fire Protection Association, "Fire Protection Handbook."
Nuclear Regulatory Commission NRC, "Safety/Risk Analysis Methodology", April 12, 1993.
Joint Services Computer Resources Management Group, "Software System Safety Handbook: A Technical and Managerial Team Approach", Published on Compact Disc, December 1999.
Society of Automotive Engineers, Aerospace Recommended Practice 4754: "Certification Considerations for Highly Integrated or Complex Aircraft Systems", November 1996.
Society of Automotive Engineers, Aerospace Recommended Practice 4761: "Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
System Safety Society: System Safety Analysis Handbook, July 1997.
INDIVIDUAL REFERENCES
Ang, A.H.S., and Tang, W.H., "Probability Concept in Engineering Planning and Design", Vol. II, John Wiley and Sons, 1984.
Anderson, D. R., Dennis J. Sweeney, Thomas A. Williams, "An Introduction to Management Science Quantitative Approaches to Decision Making." West Publishing Co., 1976.
Bahr, N. J., "System Safety Engineering and Risk Assessment: A Practical Approach", Taylor and Francis, 1997.
Benner, L., "Guide 7: A Guide for Using Energy Trace and Barrier Analysis with the STEP Investigation System", Events Analysis, Inc., Oakton, Va., 1985.
Briscoe, G.J., "Risk Management Guide", EG&G Idaho, Inc. SSDC-11, June 1997.
Brown, M., L., "Software Systems Safety and Human Error", Proceedings: COMPASS 1988
Brown, M., L., "What is Software Safety and Who's Fault Is It Anyway?" Proceedings: COMPASS 1987
Brown, M., L., "Applications of Commercially Developed Software in Safety Critical Systems", Proceedings of Parari '99, November 1999
Bozarth, J. D., Software Safety Requirement Derivation and Verification, Hazard Prevention, Q1, 1998
Card, D.N. and Schultz, D.J., "Implementing a Software Safety Program", Proceedings: COMPASS 1987
Clark, R., Benner, L. and White, L. M., "Risk Assessment Techniques Manual," Transportation Safety Institute, March 1987, Oklahoma City, OK.
Clemens, P.L., "A Compendium of Hazard Identification and Evaluation Techniques for System Safety Application," Hazard Prevention, March/April, 1982.
Cooper, J.A., "Fuzzy-Algebra Uncertainty Analysis," Journal of Intelligent and Fuzzy Systems, Vol. 2 No. 4, 1994.
Connolly, B., "Software Safety Goal Verification Using Fault Tree Techniques: A Critically Ill Patient Monitor Example", Proceedings: COMPASS 1989
De Santo, B., "A Methodology for Analyzing Avionics Software Safety", Proceedings: COMPASS 1988
Dunn, R., Ullman, R., "Quality Assurance For Computer Software", McGraw Hill, 1982
Forrest, M., and McGoldrick, Brendan, "Realistic Attributes of Various Software Safety Methodologies", Proceedings: 9th International System Safety Society, 1989
Hammer, W., R., "Identifying Hazards in Weapon Systems - The Checklist Approach", Proceedings: Parari '97, Canberra, Australia
Hammer, Willie, "Occupational Safety Management and Engineering", 2nd Ed., Prentice-Hall, Inc, Englewood Cliffs, NJ, 1981.
Heinrich, H.W., Petersen, D., Roos, N., "Industrial Accident Prevention: A Safety Management Approach", McGraw-Hill, 5th Ed., 1980.
Johnson, W.G., "MORT - The Management Oversight and Risk Tree," SAN 821-2, U.S. Atomic Energy Commission, 12 February 1973.
Kije, L.T., "Residual Risk," Rusee Press, 1963.
Kjos, K., "Development of an Expert System for System Safety Analysis", Proceedings: 8th International System Safety Conference, Volume II.
Klir, G.J., Yuan, B., "Fuzzy Sets and Fuzzy Logic: Theory and Applications", Prentice Hall P T R, 1995.
Kroemer, K.H.E., Kroemer, H.J., Kroemer-Elbert, K.E., "Engineering Physiology: Bases of Human Factors/Ergonomics", 2nd Ed., Van Nostrand Reinhold, 1990.
Lawrence, J.D., "Design Factors for Safety-Critical Software", NUREG/CR-6294, Lawrence Livermore National Laboratory, November 1994
Lawrence, J.D., "Survey of Industry Methods for Producing Highly Reliable Software", NUREG/CR-6278, Lawrence Livermore National Laboratory, November 1994.
Leveson, N., G., "SAFEWARE; System Safety and Computers, A Guide to Preventing Accidents and Losses Caused By Technology", Addison Wesley, 1995
Leveson, N., G., "Software Safety: Why, What, and How", Computing Surveys, Vol. 18, No. 2, June 1986.
Littlewood, B. and Strigini, L., "The Risks of Software", Scientific American, November 1992.
Mattern, S.F. Capt., "Defining Software Requirements for Safety-Critical Functions", Proceedings: 12th International System Safety Conference, 1994.
Mills, H., D., "Engineering Discipline for Software Procurement", Proceedings: COMPASS 1987.
Moriarty, Brian and Roland, Harold, E., "System Safety Engineering and Management", Second Edition, John Wiley & Sons, 1990.
Ozkaya, N., Nordin, M., "Fundamentals of Biomechanics: Equilibrium, Motion, and Deformation", Van Nostrand Reinhold, 1991.
Raheja, Dev, G., "Assurance Technologies: Principles and Practices", McGraw-Hill, Inc., 1991.
Rodger, W.P., "Introduction to System Safety Engineering", John Wiley and Sons.
Russo, Leonard, "Identification, Integration, and Tracking of Software System Safety Requirements", Proceedings: 12th International System Safety Conference, 1994.
Saaty, T.L., "The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation", 2nd Ed., RWS Publications, 1996.
Stephenson, Joe, "System Safety 2000: A Practical Guide for Planning, Managing, and Conducting System Safety Programs", Van Nostrand Reinhold, 1991.
Tarrants, William, E., "The Measurement of Safety Performance", Garland STPM Press, 1980.
OTHER REFERENCES
DEF(AUST) 5679, Army Standardization (ASA), "The Procurement Of Computer-Based Safety Critical Systems", May 1999
UK Ministry of Defense. Interim DEF STAN 00-54: "Requirements for Safety Related Electronic Hardware in Defense Equipment", April 1999.
UK Ministry of Defense. Defense Standard 00-55: "Requirements for Safety Related Software in Defense Equipment", Issue 2, 1997
UK Ministry of Defense. Defense Standard 00-56: "Safety Management Requirements for Defense Systems", Issue 2, 1996
International Electrotechnical Commission, IEC 61508, "Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems", draft 61508-2 Ed 1.0, 1998
FAA System Safety Handbook, Appendix D
December 30, 2000
Appendix D
Structured Analysis and Formal Methods
D.1 Structured Analysis and Formal Methods
Structured Analysis became popular in the 1980s and is still used by many. The analysis consists of
interpreting the system concept (or real world) in data and control terminology, that is, as data flow
diagrams (DFDs). The flow of data and control from bubble to data store to bubble can be very hard to track, and
the number of bubbles can become extremely large. One approach is to first define the events from the
outside world that require the system to react and then assign a bubble to each event. Bubbles that need to
interact are then connected until the system is defined. Because this can be rather overwhelming, the
bubbles are usually grouped into higher level bubbles. Data Dictionaries are needed to describe the data
and command flows, and a process specification is needed to capture the transaction/transformation
information. The problems have been:
1) choosing bubbles appropriately;
2) partitioning those bubbles in a meaningful and mutually agreed upon manner;
3) the size of the documentation needed to understand the data flows;
4) the analysis is still strongly functional in nature and thus subject to frequent change;
5) though "data" flow is emphasized, "data" modeling is not, so there is little understanding of just what the
subject matter of the system is; and
6) not only is it hard for the customer to follow how the concept is mapped into these data flows and
bubbles, it has also been very hard for the designers, who must shift the DFD organization into an
implementable format.
Information Modeling, using entity-relationship diagrams, is really a forerunner of Object-Oriented
Analysis (OOA). The analysis first finds objects in the problem space, describes them with attributes, adds
relationships, refines them into super- and sub-types, and then defines associative objects. Some
normalization then generally occurs. Information modeling is thought to fall short of true OOA in that,
according to Peter Coad and Edward Yourdon:
1) Services, or processing requirements, for each object are not addressed,
2) Inheritance is not specifically identified,
3) Poor interface structures (messaging) exist between objects, and
4) Classification and assembly of the structures are not used as the predominant
method for determining the system's objects.
This handbook presents in detail the two most promising new methods of structured analysis and design:
Object-Oriented Analysis and Design (OOA/OOD) and Formal Methods (FM). OOA/OOD and FM can
incorporate the best from each of the above methods and can be used effectively in conjunction with each
other. Lutz and Ampo described their successful experience of using OOD combined with Formal Methods
as follows: "For the target
applications, object-oriented modeling offered several advantages as an initial step in developing formal
specifications. This reduced the effort in producing an initial formal specification. We also found that
the object-oriented models did not always represent the 'why' of the requirements, i.e., the underlying
intent or strategy of the software. In contrast, the formal specification often clearly revealed the intent of
the requirements."
D.2 Object Oriented Analysis and Design
Object Oriented Design (OOD) is gaining increasing acceptance worldwide. OOA and OOD methods fall short
of full Formal Methods because they generally do not include logic engines or theorem provers. However,
they are more widely used than Formal Methods, and a large infrastructure of tools and expertise is readily
available to support practical OOD usage.
OOA/OOD is the new paradigm and is viewed by many as the best solution to most problems. Two of
the advantages of modeling the real world as objects are that 1) it is thought to follow a more natural
human thinking process and 2) objects, if properly chosen, are the most stable perspective of the real
world problem space and can be more resilient to change, because the functions/services and data and
commands/messages are isolated and hidden from the overall system. For example, while over the
course of the development life cycle the number, as well as the types, of functions (e.g., turn camera 1 on,
download sensor data, ignite starter, fire engine 3, etc.) may change, the basic objects (e.g., cameras,
sensors, starter, engines, operator, etc.) needed to create a system usually remain constant. That is, while
there may now be three cameras instead of two, the new Camera-3 is just an instance of the basic object
'camera'. Or, while an infrared camera may now be the type needed, there is still a 'camera'; the
differences in power, warm-up time, and data storage are kept isolated (hidden) so that they do not
affect the rest of the system.
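The camera example can be sketched in a few lines of Python. This is only an illustration of the idea, not a prescribed design: the class names, the capture() method, and the warm-up value are all hypothetical. The point is that the rest of the system calls the same operation on every camera, and both a third camera and an infrared variant fit in without changing the callers.

```python
# Illustrative sketch only: class names, method names, and values are hypothetical.

class Camera:
    """Basic 'camera' object; the rest of the system uses only capture()."""

    def __init__(self, name: str, warm_up_s: float = 0.0):
        self.name = name
        self._warm_up_s = warm_up_s  # hidden detail, not used by callers
        self._ready = False

    def _prepare(self) -> None:
        # Internal step that the rest of the system never calls directly.
        self._ready = True

    def capture(self) -> str:
        if not self._ready:
            self._prepare()
        return f"{self.name}: frame captured"


class InfraredCamera(Camera):
    """A different type of camera; its differences stay inside the class."""

    def __init__(self, name: str):
        super().__init__(name, warm_up_s=30.0)  # assumed warm-up time

    def capture(self) -> str:
        if not self._ready:
            self._prepare()
        return f"{self.name}: infrared frame captured"


# Adding a third camera is just another instance of the same basic object.
cameras = [Camera("Camera-1"), Camera("Camera-2"), InfraredCamera("Camera-3")]
reports = [cam.capture() for cam in cameras]
print(reports)
```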
OOA incorporates the principles of abstraction, information hiding, inheritance, and a method of
organizing the problem space by using the three most "human" means of classification. These combined
principles, if properly applied, establish a more modular, bounded, stable and understandable software
system. These aspects of OOA should make a system created under this method more robust and less
susceptible to changes, properties which help create a safer software system design.
Abstraction refers to concentrating on only certain aspects of a complex problem, system, idea or
situation in order to better comprehend that portion. The perspective of the analyst focuses on similar
characteristics of the system objects that are most important to them. Then, at a later time, the analyst can
address other objects and their desired attributes or examine the details of an object and deal with each in
more depth. Data abstraction is used by OOA to create the primary organization for thinking and
specification in that the objects are first selected from a certain perspective and then each object is defined
in detail. An object is defined by the attributes it has and the functions it performs on those attributes.
An abstraction can be viewed, as per Shaw, as "a simplified description, or specification, of a system that
emphasizes some of the system's details or properties while suppressing others. A good abstraction is
one that emphasizes details that are significant to the reader or user and suppresses details that are, at least
for the moment, immaterial or diversionary".
Information hiding also helps manage complexity in that it allows encapsulation of requirements that
might be subject to change. In addition, it helps to isolate the rest of the system from some object-specific
design decisions. Thus, the rest of the software system sees only what is absolutely necessary of the inner
workings of any object.
Inheritance "defines a relationship among classes, wherein one class shares the structure or
behavior defined in one or more classes... Inheritance thus represents a hierarchy of abstractions, in which
a subclass inherits from one or more superclasses. Typically, a subclass
augments or redefines the existing structure and behavior of its superclasses".
Classification theory states that humans normally organize their thinking by:
1) looking at an object and comparing its attributes to those experienced before (e.g., looking at a cat,
humans tend to think of its size, color, temperament, etc. in relation to past experience with cats);
2) distinguishing between an entire object and its component parts (e.g., a rose bush versus its roots,
flowers, leaves, thorns, stems, etc.); and
3) classifying objects into distinct and separate groups (e.g., trees, grass, cows, cats, politicians).
In OOA, the first organization is to take the problem space and render it into objects and their attributes
(abstraction). The second step of organization is into Assembly Structures, where an object and its parts
are considered. The third form of organization of the problem space is into Classification Structures,
during which the problem space is examined for generalized and specialized instances of objects
(inheritance). For example, in a railway system the objects could be engines (provide power to pull
cars), cars (provide storage for cargo), tracks (provide a pathway for trains to follow/ride on), switches
(provide direction changing), stations (places to exchange cargo), etc. One would then look at the
Assembly Structure of cars and determine what is important about their component parts: their wheels, floor
construction, coupling mechanism, siding, etc. Finally, the Classification Structure of cars could divide them
into cattle, passenger, grain, refrigerated, and volatile liquid cars.
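One way to picture the difference between an Assembly Structure and a Classification Structure is the short Python sketch below, which reuses the railway car example. All class names, attributes, and default values are hypothetical; composition stands in for "an object and its parts" and subclassing stands in for "generalized and specialized instances".

```python
# Illustrative sketch only: names, attributes, and defaults are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Coupling:
    """A part of a car (Assembly Structure)."""
    kind: str = "standard"


@dataclass
class Car:
    """Generalized object: provides storage for cargo."""
    wheels: int = 8  # assumed value
    coupling: Coupling = field(default_factory=Coupling)

    def cargo_type(self) -> str:
        return "general"


class CattleCar(Car):
    """Specialized car (Classification Structure)."""

    def cargo_type(self) -> str:
        return "cattle"


class RefrigeratedCar(Car):
    """Another specialization of the same generalized object."""

    def cargo_type(self) -> str:
        return "refrigerated goods"


train = [CattleCar(), RefrigeratedCar()]
print([car.cargo_type() for car in train])
```

Each specialized car inherits the parts of the general car, so a change to the coupling design is made in one place.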
The purpose of all this classification is to provide modularity, which partitions the system into
well-defined boundaries that can be individually/independently understood, designed, and revised. However,
despite "classification theory", choosing which objects represent a system is not always that
straightforward. In addition, each analyst or designer will have their own abstraction, or view of the system,
which must be resolved. OO does provide a structured approach to software system design and can be
very useful in helping to bring about a safer, more reliable system.
D.3 Formal Methods - Specification Development
"Formal Methods (FM) consists of a set of techniques and tools based on mathematical modeling and
formal logic that are used to specify and verify requirements and designs for computer systems and
software."
While Formal Methods (FM) are not widely used in US industry, FM has gained some acceptance in
Europe. Newcomers face a considerable learning curve, which can be expensive to surmount.
Once this hurdle is cleared, some users find that FM can reduce overall development life-cycle cost by
eliminating many costly defects prior to coding.
WHY ARE FORMAL METHODS NECESSARY?
A digital system may fail as a result of either physical component failure, or design errors. The validation
of an ultra-reliable system must deal with both of these potential sources of error.
Well known techniques exist for handling physical component failure; these techniques use redundancy
and voting. The reliability assessment problem in the presence of physical faults is based upon Markov
modeling techniques and is well understood.
The design error problem is a much greater threat. Unfortunately, no scientifically justifiable defense
against this threat is currently used in practice. Three basic strategies are advocated for dealing
with design errors:
1. Testing (Lots of it)
2. Design Diversity (i.e. software fault-tolerance: N-version programming, recovery blocks, etc.)
3. Fault/Failure Avoidance (i.e. formal specification/verification, automatic program synthesis,
reusable modules)
The problem with life testing is that in order to measure ultra-reliability one must test for exorbitant
amounts of time. For example, to measure a 10^-9 probability of failure for a 1-hour mission one must test
for more than 114,000 years.
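The arithmetic behind this figure can be checked with a short sketch. It assumes, as a simplification not stated in the text, that demonstrating a failure probability p per hour requires on the order of 1/p hours of failure-free testing; the function name is illustrative only.

```python
HOURS_PER_YEAR = 24 * 365.25  # average calendar year, leap days included


def life_test_years(failure_probability_per_hour: float) -> float:
    """Years of continuous testing needed to accumulate 1/p test hours.

    Assumption: roughly 1/p hours must be observed to measure a per-hour
    failure probability p. This is a rule of thumb, not a statistical bound.
    """
    if not 0.0 < failure_probability_per_hour < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    test_hours = 1.0 / failure_probability_per_hour
    return test_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"{life_test_years(1e-9):,.0f} years of testing")
```

Under these assumptions a single test article would need a little over 114,000 years, consistent with the figure quoted above.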
Many advocate design diversity as a means to overcome the limitations of testing. The basic idea is to use
separate design/implementation teams to produce multiple versions from the same specification. Then,
non-exact threshold voters are used to mask the effect of a design error in one of the versions. The hope is
that the design flaws will manifest errors independently or nearly so.
By assuming independence one can obtain ultra-reliable-level estimates of reliability even though the
individual versions have failure rates on the order of 10^-4. Unfortunately, the independence assumption
has been rejected at the 99% confidence level in several experiments for low reliability software.
Furthermore, the independence assumption cannot ever be validated for high reliability software because
of the exorbitant test times required. If one cannot assume independence then one must measure
correlations. This is infeasible as well: it requires as much testing time as life-testing the system because
the correlations must be in the ultra-reliable region in order for the system to be ultra-reliable. Therefore,
it is not possible, within feasible amounts of testing time, to establish that design diversity achieves ultra-reliability.
Consequently, design diversity can create an illusion of ultra-reliability without actually providing it.
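The effect of the independence assumption can be illustrated numerically. The sketch below is not from the handbook: it assumes a majority voter over n versions, and it uses a simple common-cause model (a fraction `common` of each version's failure probability is a shared failure that hits all versions at once) purely to show how sensitive the estimate is to correlation. All names and the 1% figure are hypothetical.

```python
from math import comb


def prob_at_least_k_fail(n: int, k: int, p: float) -> float:
    """Probability that k or more of n versions fail, ASSUMING independence."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_with_common_mode(n: int, k: int, p: float, common: float) -> float:
    """Same quantity under a hypothetical common-cause model.

    With probability c = common * p every version fails together (a shared
    design flaw); otherwise the versions fail independently with the
    remaining probability, so each version still fails with probability p.
    """
    c = common * p
    p_independent = (p - c) / (1 - c)
    return c + (1 - c) * prob_at_least_k_fail(n, k, p_independent)


if __name__ == "__main__":
    p = 1e-4  # per-version failure probability, the order quoted in the text
    print("independent 2-of-3 failure:", prob_at_least_k_fail(3, 2, p))
    print("with 1% common-cause share:", prob_with_common_mode(3, 2, p, 0.01))
```

In this sketch, even a 1% shared-failure share raises the estimated system failure probability by more than an order of magnitude. The independence assumption therefore has to be validated, not merely asserted.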
It is felt that formal methods currently offer the only intellectually defensible method for handling the
design fault problem. Because the often-quoted 1 - 10^-9 reliability is well beyond the range of
quantification, there is no choice but to develop life-critical systems in the most rigorous manner available
to us, which is the use of formal methods.
WHAT ARE FORMAL METHODS?
Traditional engineering disciplines rely heavily on mathematical models and calculation to make
judgments about designs. For example, aeronautical engineers make extensive use of computational fluid
dynamics (CFD) to calculate and predict how particular airframe designs will behave in flight. We use the
term formal methods to refer to the variety of mathematical modeling techniques that are applicable to
computer system (software and hardware) design. That is, formal methods is the applied mathematics
of computer system engineering and, when properly applied, can serve a role in computer system design
analogous to the role CFD serves in aeronautical design.
Formal methods may be used to specify and model the behavior of a system and to mathematically verify
that the system design and implementation satisfy system functional and safety properties. These
specifications, models, and verifications may be done using a variety of techniques and with various
degrees of rigor. The following is an imperfect, but useful, taxonomy of the degrees of rigor in formal
methods:
Level-1: Formal specification of all or part of the system.
Level-2: Formal specification at two or more levels of abstraction and paper and pencil proofs that
the detailed specification implies the more abstract specification.
Level-3: Formal proofs checked by a mechanical theorem prover.
Level 1 represents the use of mathematical logic or a specification language that has a formal semantics to
specify the system. This can be done at several levels of abstraction. For example, one level might
enumerate the required abstract properties of the system, while another level describes an implementation
that is algorithmic in style.
Level 2 formal methods goes beyond Level 1 by developing pencil-and-paper proofs that the more
concrete levels logically imply the more abstract-property oriented levels. This is usually done in the
manner illustrated below.
Level 3 is the most rigorous application of formal methods. Here one uses a semi-automatic theorem
prover to make sure that all of the proofs are valid. The Level 3 process of convincing a mechanical
prover is really a process of developing an argument for an ultimate skeptic who must be shown every
detail.
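The three levels can be made concrete with a deliberately tiny sketch in Lean 4, a mechanical proof checker. The limiter, its names, and the property are hypothetical and chosen only for illustration; the handbook does not prescribe any particular tool. `SatisfiesLimit` plays the role of the abstract, property-oriented specification, `clamp` is the more detailed algorithmic specification, and the theorem is the machine-checked argument that the detailed level implies the abstract one.

```lean
-- Abstract specification: the output of a limiter never exceeds the limit.
def SatisfiesLimit (f : Nat → Nat → Nat) : Prop :=
  ∀ limit x, f limit x ≤ limit

-- Detailed (algorithmic) specification: a hypothetical clamp function.
def clamp (limit x : Nat) : Nat :=
  if x ≤ limit then x else limit

-- Proof, checked by the prover, that the detailed level implies the abstract one.
theorem clamp_satisfies_limit : SatisfiesLimit clamp := by
  unfold SatisfiesLimit
  intro limit x
  unfold clamp
  split
  · assumption               -- case x ≤ limit: the output is x
  · exact Nat.le_refl limit  -- otherwise the output is limit itself
```

Writing only the two definitions corresponds to Level 1, arguing the implication by hand corresponds to Level 2, and having the prover accept every step corresponds to Level 3.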
Formal methods is not an all-or-nothing approach. The application of formal methods to only the most
critical portions of a system is a pragmatic and useful strategy. Although a complete formal verification
of a large complex system is impractical at this time, a great increase in confidence in the system can be
obtained by the use of formal methods at key locations in the system.
D.3.1 Formal Inspections of Specifications
Formal inspections and formal analysis are different. Formal Inspections should be performed within
every major step of the software development process.
Formal Inspections, while valuable within each design phase or cycle, have the most impact when applied
early in the life of a project, especially in the requirements specification and definition stages.
Studies have shown that the majority of all faults/failures, including those that impinge on safety, come
from missing or misunderstood requirements. Formal Inspection greatly improves the communication
within a project and enhances understanding of the system while scrubbing out many of the major
errors/defects.
For the Formal Inspections of software requirements, the inspection team should include representatives
from Systems Engineering, Operations, Software Design and Code, Software Product Assurance, Safety,
and any other system function that software will control or monitor. It is very important that software
safety be involved in the Formal Inspections.
It is also very helpful to have inspection checklists for each phase of development that reflect both generic
and project specific criteria. The requirements discussed in this section and in Robyn R. Lutz's paper
"Targeting Safety-Related Errors During Software Requirements Analysis" will greatly aid in establishing
this checklist. Also, the checklists provided in the NASA Software Formal Inspections Guidebook are
helpful.
D.3.2 Timing, Throughput And Sizing Analysis
Timing and sizing analysis for safety critical functions evaluates software requirements that relate to
execution time and memory allocation. Timing and sizing analysis focuses on program constraints.
Typical constraint requirements are maximum execution time and maximum memory usage. The safety
organization should evaluate the adequacy and feasibility of safety critical timing and sizing
requirements. These analyses also evaluate whether adequate resources have been allocated in each case,
under worst case scenarios. For example, will I/O channels be overloaded by many error messages,
preventing safety critical features from operating?
Quantifying timing/sizing resource requirements can be very difficult. Estimates can be based on the
actual parameters of similar existing systems.
Items to consider include:
• memory usage versus availability;
• I/O channel usage (load) versus capacity and availability;
• execution times versus CPU load and availability;
• sampling rates versus rates of change of physical parameters.
In many cases it is difficult to predict the amount of computing resources required. Hence, making use
of past experience is important.
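One way to keep such estimates visible is to record the worst-case demand against capacity for each resource and flag anything that exceeds an agreed margin. The following is a minimal sketch under stated assumptions: the class, the resource names, the numbers, and the 50% margin are all hypothetical and would be replaced by project-specific values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ResourceBudget:
    """Worst-case demand on one resource (memory, I/O channel, CPU, ...)."""
    name: str
    worst_case_usage: float
    capacity: float

    @property
    def utilization(self) -> float:
        return self.worst_case_usage / self.capacity


def over_budget(budgets: Iterable[ResourceBudget],
                max_utilization: float = 0.5) -> List[str]:
    """Names of resources whose worst-case utilization exceeds the margin.

    The default margin of 0.5 is an illustrative assumption, not a
    handbook requirement.
    """
    return [b.name for b in budgets if b.utilization > max_utilization]


if __name__ == "__main__":
    estimates = [
        ResourceBudget("memory (KB)", 300, 1024),
        ResourceBudget("I/O channel (msgs/s)", 900, 1000),
        ResourceBudget("CPU (ms per 100 ms frame)", 40, 100),
    ]
    print("Needs review:", over_budget(estimates))
```

Re-running the check as estimates are refined gives a simple record of which safety critical resources still lack adequate margin.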
D.3.3 Memory usage versus availability
Assessing memory usage can be based on previous experience of software development if there is
sufficient confidence. More detailed estimates should evaluate the size of the code to be stored in the
memory, and the additional space required for storing data and scratchpad space for storing interim and
final results of computations. Memory estimates made in early program phases can be inaccurate; they
should be updated from prototype code and simulations, and only then do they become realistic.
Dynamic Memory Allocation can be viewed as either a practical run-time memory solution or as a
nightmare for assuring proper timing and usage of critical data. Any suggestion of Dynamic Memory
Allocation, common in OOD, CH environments, should be examined very carefully, even in "non-critical" functional modules.
D.3.3.1 I/O channel usage (Load) versus capacity and availability
Address I/O for science data collection, housekeeping and control. Evaluate resource conflicts between
science data collection and safety critical data availability. During failure events, I/O channels can be
overloaded by error messages, and important messages can be lost or overwritten (e.g. the British
"Piper Alpha" offshore oil platform disaster). Possible solutions include additional modules designed to
capture, correlate and manage lower-level error messages. Alternatively, errors can be passed up through the calling
routines until they reach a level that can handle the problem, so that only critical faults, or
combinations of faults that may lead to a failure, are passed on.
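The first of these solutions can be sketched as a bounded message channel that, when full, discards the least critical message and not whichever one happens to arrive last. This is an illustrative sketch only: the class name, the capacity, and the convention that severity 1 is the most critical (mirroring the handbook's severity categories) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ErrorChannel:
    """Bounded error-message buffer that protects the most critical messages."""
    capacity: int
    messages: List[Tuple[int, int, str]] = field(default_factory=list)
    dropped: int = 0
    _sequence: int = 0

    def post(self, severity: int, text: str) -> bool:
        """Queue a message (severity 1 = most critical).

        Returns False if the new message itself was discarded.
        """
        self._sequence += 1
        entry = (severity, self._sequence, text)
        if len(self.messages) < self.capacity:
            self.messages.append(entry)
            return True
        # Channel is full: exactly one message must be lost.
        self.dropped += 1
        least_critical = max(self.messages)  # highest severity number, newest
        if entry < least_critical:
            # The new message is more critical, so evict the old one.
            self.messages.remove(least_critical)
            self.messages.append(entry)
            return True
        return False

    def drain(self) -> List[str]:
        """Return queued texts, most critical first, and empty the channel."""
        ordered = [text for _, _, text in sorted(self.messages)]
        self.messages.clear()
        return ordered


if __name__ == "__main__":
    channel = ErrorChannel(capacity=2)
    channel.post(5, "sensor 7 retry")
    channel.post(5, "sensor 8 retry")
    channel.post(1, "pressure interlock failed")
    print(channel.drain(), "dropped:", channel.dropped)
```

The `dropped` counter matters as much as the queue itself: it lets the analysis confirm, under worst-case message load, that losses were confined to non-critical traffic.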
Execution times versus CPU load and availability
Investigate time variations of CPU load, determine the circumstances of peak load, and determine whether
that load is acceptable. Consider multi-tasking effects. Note that
excessive multi-tasking can result in system instability leading to "crashes".
D.3.3.2 Sampling rates versus rates of change of physical parameters
Analysis should address the validity of the system performance models used, together with simulation and
test data, if available.
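A first-order check of sampling adequacy compares the sample period with how far the physical parameter can move between samples. The criterion used below (worst-case change between samples must not exceed a stated tolerance) is an assumption made for this sketch, not a handbook requirement; function names and numbers are hypothetical.

```python
def max_sample_period(max_rate_of_change: float, tolerance: float) -> float:
    """Longest sample period (seconds) that keeps the worst-case change
    between two consecutive samples within `tolerance`.

    max_rate_of_change is in parameter units per second; tolerance is in
    parameter units.
    """
    if max_rate_of_change <= 0 or tolerance <= 0:
        raise ValueError("rate of change and tolerance must be positive")
    return tolerance / max_rate_of_change


def sampling_is_adequate(sample_rate_hz: float,
                         max_rate_of_change: float,
                         tolerance: float) -> bool:
    """True if the sample period does not exceed the longest allowed period."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return 1.0 / sample_rate_hz <= max_sample_period(max_rate_of_change, tolerance)


if __name__ == "__main__":
    # Hypothetical parameter: rises at most 50 units/s, 5 units of drift allowed.
    print("20 Hz adequate:", sampling_is_adequate(20.0, 50.0, 5.0))
    print(" 5 Hz adequate:", sampling_is_adequate(5.0, 50.0, 5.0))
```

The maximum rate of change would itself come from the system performance models, simulation, or test data mentioned above, which is why their validity needs to be examined.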
Appendix E
System Safety Principles
FAA System Safety Handbook, Appendix E: System Safety Principles
December 30, 2000
• System safety is a basic requirement of the total system.
• System safety must be planned
- Integrated and comprehensive safety engineering effort
- Interrelated, sequential, and continuing effort
- Plan must influence facilities, equipment, procedures, and
personnel
- Applicable to all program phases
- Covers transportation and logistics support
- Covers storage, packaging, and handling
- Covers Non-Development Items (NDI).
• MA provides management of system safety effort
Managerial and technical procedures to be used must be submitted for
MA approval.
- Resolves conflicts between safety and other design
requirements
- Resolves conflicts between associate contractors.
• Design safety precedence:
- Design to minimum hazard
- Use safety devices
- Use warning devices
- Use special procedures.
• System Safety requirements must be consistent with other program
requirements.
Performance, cost, and other requirements may have priority over safety
requirements.
• System analyses are basic tools for systematically developing design
specifications.
The ultimate measure of safety is not the scope of the analysis but satisfied
requirements.
- Analyses are performed to:
· Identify hazards and corrective actions
· Review safety considerations in tradeoffs
· Determine/evaluate safety design requirements
· Determine/evaluate operational, test, logistics
requirements
· Validate that qualitative/quantitative requirements
have been met.
- Analyses are hazard analyses, not safety analyses.
• Level of risk assumption and criteria are an inherent part of risk
management.
• Safety Management
- Defines functions, authority, and interrelationships
- Exercises appropriate controls.
• Degree of safety effort and achievements are directly dependent
upon management emphasis by the FAA and contractors.
• Results of safety effort depend upon MA clearly stating safety
objectives/requirements.
• MA responsibilities:
- Plan, organize, and implement SSP
- Establish safety requirements for system design
- State safety requirements in contract
- Requirements for activities in Statement of Work (SOW)
- Review and ensure adequate and complete system safety
program plan (SSPP)
- Supply historical data
- Review contractor system safety effort/data
- Ensure specifications are updated with test analyses results
- Establish and operate system safety groups.
• Software hazard analyses are a flow down requirements process
followed by an upward flow verification process
• Four elements of an effective SSP:
- Planned approach to accomplish tasks
- Qualified people
- Authority to implement tasks through all levels of
management
- Appropriate manning/funding.
Appendix F
ORM Details and Examples
FAA System Safety Handbook, Appendix F
December 30, 2000
1.0 HAZARD IDENTIFICATION TOOLS, DETAILS AND EXAMPLES
Chapter 15 summarizes the Operational Risk Management methodology. This Appendix provides
examples of those tools, as they are applied to the ORM process:
• Hazard Identification
• Risk Assessment
• Risk Control Option Analysis
• Risk Control Decisions
• Risk Control Implementation
• Supervision and Review
1.1 PRIMARY HAZARD IDENTIFICATION TOOLS
The seven tools described in this appendix are considered the basic set of hazard identification tools to be
applied on a day-to-day basis in organizations at all levels. These tools have been chosen for the following
reasons:
• They are simple to use, though they require some training.
• They have been proven effective.
• Widespread application has demonstrated they can and will be used by operators and will consistently be
perceived as positive.
• As a group, they complement each other, blending the intuitive and experiential with the more structured
and rigorous.
• They are well supported with worksheets and job aids.
In an organization with a mature ORM culture, the use of these tools by all personnel will be regarded as
the natural course of events. The norm will be "Why would I even consider exposing myself and others to
the risks of this activity before I have identified the hazards involved using the best procedures or designs
available?" The following pages describe each tool using a standard format with models and examples.
1.1.1 THE OPERATIONS ANALYSIS AND FLOW DIAGRAM
FORMAL NAME: The Operations Analysis
ALTERNATIVE NAMES: The flow diagram, flow chart, operation timeline
PURPOSE: The Operations Analysis (OA) provides an itemized sequence of events or a flow diagram
depicting the major events of an operation. This assures that all elements of the operation are evaluated as
potential sources of risk. This analysis overcomes a major weakness of traditional risk management,
which tends to focus effort on one or two aspects of an operation that are intuitively identified as risky,
often to the exclusion of other aspects that may actually be riskier. The Operations Analysis also guides
the allocation of risk management resources over time as an operation unfolds event by event in a
systematic manner.
APPLICATION: The Operations Analysis or flow diagram is used in nearly all risk management
applications, including the most time-critical situations. It responds to the key risk management question
"What am I facing here and from where can risk arise?"
METHOD: Whenever possible, the Operations Analysis is taken directly from the planning of the
operation. It is difficult to imagine planning an operation without identifying the key events in a time
sequence. If for some reason such a list is not available, the analyst creates it using the best available
understanding of the operation. The best practice is to break down the operation into time-sequenced
segments strongly related by tasks and activities. Normally, this is well above the detail of individual
tasks. It may be appropriate to break down aspects of an operation that carry obviously higher risk into
more detail than less risky areas. The product of an OA is a compilation of the major events of an
operation in sequence, with or without time checks. An alternative to the Operations Analysis is the flow
diagram. Commonly used symbols are provided at Figure 1.1.1A. Putting the steps of the process on
index cards or sticky-back note paper allows the diagram to be rearranged without erasing and redrawing,
thus encouraging contributions.
Figure 1.1.1A Example Flow Chart Symbols
(Symbol graphics not reproduced.)
REPRESENTS              EXAMPLES
START                   RECEIVE TASKING; BEGIN TRIP; OPEN CHECKLIST
ACTIVITY                OPERATION PLANNING; START CAR; STEP ONE IN CHECKLIST
DECISION POINT (OR)     YES/NO; APPROVE/DISAPPROVE; PASS/FAIL
FORK / SPLIT (AND)      PREPOSITION VEHICLES AND SUPPLIES; RELEASE CLUTCH AND PRESS
                        ACCELERATOR; OBSERVE FLIGHT CONTROLS WHILE MOVING STICK
END                     FINAL REPORT; ARRIVE AT DESTINATION; AIRCRAFT ACCEPTED
RESOURCES: The key resources for the Operations Analysis are the operational planners. Using their
operational layout will facilitate the integration of risk controls in the main operational plan and will
eliminate the expenditure of duplicate resources on this aspect of hazard identification.
COMMENTS: Look back on your own experience. How many times have you been surprised or seen
others surprised because they overlooked possible sources of problems? The OA is the key to minimizing
this source of accidents.
THE PLANNING PHASE
• Initial Intelligence Received (Maps, Facility Lists, Environment, Etc.)
• Advance Party Dispatched
• Advance Party Data Received
• Deployment Planning Underway
• Deployment Preparations Initiated
• Initial Operation Planning Underway
• Contingency Planning Underway
If more detail and more structured examination of the operational flow are desired, the flow diagram can
be used. This diagram will add information through the use of graphic symbols. A flow diagram of the
planning phase above might be developed as illustrated in Figure 1.1.1B below.
Figure 1.1.1B Example Flow Diagram
[Flow diagram nodes: Start; Intelligence Tasks; Gather initial intelligence; Dispatch advance team;
Deployment planning; Initial planning; Contingency planning; Plans complete.
Example ORM action annotations: "Get ORM data", "Protect the Team".]
The flow diagram can be used as an ORM planning tool. Indicate ORM actions in connection with each activity.
1.1.2 THE PRELIMINARY HAZARD ANALYSIS
FORMAL NAME: Preliminary Hazard Analysis
ALTERNATIVE NAMES: The PHA, the PHL
PURPOSE: The PHA provides an initial overview of the hazards present in the overall flow of the
operation. It provides a hazard assessment that is broad, but usually not deep. The key idea of the PHA is
to consider the risk inherent to every aspect of an operation. The PHA helps overcome the tendency to
focus immediately on risk in one aspect of an operation, sometimes at the expense of overlooking more
serious issues elsewhere in the operation. The PHA will often serve as the hazard identification process
when risk is low or routine. In higher risk operations, it serves to focus and prioritize follow-on hazard
analyses by displaying the full range of risk issues.
APPLICATION: The PHA is used in nearly all risk management applications except the most time-critical. Its broad scope is an excellent guide to the identification of issues that may require more detailed
hazard identification tools.
METHOD: The PHA is usually based on the Operations Analysis or flow diagram, taking each event in
turn from it. Analysts apply their experience and intuition, use reference publications and standards of
various kinds, and consult with personnel who may have useful input. The extent of the effort is dictated
by resource and time limitations, and by the estimate of the degree of overall risk inherent in the
operation. Hazards that are detected are often listed directly on a copy of the Operations Analysis as
shown at Figure 1.1.2A. Alternatively, a more formal PHA format such as the worksheet shown at Figure
1.1.2B can be used. The completed PHA is used to identify hazards requiring more
in-depth hazard identification or it may lead directly to the remaining five steps of the ORM process, if
hazard levels are judged to be low. Key to the effectiveness of the PHA is assuring that all events of the
operation are covered.
Figure 1.1.2A Building the PHA Directly From the Operations Analysis Flow Diagram
Operational Phase: List the operational phases vertically down the page. Be sure to leave plenty of space
on the worksheet between each phase to allow several hazards to be noted.
Hazards: List the hazards noted for each operational phase here. Strive for detail within the limits
imposed by the time you have set aside for this tool.
RESOURCES: The two key resources for the PHA are the expertise of personnel actually experienced in
the operation and the body of regulations, standards, and instructions that may be available. The PHA
can be accomplished in small groups to broaden the expertise applied. A copy of a PHA accomplished
for an earlier similar operation would aid in the process.
COMMENTS: The PHA is relatively easy to use and takes little time. Its significant power to impact
risk arises from the forced consideration of risk in all phases of an operation. This means that a key to
success is to link the PHA closely to the Operations Analysis.
EXAMPLES: The following (Figure 1.1.2B) is an example of a PHA.
Figure 1.1.2B Example PHA
MOVING A HEAVY PIECE OF EQUIPMENT
The example below uses an Operations Analysis for moving a heavy piece of equipment as the starting
point and illustrates the process of building the PHA directly from the Operations Analysis.
Operation: Move a 3-ton machine from one building to another.
Start Point: The machine is in its original position in building A
End Point: The machine is in its new position in building B
ACTIVITY / EVENT                      HAZARD
Raise the machine to permit           - Machine overturns due to imbalance
positioning of the forklift           - Machine overturns due to failure of lifting device
                                      - Machine drops on person or equipment due to failure
                                        of lifting device or improper placement (person lifting
                                        device)
                                      - Machine strikes overhead obstacle
                                      - Machine is damaged by the lifting process
Position the forklift                 - Forklift strikes the machine
                                      - Forklift strikes other items in the area
Lift the machine                      - Machine strikes overhead obstacle
                                      - Lift fails due to mechanical failure (damage to
                                        machine, objects, or people)
                                      - Machine overturns due to imbalance
Move machine to the truck             - Instability due to rough surface or weather condition
                                      - Operator error causes load instability
                                      - The load shifts
Place machine on the truck            - Improper tiedown produces instability
                                      - Truck overloaded or improper load distribution
Drive truck to building B             - Vehicle accident during the move
                                      - Poor driving technique produces instability
                                      - Instability due to road condition
Remove machine from the truck         - Same factors as "Move it to the truck"
Place machine in proper position      - Same factors as "Raise the machine" except focused
in building B                           on lowering the machine
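Because the effectiveness of the PHA depends on every event of the operation being covered, a worksheet kept in electronic form can be checked mechanically against the Operations Analysis. The sketch below is illustrative only: the data structures and function name are hypothetical, and the entries are a subset of Figure 1.1.2B with one event deliberately left without hazards.

```python
from typing import Dict, List


def uncovered_events(operations_analysis: List[str],
                     pha: Dict[str, List[str]]) -> List[str]:
    """Return OA events that have no hazards recorded in the PHA, in OA order."""
    return [event for event in operations_analysis if not pha.get(event)]


if __name__ == "__main__":
    # Time-sequenced events taken from the Operations Analysis.
    oa_events = [
        "Position the forklift",
        "Lift the machine",
        "Move machine to the truck",
    ]
    # PHA worksheet: event -> hazards noted so far.
    pha_worksheet = {
        "Position the forklift": [
            "Forklift strikes the machine",
            "Forklift strikes other items in the area",
        ],
        "Move machine to the truck": [
            "Instability due to rough surface or weather condition",
            "Operator error causes load instability",
            "The load shifts",
        ],
    }
    print("Events still to be assessed:", uncovered_events(oa_events, pha_worksheet))
```

An empty result does not show that the hazards listed are complete, only that no event of the operation was skipped.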
1.1.3 THE "WHAT IF" TOOL
FORMAL NAME: The "What If" tool
ALTERNATIVE NAMES: None.
PURPOSE: The "What If" tool is one of the most powerful hazard identification tools. As in the case of
the Scenario Process tool, it is designed to add structure to the intuitive and experiential expertise of
operational personnel. The "What If" tool is especially effective in capturing hazard data about failure
modes that may create hazards. It is somewhat more structured than the PHA. Because of its ease of use,
it is probably the single most practical and effective tool for use by operational personnel.
APPLICATION: The "What If" tool should be used in most hazard identification applications, including
many time-critical applications. A classic use of the "What If" tool is as the first tool used after the
Operations Analysis and the PHA. For example, the PHA reveals an area of hazard that needs additional
investigation. The best single tool to further investigate that area will be the "What If" tool. The user will
zoom in on the particular area of concern, add detail to the OA in this area and then use the "What If"
procedure to identify the hazards.
METHOD: Ensure that participants have a thorough knowledge of the anticipated flow of the operation.
Visualize the expected flow of events in time sequence from the beginning to the end of the operation.
Select a segment of the operation on which to focus. Visualize the selected segment with "Murphy"
injected. Make a conscious effort to visualize hazards. Ask, "What if various failures occurred or
problems arose?" Add hazards and their causes to your hazard list and assess them based on probability
and severity.
The "What-If" analysis can be expanded to further explore the hazards in an operation by developing
short scenarios that reflect the worst credible outcome from the compound effects of multiple hazards in
the operation.
Follow these guidelines in writing scenarios:
• Target length is 5 or 6 sentences, 60 words
• Don't dwell on grammatical details
• Include elements of Mission, Man, Machine, Management, and Media
• Start with history
• Encourage imagination and intuition
• Carry the scenario to the worst credible outcome
• Use a single person or group to edit
RESOURCES: A key resource for the "What If" tool is the Operations Analysis. It may be desirable to
add detail to it in the area to be targeted by the "What If" analysis. However, in most cases an OA can be
used as-is, if it is available. The "What If" tool is specifically designed to be used by personnel actually
involved in an operation. Therefore, the most critical "What If" resource is the involvement of operators and
their first-line supervisors. Because of its effectiveness, dynamic character, and ease of application, these
personnel are generally quite willing to support the "What If" process.
COMMENTS: The "What If" tool is so effective that the Occupational Safety and Health
Administration (OSHA) has designated it as one of six tools from among which activities facing
catastrophic risk situations must choose under the mandatory hazard analysis provisions of the process
safety standard.
EXAMPLES: Following (Figure 1.1.3A) is an extract from the typical output from the "What If" tool.
Figure 1.1.3A Example What If Analysis
Situation: Picture a group of 3 operational employees informally applying the round robin
procedure for the "What If" tool to a task to move a multi-ton machine from one location to
another. A part of the discussion might go as follows:
Joe: What if the machine tips over and falls breaking the electrical wires that run within the walls
behind it?
Bill: What if it strikes the welding manifolds located on the wall on the West Side? (This
illustrates "piggybacking" as Bill produces a variation of the hazard initially presented by Joe).
Mary: What if the floor fails due to the concentration of weight on the base of the lifting
device?
Joe: What if the point on the machine used to lift it is damaged by the lift?
Bill: What if there are electrical, air pressure hoses, or other attachments to the machine that are
not properly neutralized?
Mary: What if the lock out/tag out is not properly applied to energy sources servicing the
machine? And so on....
Note: The list above might, for example, be broken down as follows:
Group 1: Machine falling hazards
Group 2: Weight induced failures
Group 3: Machine disconnect and preparation hazards
These related groups of hazards are then subjected to the remaining five steps of the ORM
process.
1.1.4 THE SCENARIO PROCESS TOOL
FORMAL NAME: The Scenario Process tool
ALTERNATIVE NAMES: The mental movie tool.
PURPOSE: The Scenario Process tool is a time-tested procedure to identify hazards by visualizing them.
It is designed to capture the intuitive and experiential expertise of personnel involved in planning or
executing an operation, in a structured manner. It is especially useful in connecting individual hazards
into situations that might actually occur. It is also used to visualize the worst credible outcome of one or
more related hazards, and is therefore an important contributor to the risk assessment process.
APPLICATION: The Scenario Process tool should be used in most hazard identification applications,
including some time-critical applications. In the time-critical mode, it is indeed one of the few practical
tools, in that the user can quickly form a "mental movie" of the flow of events immediately ahead and the
associated hazards.
METHOD: The user of the Scenario Process tool attempts to visualize the flow of events in an
operation. This is often described as constructing a "mental movie". It is often effective to close the eyes,
relax and let the images flow. Usually the best procedure is to use the flow of events established in the
OA. An effective method is to visualize the flow of events twice. The first time, see the events as they are
intended to flow. The next time, inject "Murphy" at every possible turn. As hazards are visualized, they
are recorded for further action. Some good guidelines for the development of scenarios are as follows:
Limit them to 60 words or less. Don't get tied up in grammatical excellence (in fact they don't have to be
recorded at all). Use historical experience but avoid embarrassing anyone. Encourage imagination (this
helps identify risks that have not been previously encountered). Carry scenarios to the worst credible
event.
RESOURCES: The key resource for the Scenario Process tool is the Operations Analysis. It provides
the script for the flow of events that will be visualized. Using the tool does not require a specialist.
Operational personnel leading or actually performing the task being assessed are key resources for the
OA. Using this tool is often entertaining and dynamic, and it often motivates even the most junior personnel in
the organization.
COMMENTS: A special value of the Scenario Process tool is its ability to link two or more individual
hazards developed using other tools into an operation-relevant scenario.
EXAMPLES: Following is an example (Figure 1.1.4A) of how the Scenario Process tool might be used in
an operational situation.
Figure 1.1.4A Example Machine Movement Scenario
FROM MACHINE MOVEMENT EXAMPLE: As the machine was being jacked up to
permit placement of the forklift, the fitting that was the lift point on the machine broke. The
machine tilted in that direction and fell over, striking the nearby wall. This in turn broke a
fuel gas line in the wall. The gas was turned off as a precaution, but the blow to the metal
line caused the valve to which it was attached to break, releasing gas into the atmosphere.
The gas quickly reached the motor of a nearby fan (not explosion proof) and a small
explosion followed. Several personnel were badly burned and that entire section of the shop
was badly damaged. The shop was out of action for 3 weeks.
1.1.5 THE LOGIC DIAGRAM
FORMAL NAME: The Logic Diagram
ALTERNATIVE NAMES: The Logic Tree
PURPOSE: The Logic Diagram is intended to provide considerable structure and detail as a primary
hazard identification procedure. Its graphic structure is an excellent means of capturing and correlating
the hazard data produced by the other primary tools. Because of its graphic display, it can also be an
effective hazard-briefing tool. The more structured and logical nature of the Logic Diagram adds
substantial depth to the hazard identification process to complement the other more intuitive and
experiential tools. Finally, an important purpose of the Logic Diagram is to establish the connectivity and
linkages that often exist between hazards. It does this very effectively through its tree-like structure.
APPLICATION: Because it is more structured, the Logic Diagram requires considerable time and effort
to accomplish. Following the principles of ORM, its use will be more limited than the other primary tools.
This means limiting its use to higher risk issues. By its nature it is also most effective with more
complicated operations in which several hazards may be interlinked in various ways. Because it is more
complicated than the other primary tools, it requires more practice, and may not appeal to all operational
personnel. However, in an organizational climate committed to ORM excellence, the Logic Diagram will
be a welcomed and often used addition to the hazard identification toolbox.
METHOD: There are three types of Logic Diagrams. These are the:
Positive diagram. This variation is designed to highlight the factors that must be in place if risk is to be
effectively controlled in the operation. It works from a safe outcome back to the factors that must be in
place to produce it.
Event diagram. This variation focuses on an individual operational event (often a failure or hazard
identified using the "What If" tool) and examines the possible consequences of the event. It works from an
event that may produce risk and shows what the loss outcomes of the event may be.
Negative diagram. This variation selects a loss event and then analyzes the various hazards that could
combine to produce that loss. It works from an actual or possible loss and identifies what factors could
produce it.
All of the various Logic Diagram options can be applied either to an actual operating system or one being
planned. Of course, the best time for application is in the planning stages of the operational lifecycle. All
of the Logic Diagram options begin with a top block. In the case of the positive diagram, this is a desired
outcome; in the case of the event diagram, this is an operations event or contingency possibility; in the
case of the negative diagram, it is a loss event. When working with a positive or negative diagram,
the user then reasons out the factors that could produce the top event. These are
entered on the next line of blocks. With the event diagram, the user lists the possible results of the event
being analyzed. The conditions that could produce the factors on the second line are then considered and
they are entered on the third line. The goal is to be as logical as possible when constructing Logic
Diagrams, but it is more important to keep the hazard identification goal in mind than to construct a
masterpiece of logical thinking. Therefore, a Logic Diagram should be a worksheet with lots of changes
and variations marked on it. With the addition of a chalkboard or flip chart, it becomes an excellent group
tool.
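The line-by-line construction described above can be sketched as a small tree structure. This is only an illustration: the Block class and the render and leaves helpers are hypothetical names, not part of the handbook, and the sample entries are taken from the container tiedown example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """One block in a Logic Diagram (hypothetical helper, not a handbook term)."""
    label: str
    children: list[Block] = field(default_factory=list)

    def add(self, label: str) -> Block:
        """Attach a block on the next line down and return it."""
        child = Block(label)
        self.children.append(child)
        return child


def render(block: Block, depth: int = 0) -> list[str]:
    """Return the diagram as indented text lines, top block first."""
    lines = ["    " * depth + block.label]
    for child in block.children:
        lines.extend(render(child, depth + 1))
    return lines


def leaves(block: Block) -> list[str]:
    """Lowest-level blocks: candidate entries for the hazard inventory."""
    if not block.children:
        return [block.label]
    found = []
    for child in block.children:
        found.extend(leaves(child))
    return found


# Negative diagram: start from a loss event and reason out what could produce it.
top = Block("Container falls off vehicle and ruptures")
tiedown = top.add("Failure of tiedown gear")
tiedown.add("Failure to inspect and test tiedowns IAW procedures")

print("\n".join(render(top)))
```

The same structure serves a positive diagram (top block is a desired outcome) or an event diagram (children are possible results rather than causes); only the meaning of the links changes.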
Figure 1.1.5A below is a generic diagram, and it is followed by a simplified example of each of the types
of Logic Diagrams (Figures 1.1.5B, 1.1.5C, 1.1.5D).
Figure 1.1.5A Generic Logic Diagram
[Diagram: a top EVENT block branches to three PRIMARY CAUSE blocks; these branch to SUPPORTING CAUSE
blocks, which in turn branch to ROOT CAUSE blocks.]
Figure 1.1.5B Positive Event Logic Diagram
CONTAINER STAYS ON VEHICLE
    TIEDOWN PROPERLY ACCOMPLISHED
        CLEAR PROCEDURES
        GOOD MOTIVATION
        GOOD TRAINING
    ETC.
    ETC.
Figure 1.1.5C Risk Event Diagram
FORKLIFT PROCEDURES VIOLATED-EXCEEDED LIFT CAPACITY
    LIFT MECHANISM FAILS, LIFT FAILS
        LOAD BOUNCES TO THE GROUND
            CONTAINER RUPTURES, CHEMICAL AGENT LEAKS
    ETC.
    ETC.
Figure 1.1.5D Negative Event Logic Diagram
CONTAINER FALLS OFF VEHICLE & RUPTURES
    FAILURE OF TIEDOWN GEAR
        FAILURE TO INSPECT & TEST TIEDOWNS IAW PROCEDURES
            VARIOUS ROOT CAUSES
        ETC.
    ETC.
    ETC.
RESOURCES: All of the other primary tools are key resources for the Logic Diagram, as it can
correlate hazards that they generate. If available, a safety professional may be an effective facilitator for
the Logic Diagram process.
COMMENTS: The Logic Diagram is the most comprehensive tool available among the primary
procedures. Compared to other approaches to hazard identification, it will substantially increase the
quantity and quality of hazards identified.
EXAMPLE: Figure 1.1.5E illustrates how a negative diagram could be constructed for moving a heavy
piece of equipment.
Figure 1.1.5E Example Negative Diagram
Machine fails when raised by the forklift
    Machine strikes an overhead obstacle and tilts
    The load shifts due to lift point or failure to secure
    Improper operator technique (jerky, bad technique)
    Load is too heavy for the forklift
    Mechanical failure of the forklift
    The machine breaks at the point of lift
Each of these items may be taken to a third level. [The third-level example blocks in the figure are not recoverable.]
The Logic Diagram pulls together all sources of hazards and displays them in a graphic
format that clarifies the risk issues.
1.1.6 THE CHANGE ANALYSIS
FORMAL NAME: The Change Analysis
ALTERNATIVE NAMES: None
PURPOSE: Change is an important source of risk in operational processes.
Figure 1.1.6A illustrates this causal relationship.
Figure 1.1.6A Change Causation
Introduce Change -> System Impacted -> Stress is Created -> Risk Controls Overcome -> Risk Increases -> Losses Increase
Some changes are planned, but many others occur incrementally over time, without any conscious
direction. The Change Analysis is intended to analyze the hazard implications of either planned or
incremental changes. The Change Analysis helps to focus only on the changed aspects of the operation,
thus eliminating the need to reanalyze the total operation, just because a change has occurred in one area.
The Change Analysis is also used to detect the occurrence of change. By periodically comparing current
procedures with previous ones, unplanned changes are identified and clearly defined. Finally, Change
Analysis is an important accident investigation tool. Because many incidents/accidents are due to the
injection of change into systems, an important investigative objective is to identify these changes using the
Change Analysis procedure.
APPLICATION: Change analysis should be routinely used in the following situations.
Whenever significant changes are planned in operations in which there is significant operational risk of
any kind. An example is the decision to conduct a certain type of operation at night that has heretofore
only been done in daylight.
Periodically in any important operation, to detect the occurrence of unplanned changes.
As an accident investigation tool.
As the only hazard identification tool required when an operational area has already been subjected to
in-depth hazard analysis. In that case the Change Analysis will reveal whether any elements exist in the
current operations that were not considered in the previous in-depth analysis.
METHOD: The Change Analysis is best accomplished using a format such as the sample worksheet
shown at Figure 1.1.6B. The factors in the column on the left side of this tool are intended as a
comprehensive change checklist.
Figure 1.1.6B Sample Change Analysis Worksheet
Target: ________________________________ Date: ______________________
FACTORS                  EVALUATED SITUATION    COMPARABLE SITUATION    DIFFERENCE    SIGNIFICANCE
WHAT
    Objects
    Energy
    Defects
    Protective Devices
WHERE
    On the object
    In the process
    Place
WHEN
    In time
    In the process
WHO
    Operator
    Fellow worker
    Supervisor
    Others
TASK
    Goal
    Procedure
    Quality
WORKING CONDITIONS
    Environmental
    Overtime
    Schedule
    Delays
TRIGGER EVENT
MANAGERIAL CONTROLS
    Control Chain
    Hazard Analysis
    Monitoring
    Risk Review
To use the worksheet: The user starts at the top of the column and considers the current situation compared
to a previous situation and identifies any change in any of the factors.
When used in an accident investigation, the accident situation is compared to a previous baseline.
The significance of detected changes can be evaluated intuitively or they can be subjected to "What If",
Logic Diagram, Scenario Process, or other specialized analyses.
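The factor-by-factor comparison can be sketched in a few lines. This is a minimal illustration, assuming each situation is recorded as a mapping from worksheet factor to a short description; the function name and the sample values are hypothetical, with the night operation borrowed from the APPLICATION example.

```python
from __future__ import annotations


def change_analysis(evaluated: dict[str, str],
                    comparable: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (factor, comparable value, evaluated value) for every factor that differs."""
    differences = []
    for factor in sorted(set(evaluated) | set(comparable)):
        before = comparable.get(factor, "(not recorded)")
        now = evaluated.get(factor, "(not recorded)")
        if before != now:
            differences.append((factor, before, now))
    return differences


# Hypothetical entries keyed by worksheet factor; the values are invented.
comparable_situation = {
    "WHEN / In time": "daylight",
    "WHO / Operator": "experienced crew",
    "TASK / Procedure": "standard checklist",
}
evaluated_situation = {
    "WHEN / In time": "night",
    "WHO / Operator": "experienced crew",
    "TASK / Procedure": "standard checklist",
}

for factor, before, now in change_analysis(evaluated_situation, comparable_situation):
    print(f"{factor}: {before} -> {now}")
```

The sketch only fills the DIFFERENCE column. Judging the SIGNIFICANCE of each difference remains a matter for the people who know the operation.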
RESOURCES: Experienced operational personnel are a key resource for the Change Analysis tool.
Those who have long-term involvement in an operational process must help define the "comparable
situation." Another important resource is the documentation of process flows and task analyses. Large
numbers of such analyses have been completed in recent years in connection with quality improvement
and reengineering projects. These materials are excellent definitions of the baseline against which change
can be evaluated.
COMMENTS: In organizations with mature ORM processes, most, if not all, higher risk activities will
have been subjected to thorough ORM applications and the resulting risk controls will have been
incorporated into operational guidance. In these situations, the majority of day-to-day ORM activity will
be the application of Change Analysis to determine if the operation has any unique aspects that have not
been previously analyzed.
1.1.7 THE CAUSE AND EFFECT TOOL
FORMAL NAME: The Cause and Effect Tool
ALTERNATIVE NAMES: The cause and effect diagram, the fishbone tool, the Ishikawa Diagram
PURPOSE: The Cause and Effect Tool is a variation of the Logic Tree tool and is used in the same
hazard identification role as the general Logic Diagram. The particular advantage of the Cause and Effect
Tool is its origin in the quality management process. Because it is widely used there, thousands of
personnel have already been trained in the tool and therefore require little additional training to apply it
to the problem of detecting risk.
APPLICATION: The Cause and Effect Tool will be effective in organizations that have had some
success with the quality initiative. It should be used in the same manner as the Logic Diagram and can be
applied in both a positive and negative variation.
METHOD: The Cause And Effect diagram is a Logic Diagram with a significant variation. It provides
more structure than the Logic Diagram through the branches that give it one of its alternate names, the
fishbone diagram. The user can tailor the basic "bones" based upon special characteristics of the
operation being analyzed. Either a positive or negative outcome block is designated at the right side of the
diagram. Using the structure of the diagram, the user completes the diagram by adding causal factors in
either the "M" or "P" structure. Using branches off the basic entries, additional hazards can be added.
The Cause And Effect diagram should be used in a team setting whenever possible.
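As a rough sketch of the structure described above, a cause and effect diagram can be held as an outcome block plus a set of named bones, each carrying its causal factors. The fishbone function is a hypothetical helper, and the assignment of each cause to a particular bone below is an assumption made for illustration.

```python
from __future__ import annotations


def fishbone(outcome: str, bones: dict[str, list[str]]) -> list[str]:
    """Render the diagram as text: the outcome block, then each bone with its causes."""
    lines = [f"OUTCOME: {outcome}"]
    for bone, causes in bones.items():
        lines.append(f"  {bone}")
        for cause in causes:
            lines.append(f"    - {cause}")
    return lines


# "M" structure bones; which bone each cause belongs under is assumed here.
bones = {
    "Methods": ["Tool check procedures weak"],
    "Human": ["Training weak", "Supervision weak"],
    "Materials": [],
    "Machinery": ["Many small, hard to see tools"],
}

diagram = fishbone("Tool misplaced", bones)
print("\n".join(diagram))
```

An empty bone, such as Materials here, is a useful prompt in a team setting: it shows where the group has not yet looked for causes.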
RESOURCES: There are many publications describing in great detail how to use cause and effect
diagrams. (1)
COMMENTS:
EXAMPLES: An example of the Cause and Effect Tool in action is illustrated at Figure 1.1.7A.
(1) K. Ishikawa, Guide to Quality Control, Quality Resources, White Plains, New York, 12th Printing 1994.
Figure 1.1.7A Example of Cause and Effect
SITUATION: The supervisor of an aircraft maintenance operation has been receiving reports from Quality
Assurance regarding tools in aircraft after maintenance over the last six months. The supervisor has followed up
but each case has involved a different individual and his spot checks seem to indicate good compliance with tool
control procedures. He decides to use a cause and effect diagram to consider all the possible sources of the tool
control problem. The supervisor develops the cause and effect diagram with the help of two or three of his best
maintenance personnel in a group application.
NOTE: Tool control is one of the areas where 99% performance is not adequate. That would mean one in a
hundred tools are misplaced. The standard must be that among the tens (or hundreds) of thousands of individual
uses of tools over a year, not one is misplaced.
Negative diagram. Outcome block: Tool misplaced. Branches: Methods, Human, Materials, Machinery.
    Motivation weak (reward, discipline)
    Training weak (procedures, consequences)
    Supervision weak (checks)
    Management emphasis light
    OI incomplete (lacks detail)
    Tool check procedures weak
    No tool boards, cutouts
    Many small, hard to see tools
    Many places to lose tools in aircraft
Positive diagram. Branches: People, Procedures, Policies, Plant.
    Strong Motivation
    Participate in development of new procedures
    Self & coworker observation
    Quick feedback on mistakes
    Commitment to excellence
    Strong sustained emphasis
    Collective & individual awards
    Detailed OI
    Good matrices
    Extensive use of toolboard cutouts
Using the positive diagram as a guide the supervisor and working group apply all possible and practical options
developed from it.
1.2 THE SPECIALTY HAZARD IDENTIFICATION TOOLS
The tools that follow are designed to augment the primary tools described in part 1.1. These tools have
several advantages:
They can be used by nearly everyone in the organization, though some may require either training or
professional facilitation.
Each tool provides a capability not fully realized in any of the primary tools.
They use the tools of the less formal safety program to support the ORM process.
They are well supported with forms, job aids, and models.
Their effectiveness has been proven. In an organization with a mature ORM process, all personnel will be
aware of the existence of these specialty tools and capable of recognizing the need for their application.
While not everyone will be comfortable using every procedure, a number of people within the
organization will have experience applying one or another of them.
1.2.1 THE HAZARD AND OPERABILITY TOOL
FORMAL NAME: The Hazard and Operability Tool
ALTERNATIVE NAMES: The HAZOP analysis
PURPOSE: The special role of the HAZOP is hazard analysis of completely new operations. In these
situations, traditional intuitive and experiential hazard identification procedures are especially weak. This
lack of experience hobbles tools such as the "What If" and Scenario Process tools, which rely heavily on
experienced operational personnel. The HAZOP deliberately maximizes structure and minimizes the need
for experience to increase its usefulness in these situations.
APPLICATION: The HAZOP should be considered when a completely new process or procedure is
going to be undertaken. The issue should be one where there is significant risk because the HAZOP does
demand significant expenditure of effort and may not be cost effective if used against low risk issues. The
HAZOP is also useful when an operator or leader senses that "something is wrong" but cannot
identify it. The HAZOP will dig very deeply into the operation to identify what that "something" is.
METHOD: The HAZOP is the most highly structured of the hazard identification procedures. It uses a
standard set of guidewords (Figure 1.2.1A), which are then linked in every possible way with a tailored set of
process terms (for example "flow"). The process terms are developed directly from the actual process or
from the Operations Analysis. The two words together, for example "no" (a guideword) and "flow" (a
process term) will describe a deviation. These are then evaluated to see if a meaningful hazard is
indicated. If so, the hazard is entered in the hazard inventory for further evaluation. Because of its rigid
process, the HAZOP is especially suitable for one-person hazard identification efforts.
Figure 1.2.1A Standard HAZOP Guidewords
NO
MORE
LESS
REVERSE
LATE
EARLY
Note: This basic set of guidewords should be
all that are needed for all applications.
Nevertheless, when useful, specialized terms
can be added to the list. In less complex
applications only some of the terms may be
needed.
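Linking every guideword with every process term is a simple cross product, sketched below. The guidewords are the standard set from Figure 1.2.1A and "flow" is the process term used in the text; "pressure" and the function name are hypothetical additions for illustration.

```python
from itertools import product

GUIDEWORDS = ["NO", "MORE", "LESS", "REVERSE", "LATE", "EARLY"]


def deviations(process_terms, guidewords=GUIDEWORDS):
    """Pair every guideword with every process term to form candidate deviations."""
    return [f"{guide} {term}" for term, guide in product(process_terms, guidewords)]


# "flow" comes from the text; "pressure" is a hypothetical second process term.
candidates = deviations(["flow", "pressure"])
print(len(candidates))
print(candidates[:3])
```

Each candidate is only a prompt. The analyst still decides whether a deviation such as "NO flow" indicates a meaningful hazard before entering it in the hazard inventory.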
RESOURCES: There are few resources available to assist with HAZOP; none are really needed.
COMMENTS: The HAZOP is highly structured, and often time-consuming. Nevertheless, in its special
role, this tool works very effectively. OSHA selected it for inclusion in the set of six mandated procedures
of the OSHA process safety standard.
1.2.2 THE MAPPING TOOL
FORMAL NAME: The Mapping Tool
ALTERNATIVE NAMES: Map analysis
PURPOSE: The map analysis is designed to use terrain maps and other system models and schematics to
identify both things at risk and the sources of hazards. Properly applied the tool will reveal the following:
Task elements at risk
The sources of risk
The extent of the risk (proximity)
Potential barriers between hazard sources and operational assets
APPLICATION: The Mapping Tool can be used in a variety of situations. The explosive quantity-distance
criteria are a classic example of map analysis. The location of the flammable storage is plotted
and then the distance to various vulnerable locations (inhabited buildings, highways, etc.) is determined.
The same principles can be extended to any facility. We can use a diagram of a maintenance shop to note
the location of hazards such as gases, pressure vessels, flammables, etc. Key assets can also be plotted.
Then hazardous interactions are noted and the layout of the facility can be optimized in terms of risk
reduction.
METHOD: The Mapping Tool requires some creativity to realize its full potential. The starting point is
a map, facility layout, or equipment schematic. The locations of hazard sources are noted. The easiest
way to detect these sources is to locate energy sources, since all hazards involve the unwanted release of
energy. Figure 1.2.2A lists the kinds of energy to look for. Mark the locations of these sources on the map
or diagram. Then, keeping the operation in mind, locate the personnel, equipment, and facilities that the
various potentially hazardous energy sources could impact. Note these potentially hazardous links and
enter them in the hazard inventory for risk management.
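The plotting and proximity check can be sketched as follows. This is a simplified illustration: the coordinates, names, and the single fixed radius are all hypothetical, and a real quantity-distance assessment would use separation distances that depend on the type and amount of material involved.

```python
from math import hypot


def hazardous_links(sources, assets, radius):
    """Return (source, asset, distance) for every asset within `radius` of a source."""
    links = []
    for s_name, (sx, sy) in sources.items():
        for a_name, (ax, ay) in assets.items():
            distance = hypot(ax - sx, ay - sy)
            if distance <= radius:
                links.append((s_name, a_name, distance))
    return links


# Hypothetical shop layout, coordinates in meters.
sources = {"flammables storage": (0.0, 0.0), "pressure vessel": (30.0, 0.0)}
assets = {"break room": (3.0, 4.0), "tool crib": (30.0, 40.0)}

for source, asset, distance in hazardous_links(sources, assets, radius=10.0):
    print(f"{source} -> {asset}: {distance:.1f} m")
```

Straight-line distance ignores walls and other barriers, so the output is a list of links to examine, not a finding.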
Figure 1.2.2A Major Types of Energy
Electrical
Kinetic (moving mass e.g. a vehicle, a machine part, a bullet)
Potential (not moving mass e.g. a heavy object suspended overhead)
Chemical (e.g. explosives, corrosive materials)
Noise and Vibration
Thermal (heat)
Radiation (Non-ionizing e.g. microwave, and ionizing e.g. nuclear radiation, x-rays)
Pressure (air, hydraulic, water)
RESOURCES: Maps can convey a great deal of information, but cannot replace the value of an on-site
assessment. Similarly, when working with an equipment schematic or a facility layout, there is no
substitute for an on-site inspection of the equipment or survey of the facility.
COMMENTS: The map analysis is valuable in itself, but it is also excellent input for many other tools
such as the Interface Analysis, Energy Trace and Barrier Analysis, and Change Analysis.
EXAMPLE: The following example (Figure 1.2.2B) illustrates the use of a facility schematic that
focuses on the energy sources there as might be accomplished in support of an Energy Trace and Barrier
Analysis.
SITUATION: A team has been assigned the task of renovating an older facility
for use as a museum for historical aviation memorabilia. They evaluate the facility layout
(schematic below). By evaluating the potential energy sources presented in this
schematic, it is possible to identify hazards that may be created by the operations to be conducted.
Figure 1.2.2B Example Map Analysis
Simplified Facility Diagram [diagram not reproduced; the labels shown on it are listed below]
FACILITY ENERGY SOURCES
    Electrical throughout
    Main electrical distribution
    Area beneath suspended item
    Area of paints & flammables storage
    Pneumatic lines for old mail distribution
    Areas with old [truncated]
    Gas lines for Medical [truncated]
    Areas of former Medical [truncated]
1.2.3 THE INTERFACE ANALYSIS
FORMAL NAME: The Interface Analysis
ALTERNATIVE NAMES: Interface Hazard Analysis
PURPOSE: The Interface Analysis is intended to uncover the hazardous linkages or interfaces between
seemingly unrelated activities. For example, we plan to build a new facility. What hazards may be
created for other operations during construction and after the facility is operational? The Interface
Analysis reveals these hazards by focusing on energy exchanges. By looking at these potential energy
transfers between two different activities, we can often detect hazards that are difficult to detect in any
other way.
APPLICATION: An Interface Analysis should be conducted any time a new activity is being introduced
and there is any chance at all that unfavorable interaction could occur. A good cue to the need for an
Interface Analysis is the use of either the Change Analysis (indicating the injection of something new) or
the map analysis (with the possibility of interactions).
METHOD: The Interface Analysis is normally based on an outline such as the one illustrated at Figure
1.2.3A. The outline provides a list of potential energy types and guides the consideration of the potential
interactions. A determination is made whether a particular type of energy is present and then whether
there is potential for that form of energy to adversely affect other activities. As in all aspects of hazard
identification, the creation of a good Operations Analysis is vital.
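The two-step determination (is the energy present, and could it adversely affect other activities) can be sketched as a simple filter. The energy types follow the Interface Analysis outline; the function name, the answer format, and the sample answers are hypothetical.

```python
ENERGY_TYPES = ["Kinetic", "Electromagnetic", "Radiation", "Chemical", "Other"]


def interface_hazards(answers):
    """Keep energy types that are present AND could adversely affect other activities.

    `answers` maps each energy type to (present, can_affect_others, note).
    A missing answer raises KeyError so that no energy type is silently skipped.
    """
    hazards = []
    for energy in ENERGY_TYPES:
        present, can_affect_others, note = answers[energy]
        if present and can_affect_others:
            hazards.append((energy, note))
    return hazards


# Hypothetical answers for a construction project next to an operating area.
answers = {
    "Kinetic": (True, True, "Movement of heavy construction equipment"),
    "Electromagnetic": (True, False, "Hand-held radios only"),
    "Radiation": (False, False, ""),
    "Chemical": (True, True, "Possible hazmat storage/use at the facility"),
    "Other": (False, False, ""),
}

for energy, note in interface_hazards(answers):
    print(f"{energy}: {note}")
```

Requiring an explicit answer for every energy type is a deliberate choice in this sketch: an unanswered item is treated as an error, not as the absence of a hazard.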
Figure 1.2.3A The Interface Analysis Worksheet
Energy Element
    Kinetic (objects in motion)
    Electromagnetic (microwave, radio, laser)
    Radiation (radioactive, x-ray)
    Chemical
    Other
Personnel Element: Personnel moving from one area to another
Equipment Element: Machines and material moving from one area to another
Supply/materiel Element:
    Intentional movement from one area to another
    Unintentional movement from one area to another
Product Element: Movement of product from one area to another
Information Element: Flow of information from one area to another or interference (i.e. jamming)
Bio-material Element
    Infectious materials (virus, bacteria, etc.)
    Wildlife
    Odors
RESOURCES: Interface Analyses are best accomplished when personnel from all of the involved
activities participate, so that hazards and interfaces in both directions can be effectively and
knowledgeably addressed. A safety office representative can also be useful in advising on the types and
characteristics of energy transfers that are possible.
COMMENTS: The lessons of the past indicate that we should give serious attention to use of the
Interface Analysis. Nearly anyone who has been involved in operations for any length of time can relate
stories of overlooked interfaces that have had serious adverse consequences.
EXAMPLES: An Interface Analysis using the general outline is shown below.
Figure 1.2.3B Example Interface Analysis
SITUATION: Construction of a heavy equipment maintenance facility is planned for
the periphery of the complex at a major facility. This is a major complex
costing over $2,000,000 and requiring about eight months to complete. The objective is
to detect interface issues in both directions. Notice that the analysis reveals a variety of
interface issues that need to be thought through carefully.
Energy Interface
Movement of heavy construction equipment
Movement of heavy building supplies
Movement of heavy equipment for repair
Possible hazmat storage/use at the facility
Personnel Interface
Movement of construction personnel (vehicle or pedestrian) through base area
Movement of repair facility personnel through base area
Possible movement of base personnel (vehicular or pedestrian) near or through the facility
Equipment Interface: Movement of equipment as indicated above
Supply Interface
Possible movement of hazmat through base area
Possible movement of fuels and gases
Supply flow for maintenance area through base area
Product Interface
Movement of equipment for repair by tow truck or heavy equipment transport through the base area
Information Interface
Damage to buried or overhead wires during construction or movement of equipment
Possible Electro-magnetic interference due to maintenance testing, arcing, etc.
Biomaterial Interface: None
1.2.4 THE ACCIDENT/INCIDENT ANALYSIS
FORMAL NAME: The Accident/Incident Analysis
ALTERNATIVE NAMES: The accident analysis
PURPOSE: Most organizations have accumulated extensive, detailed databases that are gold mines of
risk data. The purpose of the analysis is to apply this data to the prevention of future accidents or
incidents.
APPLICATION: Every organization should complete an operation incident analysis annually. The
objective is to update the understanding of current trends and causal factors. The analysis should be
completed for each organizational component that is likely to have unique factors.
METHOD: The analysis can be approached in many ways. The process generally builds a database of
the factors listed below, which serves as the basis for identifying the risk drivers.
Typical factors to examine include the following:
Activity at the time of the accident
Distribution of incidents among personnel
Accident locations
Distribution of incidents by sub-unit
Patterns of unsafe acts or conditions
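A first pass at identifying risk drivers is often a simple tally of incidents by each factor. The sketch below assumes incident records are available as dictionaries; the field names and the records themselves are invented for illustration.

```python
from collections import Counter


def risk_drivers(incidents, factor, top=3):
    """Count incidents by one factor and return the most frequent values."""
    counts = Counter(record[factor] for record in incidents)
    return counts.most_common(top)


# Hypothetical records; field names and values are invented.
incidents = [
    {"activity": "towing", "location": "ramp", "sub_unit": "A"},
    {"activity": "towing", "location": "hangar", "sub_unit": "B"},
    {"activity": "refueling", "location": "ramp", "sub_unit": "A"},
    {"activity": "towing", "location": "ramp", "sub_unit": "A"},
]

print(risk_drivers(incidents, "activity"))
print(risk_drivers(incidents, "location"))
```

Raw counts do not account for exposure. A sub-unit that performs most of the work will naturally appear in most of the incidents, so counts are best read alongside activity levels.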
RESOURCES: The analysis relies upon a relatively complete and accurate database. The FAA's system
safety office (ASY) may have the needed data. That office can also provide assistance in the analysis
process. System Safety personnel may have already completed analyses of similar activities or may be
able to suggest the most productive areas for initial analysis.
COMMENTS: The data in databases has been acquired the hard way - through the painful and costly
mistakes of hundreds of individuals. By taking full advantage of this information, the analysis process can
be more realistic, efficient, and thorough, thereby preventing the same accidents and incidents from
occurring over and over again.
1.2.5 THE INTERVIEW TOOL
FORMAL NAME: The Interview Tool
ALTERNATIVE NAMES: None
PURPOSE: Often the most knowledgeable personnel in the area of risk are those who operate the
system. They see the problems and often think about potential solutions. The purpose of the Interview
Tool is to capture the experience of these personnel in ways that are efficient and positive for them.
Properly implemented, the Interview Tool can be among the most valuable hazard identification tools.
APPLICATION: Every organization can use the Interview Tool in one form or another.
METHOD: The Interview Tool's great strength is versatility. Figure 1.2.5A illustrates the many options
available to collect interview data. Key to all of these is to create a situation in which interviewees feel
free to honestly report what they know, without fear of any adverse consequences. This means absolute
confidentiality must be assured, by not using names in connection with data.
Figure 1.2.5A Interview Tool Alternatives
Direct interviews with operational personnel
Supervisors interview their subordinates and report results
Questionnaire interviews are completed and returned
Group interview sessions (several personnel at one time)
Hazards reported formally
Coworkers interview each other
RESOURCES: It is possible to operate the interview process facility-wide with the data being supplied
to individual units. Hazard interviews can also be integrated into other interview activities. For example,
counseling sessions could include a hazard interview segment. In these ways, the expertise and resource
demands of the Interview Tool can be minimized.
COMMENTS: The key source of risk is human error. Of all the hazard identification tools, the Interview Tool is potentially
the most effective at capturing human error data.
EXAMPLES: Figure 1.2.5B illustrates several variations of the Interview Tool.
Figure 1.2.5B Example Exit Interview Format
Name (optional)_____________________________ Organization _____________________
1. Describe below incidents, near misses or close calls that you have experienced or seen since
you have been in this organization. State the location and nature (i.e. what happened and why)
of the incident. If you can't think of an incident, then describe two hazards you have observed.
INCIDENT 1: Location: _____________________________________________________
What happened and why?______________________________________________________
__________________________________________________________________________
INCIDENT 2: Location: _____________________________________________________
What happened and why?______________________________________________________
__________________________________________________________________________
2. What do you think other personnel can do to eliminate these problems?
Personnel: _________________________________________________________________
Incident 1__________________________________________________________________
Incident 2__________________________________________________________________
Supervisors: _______________________________________________________________
Incident 1__________________________________________________________________
Incident 2__________________________________________________________________
Top Leadership: ___________________________________________________________
Incident 1__________________________________________________________________
Incident 2__________________________________________________________________
1.2.6 THE INSPECTION TOOL
FORMAL NAME: The Inspection Tool
ALTERNATIVE NAMES: The survey tool
PURPOSE: Inspections have two primary purposes. (1) The detection of hazards. Inspections
accomplish this through the direct observation of operations. The process is aided by the existence of
detailed standards against which operations can be compared. The OSHA standards and various national
standards organizations provide good examples. (2) To evaluate the degree of compliance with
established risk controls. When inspections are targeted at management and safety management
processes, they are usually called surveys. These surveys assess the effectiveness of management
procedures by evaluating status against some survey criteria or standard. Inspections are also important
as accountability tools and can be turned into important training opportunities.
APPLICATION: Inspections and surveys are used in the risk management process in much the same
manner as in traditional safety programs. Where the traditional approach may require that all facilities are
inspected on the same frequency schedule, the ORM concept might dictate that high-risk activities be
inspected ten times or more frequently than lower risk operations, and that some of the lowest risk
operations be inspected once every five years or so. The degree of risk drives the frequency and depth of
the inspections and surveys.
METHOD: There are many methods of conducting inspections. From a risk management point of view
the key is focusing upon what will be inspected. The first step in effective inspections is the selection of
inspection criteria and the development of a checklist or protocol. This must be risk-based. Commercial
protocols are available that contain criteria validated to be connected with safety excellence.
Alternatively, excellent criteria can be developed using incident databases and the results of other hazard
identification tools such as the Operations Analysis and Logic Diagrams, etc. Some of these have been
computerized to facilitate entry and processing of data. Once criteria are developed, a schedule is created
and inspections are begun. The inspection itself must be as positive an experience as possible for the
people whose activity is being inspected. Personnel performing inspections should be carefully trained,
not only in the technical processes involved, but also in human relations. During inspections, the ORM
concept encourages another departure from traditional inspection practices: recording every observation
of a key behavior, safe or unsafe, rather than only the deficiencies. This makes it possible to
evaluate the trend in organizational performance by calculating the percentage of unsafe (non-standard)
versus safe (meets or exceeds standard) observations. Once the observations are made, the data must be
carefully entered in the overall hazard inventory database. Once in the database the data can be analyzed
as part of the overall body of data or as a mini-database composed of inspection findings only.
RESOURCES: There are many inspection criteria, checklists and related job aids available
commercially. Many have been tailored for specific types of organizations and activities. The System
Safety Office can be a valuable resource in the development of criteria and can provide technical support
in the form of interpretations, procedural guidance, and correlation of data.
COMMENTS: Inspections and surveys have long track records of success in detecting hazards and
reducing risk. However, they have been criticized as being inconsistent with modern management practice
because they are a form of "downstream" quality control. By the time a hazard is detected by an
inspection, it may already have caused loss. The ORM approach to inspections emphasizes focus on the
higher risks within the organization and emphasizes the use of management and safety program surveys
that detect the underlying causes of hazards, rather than the hazards themselves.
EXAMPLES: Conventional inspections normally involve seeking and recording unsafe acts or
conditions. The number of these may reflect either the number of unsafe acts or conditions occurring in
the organization or the extent of the effort expended to find hazards. Thus, conventional inspections are
not a reliable indicator of the extent of risk. To change the nature of the process, it is often only necessary
to record the total number of observations made of key behaviors, then determine the number of unsafe
behaviors. This yields a rate of "unsafeness" that is independent of the number of observations made.
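The rate described above can be sketched in a few lines of Python. This is an illustration only: the function name and the observation counts are hypothetical and do not come from the handbook.

```python
def unsafe_rate(total_observations: int, unsafe_observations: int) -> float:
    """Return the fraction of observed key behaviors that were unsafe."""
    if total_observations <= 0:
        raise ValueError("total_observations must be positive")
    if not 0 <= unsafe_observations <= total_observations:
        raise ValueError("unsafe_observations must be between 0 and total_observations")
    return unsafe_observations / total_observations


# Hypothetical counts: a brief inspection and a much more thorough one.
brief = unsafe_rate(total_observations=40, unsafe_observations=6)
thorough = unsafe_rate(total_observations=400, unsafe_observations=60)

print(f"Brief inspection:    {brief:.0%} unsafe")
print(f"Thorough inspection: {thorough:.0%} unsafe")
```

In this made-up case the thorough inspection records ten times as many unsafe behaviors, yet both inspections yield the same rate, which is the point of normalizing by the total number of observations.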
1.2.7 THE JOB HAZARD ANALYSIS
FORMAL NAME: The Job Hazard Analysis
ALTERNATIVE NAMES: The task analysis, job safety analysis, JHA, JSA
PURPOSE: The purpose of the Job Hazard Analysis (JHA) is to examine in detail the safety
considerations of a single job. A variation of the JHA called a task analysis focuses on a single task, i.e.,
some smaller segment of a "job."
APPLICATION: Some organizations have established the goal of completing a JHA on every job in the
organization. If this can be accomplished cost effectively, it is worthwhile. Certainly, the higher risk jobs
in an organization warrant application of the JHA procedure. Within the risk management approach, it is
important that such a plan be accomplished by beginning with the most significant risk areas first.
The JHA is best accomplished using an outline similar to the one illustrated at Figure 1.2.7A. As shown
in the illustration, the job is broken down into its individual steps. Jobs that involve many quite different
tasks should be handled by analyzing each major task separately. The illustration considers risks both to
the workers involved and to the system, as well as risk controls for both. Tools such as the Scenario
and "What If" tools can contribute to the identification of potential hazards. There are two alternative
ways to accomplish the JHA process. A safety professional can complete the process by asking questions
of the workers and supervisors involved. Alternatively, supervisors could be trained in the JHA process
and directed to analyze the jobs they supervise.
Figure 1.2.7A Sample Job Hazard Analysis Format
Job Safety Analysis | Job Title or Operation | Page ___ of ___ | JSA Number
Job Series/AFSC | Supervisor
Organization Symbol | Location/Building Number
Shop Title | Reviewed By
Required and/or Recommended Personal Protective Equipment | Approved By
SEQUENCE OF BASIC JOB STEPS | POTENTIAL HAZARDS: UNSAFE ACTS OR CONDITIONS | RECOMMENDED ACTION OR PROCEDURE
RESOURCES: The System Safety Office has personnel trained in detail in the JHA process who can
serve as consultants, and may have videos that walk a person through the process.
COMMENTS: The JHA is risk management. The concept of completing in-depth hazard assessments of
all jobs involving significant risk with the active participation of the personnel doing the work is an ideal
model of ORM in action.
1.2.8 THE OPPORTUNITY ASSESSMENT
FORMAL NAME: The Opportunity Assessment
ALTERNATIVE NAMES: The opportunity-risk tool
PURPOSE: The Opportunity Assessment is intended to identify opportunities to expand the capabilities
of the organization and/or to significantly reduce the operational cost of risk control procedures. Either of
these possibilities means expanded capabilities.
APPLICATION: Organizations should systematically assess their capabilities on a regular basis,
especially in critical areas. The Opportunity Assessment can be one of the most useful tools in this
process and therefore should be completed on all important operations and then be periodically updated.
METHOD: The Opportunity Assessment involves five key steps as outlined at Figure 1.2.9A. In Step
1, operational areas that would benefit substantially from expanded capabilities are identified and
prioritized. Additionally, areas where risk controls are consuming extensive resources or are otherwise
constraining operation capabilities are listed and prioritized. Step 2 involves the analysis of the specific
risk-related barriers that are limiting the desired expanded performance or causing the significant expense.
This is a critical step. Only by identifying the risk issues precisely can focused effort be brought to bear
to overcome them. Step 3 attacks the barriers by using the risk management process. This normally
involves reassessment of the hazards, application of improved risk controls, improved implementation of
existing controls, or a combination of these options. Step 4 is used when available risk management
procedures don't appear to offer any breakthrough possibilities. In these cases the organization must seek
out new ORM tools using benchmarking procedures or, if necessary, innovate new procedures. Step 5 involves the
exploitation of any breakthroughs achieved by pushing the operational limits or cost saving until a new
barrier is reached. The cycle then repeats and a process of continuous improvement begins.
Figure 1.2.9A Opportunity Analysis Steps
Step 1. Review key operations to identify opportunities for enhancement. Prioritize.
Step 2. In areas where opportunities exist, analyze for risk barriers.
Step 3. When barriers are found, apply the ORM process.
Step 4. When available ORM processes can't break through, innovate!
Step 5. When a barrier is breached, push through until a new barrier is reached.
RESOURCES: The Opportunity Assessment depends upon a detailed understanding of operational
processes so that barriers can be identified. An effective Opportunity Assessment will necessarily involve
operations experts.
1.3 THE ADVANCED HAZARD IDENTIFICATION TOOLS
The five tools that follow are advanced hazard identification tools designed to support strategic hazard
analysis of higher risk and critical operations. These advanced tools are often essential when in-depth
hazard identification is needed. They provide the mechanism needed to push the limits of current hazard
identification technology. For example, the Management Oversight and Risk Tree (MORT) represents the
full-time efforts of dozens of experts over decades to fully develop an understanding of all of the sources
of hazards.
As might be expected, these tools are complex and require significant training to use. Full proficiency
also requires experience in using them. They are best reserved for use by loss control professionals.
Those with an engineering, scientific, or other technical background are certainly capable of using these
tools with a little read-in. Even though professionals use the tools, much of the data that must be fed into
the procedures must come from operators.
In an organization with a mature ORM culture, all personnel in the organization will be aware that higher
risk justifies more extensive hazard identification. They will feel comfortable calling for help from loss
control professionals, knowing that these individuals have the advanced tools needed to cope with the
most serious situations. These advanced tools will play a key role in the mature ORM culture in helping
the organization reach its hazard identification goal: No significant hazard undetected.
1.3.1 THE ENERGY TRACE AND BARRIER ANALYSIS
FORMAL NAME: The Energy Trace and Barrier Analysis
ALTERNATIVE NAMES: Abnormal energy exchange
PURPOSE: The Energy Trace and Barrier Analysis (ETBA) is a procedure intended to detect hazards
by focusing in detail on the presence of energy in a system and the barriers for controlling that energy. It
is conceptually similar to the Interface Analysis in its focus on energy forms, but is considerably more
thorough and systematic.
APPLICATION: The ETBA is intended for use by loss control or system safety professionals and is targeted
against higher risk operations, especially those involving large amounts of energy or a wide variety of
energy types. The method is used extensively in the acquisition of new systems and other complex
systems.
METHOD: The ETBA involves 5 basic steps as shown at Figure 1.3.1A.
Step 1 is the identification of the types of energy found in the system. It often requires considerable
expertise to detect the presence of the types of energy listed at Figure 1.3.1B.
Step 2 is the trace step. Once identified as present, the point of origin of a particular type of energy must
be determined and then the flow of that energy through the system must be traced.
In Step 3 the barriers to the unwanted release of that energy must be analyzed. For example, electrical
energy is usually moved in wires with an insulated covering.
In Step 4 the risk of barrier failure and the unwanted release of the energy are assessed. Finally, in Step 5,
risk control options are considered and selected.
Figure 1.3.1A ETBA Steps
Step 1. Identify the types of energy present in the system
Step 2. Locate energy origin and trace the flow
Step 3. Identify and evaluate barriers (mechanisms to confine the energy)
Step 4. Determine the risk (the potential for hazardous energy to escape control and damage something significant)
Step 5. Develop improved controls and implement as appropriate
Figure 1.3.1B Types of Energy
Electrical
Kinetic (moving mass, e.g., a vehicle, a machine part, a bullet)
Potential (stationary mass, e.g., a heavy object suspended overhead)
Chemical (e.g., explosives, corrosive materials)
Noise and Vibration
Thermal (heat)
Radiation (non-ionizing, e.g., microwave, and ionizing, e.g., nuclear radiation, x-rays)
Pressure (air, hydraulic, water)
RESOURCES: This tool requires sophisticated understanding of the technical characteristics of systems
and of the various energy types and barriers. Availability of a safety professional, especially a safety
engineer or other professional engineer, is important.
COMMENTS: Most accidents involve the unwanted release of one kind of energy or another. This fact
makes the ETBA a powerful hazard identification tool. When the risk stakes are high and the system is
complex, the ETBA is a must-have.
EXAMPLES: A simplified example of the ETBA procedure is provided at Figure 1.3.1C.
Figure 1.3.1C Example ETBA
Scenario: The supervisor of a maintenance facility has just investigated a serious incident
involving one of his personnel who received a serious shock while using a portable power drill in
the maintenance area. The tool involved used a standard three-prong plug. Investigation revealed
that the tool and the receptacle were both functioning properly. The individual was shocked when
he was holding the tool and made contact with a piece of metal electrical conduit (the one his drill
was plugged into) that had become energized as a result of an internal fault. As a result, the
current flowed through the individual to the tool and through the grounded tool to ground, resulting
in the severe shock. The supervisor decides to fully assess the control of electrical energy in this
area.
Option 1. Three-prong tool. Electrical energy flows from the source through an insulated
wire, to the tool, to a single-insulated electric motor. In the event of an internal fault, the flow is
from the case of the tool through the ground wire to ground, by way of the grounded third prong
and a properly grounded receptacle.
Hazards: Receptacle not properly grounded, third prong removed, person provides a path of lower
resistance, break in any of the ground paths (case, cord, plug, and receptacle). These hazards are
serious in terms of the frequency encountered in the work environment and might be expected to
be present in 10% or more of cases.
Option 2. Double-insulated tool. The tool is not grounded. Protection is provided by double
insulating the complete flow of electrical energy at all points in the tool. In the event of an internal
fault, there are two layers of insulation protection between the fault and the person, preventing
shorting through the user.
Hazards: If the double layers of insulation are damaged as a result of extended use, rough
handling, or repair/maintenance activity, the double insulation barrier can be compromised. In the
absence of a fully effective tool inspection and replacement program, such damage is not an
unusual situation.
Option 3. Ground Fault Circuit Interrupters. Either of the above types of tools is used
(double insulated is preferred). Electrical energy flows as described above in both the normal and
fault situations. However, in the event of a fault (or any other cause of a differential between the
potential of a circuit), it is detected almost instantly and the circuit is opened, preventing the flow
of dangerous amounts of current. Because no dangerous amount of current can flow, the individual
using the tool is in no danger of shock. Circuit interrupters are reliable at a level of 1 in 10,000 or
higher and when they do fail, most failure modes are in the fail-safe mode. Ground fault circuit
interrupters are inexpensive to purchase and relatively easy to install. In this case, the best
option is very likely to be the use of the circuit interrupter in connection with either Option 1 or 2,
with 2 preferred. This combination for all practical purposes eliminates the possibility of
electric shock and injury/death as a result of using portable power tools.
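The benefit of layering the barriers can be made concrete with rough arithmetic. The sketch below is an illustration under stated assumptions, not part of the handbook's analysis: it treats the two figures quoted in the example as point estimates, reads "1 in 10,000" as the chance that the interrupter fails to open the circuit, and assumes the two failures are independent. All variable names are hypothetical.

```python
# Figures quoted in the example, treated as point estimates (an assumption:
# the text gives them as "10% or more" and "1 in 10,000 or higher").
p_ground_path_defect = 0.10        # an Option 1 grounding hazard is present
p_interrupter_fails = 1 / 10_000   # interrupter does not open the circuit

# Assumes the two failures are independent, which the source does not state.
p_both_barriers_fail = p_ground_path_defect * p_interrupter_fails
improvement_factor = p_ground_path_defect / p_both_barriers_fail

print(f"Grounding alone defeated:     {p_ground_path_defect:.0e}")
print(f"Grounding and interrupter:    {p_both_barriers_fail:.0e}")
print(f"Approximate improvement:      {improvement_factor:,.0f}x")
```

Under these assumptions the combined barrier is defeated roughly once in 100,000 exposures, which is consistent with the example's conclusion that the combination eliminates the shock hazard for practical purposes.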
1.3.2 THE FAULT TREE ANALYSIS
FORMAL NAME: The Fault Tree Analysis
ALTERNATIVE NAMES: The logic tree
PURPOSE: The Fault Tree Analysis (FTA) is a hazard identification tool based on the negative type
Logic Diagram. The FTA adds several dimensions to the basic logic tree. The most important of these
additions are the use of symbols to add information to the trees and the possibility of adding quantitative
risk data to the diagrams. With these additions, the FTA adds substantial hazard identification value to
the basic Logic Diagram previously discussed.
APPLICATION: Because of its relative complexity and detail, it is normally not cost effective to use the
FTA against risks assessed below the level of extremely high or high. The method is used extensively in
the acquisition of new systems and other complex systems where, due to the complexity and criticality of
the system, the tool is a must.
METHOD: The FTA is constructed exactly like a negative Logic Diagram except that the symbols
depicted in Figure 1.3.2A are used.
Figure 1.3.2A Key Fault Tree Analysis Symbols
The output event. Identification of a particular event in the sequence of an operation.
A basic event. An event, usually a malfunction, for which further causes are not normally sought.
A normal event. An event in an operational sequence that is within expected performance standards.
An "AND" gate. Requires all of the below connected events to occur before the above connected event can occur.
An "OR" gate. Any one of the events can independently cause the event placed above the OR gate.
An undeveloped event. This is an event not developed because of lack of information or because the event lacks significance.
Transfer symbols. These symbols transfer the user to another part of the diagram. These symbols are used to
eliminate the need to repeat identical analyses that have been completed in connection with another part
of the fault tree.
RESOURCES: The System Safety Office is the best source of information regarding Fault Tree
Analysis. Like the other advanced tools, the FTA will involve the consultation of a safety professional or
engineer trained in the use of the tool. If the probabilistic aspects are added, it will also require a database
capable of supplying the detailed data needed.
COMMENTS: The FTA is one of the few hazard identification procedures that will support
quantification when the necessary data resources are available.
EXAMPLE: A brief example of the FTA is provided at Figure 1.3.2B. It illustrates how an event may be
traced to specific causes that can be very precisely identified at the lowest levels.
Figure 1.3.2B Example of Fault Tree Analysis
Fire Occurs in Storeroom (AND gate: all of the following)
    Combustibles Stored in Storeroom (OR gate: any of the following)
        Stock Material Degrades to Combustible State
        Combustibles Leak into Storeroom
        Combustibles Stored in Storeroom
    Ignition Source in Storeroom (OR gate: any of the following)
        Electrical Spark Occurs
        Direct Thermal Energy Present
        Radiant Thermal Energy Raises
    Airflow < Critical Value
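When probability data are available, the AND and OR gates of a fault tree can be evaluated numerically. The following is a minimal sketch, not the handbook's method: it assumes all basic events are independent, the grouping of events under each gate is one reading of Figure 1.3.2B, and every probability value is invented for illustration. The `Event` class and `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Event:
    name: str
    gate: str = "BASIC"            # "AND", "OR", or "BASIC"
    probability: float = 0.0       # used only when gate == "BASIC"
    inputs: List["Event"] = field(default_factory=list)


def evaluate(event: Event) -> float:
    """Return the probability of an event, assuming independent inputs."""
    if event.gate == "BASIC":
        return event.probability
    child_probs = [evaluate(child) for child in event.inputs]
    if event.gate == "AND":
        # All inputs must occur.
        return prod(child_probs)
    if event.gate == "OR":
        # At least one input occurs.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"Unknown gate type: {event.gate}")


# Hypothetical probabilities; none of these values come from the handbook.
combustibles = Event("Combustibles in storeroom", "OR", inputs=[
    Event("Stock material degrades", probability=0.01),
    Event("Combustibles leak in", probability=0.02),
    Event("Combustibles stored", probability=0.05),
])
ignition = Event("Ignition source in storeroom", "OR", inputs=[
    Event("Electrical spark", probability=0.001),
    Event("Direct thermal energy", probability=0.002),
    Event("Radiant thermal energy", probability=0.001),
])
airflow = Event("Airflow below critical value", probability=0.5)
fire = Event("Fire occurs in storeroom", "AND",
             inputs=[combustibles, ignition, airflow])

print(f"P(fire) = {evaluate(fire):.2e}")
```

The AND gate multiplies the input probabilities, so removing any one input (for example, keeping combustibles out of the storeroom) drives the top event toward zero, while an OR gate is dominated by its most likely input.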
1.3.3 THE FAILURE MODES AND EFFECTS ANALYSIS
FORMAL NAME: The Failure Modes and Effects Analysis
ALTERNATIVE NAMES: The FMEA
PURPOSE: The Failure Modes and Effects Analysis (FMEA) is designed to evaluate the impact due to
the failure of various system components. A brief example of FMEA illustrating this purpose is the
analysis of the impact of the failure of the communications component (radio, landline, computer, etc.) of
a system on the overall operation. The focus of the FMEA is on how such a failure could occur (failure
mode) and the impact of such a failure (effects).
APPLICATION: The FMEA is generally regarded as a reliability tool but most operational personnel
can use the tool effectively. The FMEA can be thought of as a more detailed "What If" analysis. It is
especially useful in contingency planning, where it is used to evaluate the impact of various possible
failures (contingencies). The FMEA can be used in place of the "What If" analysis when greater detail is
needed or it can be used to examine the impact of hazards developed using the "What If" tool in much
greater detail.
METHOD: The FMEA uses a worksheet similar to the one illustrated at Figure 1.3.3A. As noted on the
sample worksheet, a specific component of the system to be analyzed is identified. Several components
can be analyzed. The ways in which each component could fail (its failure modes) are then listed. For
example, a rotating part might freeze up, explode, break up, slow down, or even
reverse direction. Each of these failure modes may have differing impacts on connected components and
the overall system. The worksheet calls for an assessment of the probability of each identified failure
mode.
Figure 1.3.3A Sample Failure Modes and Effects Analysis Worksheet
FAILURE MODES AND EFFECTS ANALYSIS
Page ___ of ___ Pages
System_________________________ Date_______________
Subsystem _____________________ Analyst_____________
Component Description | Failure Mode | Effects on Other Components | Effects on System | RAC or Hazard Category | Failure Frequency | Effects Probability | Remarks
RESOURCES: The best source of more detailed information on the FMEA is the System Safety Office.
EXAMPLES: An example of the FMEA is provided at Figure 1.3.3B.
Figure 1.3.3B Example FMEA
Situation: The manager of a major facility is concerned about the possible impact of the failure of the
landline communications system that provides the sole communications capability at the site. The
decision is made to do a Failure Modes and Effects Analysis. An extract from the resulting FMEA is
shown below.
Component | Function | Failure Mode & Cause | Failure Effect on Higher Item | System | Probability | Corrective Action
Landline Wire | Comm | Cut - natural cause, falling tree, etc. | Comm system down | Cease Fire | Probable | Clear natural obstacle from around wires
Wire | | Cut - unrelated operational activities | Comm system down | Cease Fire | Probable | Warn all operations; placement of wire
Wire | | Line failure | Comm system down | Cease Fire | Probable | Placement of wires; proper grounding
Wire | | Cut - vandals & thieves | Comm system down | Cease Fire | Unlikely | Placement of wires; area security
1.3.4 THE MULTI-LINEAR EVENTS SEQUENCING TOOL
FORMAL NAME: The Multi-linear Events Sequencing Tool
ALTERNATIVE NAMES: The timeline tool, the sequential time event plot (STEP) (see footnote 2)
PURPOSE: The Multi-linear Events Sequencing Tool (MES) is a specialized hazard identification
procedure designed to detect hazards arising from the time relationship of various operational activities.
The MES detects situations in which either the absolute or relative timing of events may create risk. For
example, an operational planner may have crammed too many events into a single period of time, creating
a task overload problem for the personnel involved. Alternatively, the MES may reveal that two or more
events in an operational plan conflict because a person or piece of equipment is required for both but
obviously cannot be in two places at once. The MES can be used as a hazard identification tool or as an
incident investigation tool.
APPLICATION: The MES is usually considered a loss prevention method, but the MES worksheet
simplifies the process to the point that a motivated individual can effectively use it. The MES should be
used any time that risk levels are significant and when timing and/or time relationships may be a source of
risk. It is an essential tool when the time relationships are relatively complex.
METHOD: The MES uses a worksheet similar to the one illustrated at Figure 1.3.4A. The sample
worksheet displays the timeline of the operation across the top and the "actors" (people or things) down
the left side. The flow of events is displayed on the worksheet, showing the relationship between the
actors on a time basis. Once the operation is displayed on the worksheet, the sources of risk will be
evident as the flow is examined.
2 K. Hendrick and L. Benner, Investigating Accidents with STEP, Marcel Dekker, New York, 1988.
Figure 1.3.4A Multi-linear Events Sequencing Form
Actors (people or things involved in the process) | Timeline (time units in seconds or minutes as needed)
RESOURCES: The best source of more detailed information on the MES is the System Safety staff.
As with the other advanced tools, using the MES will normally involve consultation with a safety
professional familiar with its application.
COMMENTS: The MES is unique in its role of examining the time-risk implications of operations.
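One of the timing hazards the MES is meant to expose, a person or piece of equipment required by two events at once, can be checked mechanically once the worksheet is recorded as data. The sketch below is illustrative only: the `Activity` record, the `find_conflicts` function, and the sample plan are hypothetical and are not part of the MES worksheet itself.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Activity:
    name: str
    actor: str   # person or thing required by the activity
    start: int   # minutes from the start of the operation
    end: int     # minutes from the start of the operation


def find_conflicts(activities):
    """Return pairs of activity names that need the same actor at overlapping times."""
    conflicts = []
    for first, second in combinations(activities, 2):
        same_actor = first.actor == second.actor
        # Two intervals overlap when each one starts before the other ends.
        overlapping = first.start < second.end and second.start < first.end
        if same_actor and overlapping:
            conflicts.append((first.name, second.name))
    return conflicts


# Hypothetical operational plan.
plan = [
    Activity("Refuel vehicle", "Technician 1", start=0, end=20),
    Activity("Inspect generator", "Technician 1", start=15, end=30),
    Activity("Brief crew", "Supervisor", start=0, end=10),
    Activity("Move forklift", "Forklift", start=20, end=35),
]

for first, second in find_conflicts(plan):
    print(f"Conflict: '{first}' and '{second}' need the same actor at the same time")
```

In this made-up plan, Technician 1 is scheduled for two tasks between minutes 15 and 20, which is the kind of conflict that becomes visible when the actors and the timeline are laid out together.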
1.3.5 THE MANAGEMENT OVERSIGHT AND RISK TREE
FORMAL NAME: The Management Oversight and Risk Tree
ALTERNATIVE NAMES: The MORT
PURPOSE: The Management Oversight and Risk Tree (MORT) uses a series of charts developed and
perfected over several years by the Department of Energy in connection with their nuclear safety
programs. Each chart identifies a potential operating or management level hazard that might be present in
an operation. The attention to detail characteristic of MORT is illustrated by the fact that the full MORT
diagram or tree contains more than 10,000 blocks. Even the simplest MORT chart contains over 300
blocks. The full application of MORT is a time-consuming and costly venture. The basic MORT chart
with about 300 blocks can be routinely used as a check on the other hazard identification tools. By
reviewing the major headings of the MORT chart, an analyst will often be reminded of a type of hazard
that was overlooked in the initial analysis. The MORT diagram is also very effective in assuring attention
to the underlying management root causes of hazards.
APPLICATION: Full application of MORT is reserved for the highest risks and most operation-critical
activities because of the time and expense required. MORT generally requires a specially trained loss
control professional to assure proper application.
METHOD: MORT is accomplished using the MORT diagrams, of which there are several levels
available. The most comprehensive, with about 10,000 blocks, fills a book. There is an intermediate
diagram with about 1500 blocks, and a basic diagram with about 300. It is possible to tailor a MORT
diagram by choosing various branches of the tree and using only those segments. The MORT is
essentially a negative tree, so the process begins by placing an undesired loss event at the top of the
diagram used. The user then systematically responds to the issues posed by the diagram. All aspects of
the diagram are considered and the "less than adequate" blocks are highlighted for risk control action.
RESOURCES: The best source of information on MORT is the System Safety Office.
COMMENTS: The MORT diagram is an elaborate negative Logic Diagram. The difference is primarily
that the MORT diagram is already filled out for the user, allowing a person to identify the contributory
factors for a given undesirable event. Since the MORT is very detailed, as mentioned above, a person can
identify basic causes for essentially any type of event.
EXAMPLES: The top blocks of the MORT diagram are displayed at Figure 1.3.5A.
Figure 1.3.5A Example MORT Section
Accidental Losses
    Oversights & Omissions
        Operational System Factors LTA
        Management System Factors LTA
    Assumed Risk
2.0 RISK ASSESSMENT TOOLS, DETAILS, AND EXAMPLES
Introduction. This section contains an example of assessing risk, using a risk assessment matrix (Figure
2.1A). The easiest way to understand the application of the matrix is to apply it. The reasoning used in
constructing the matrix in the example below is provided.
Example. The example below demonstrates the application of the matrix to the risk associated with
moving a heavy piece of machinery.
Risk to be assessed: The risk of the machine falling over and injuring personnel.
Probability assessment: The following paragraphs illustrate the thinking process that might be followed in
developing the probability segment of the risk assessment:
Use previous experience and the database, if available. "We moved a similar machine once before and
although it did not fall over, there were some close calls. This machine is not as easy to secure as that
machine and has a higher center of gravity and poses an even greater chance of falling. The base safety
office indicates that there was an accident about 18 months ago that involved a similar operation. An
individual received a broken leg in that case."
Use the output of the hazard analysis process. "Our hazard analysis shows that there are several steps in
the machine movement process where the machine is vulnerable to falling. Furthermore, there are several
different types of contributory hazards that could cause the machine to fall. Both these factors increase
the probability of falling."
Consider expert opinion. "My experienced manager feels that there is a real danger of the machine
falling."
Consider your own intuition and judgment. "My gut feeling is that there is a real possibility we could lose
control of this machine and topple it. The fact that we rarely move machines quite like this one increases
the probability of trouble."
Refer to the matrix terms. "Hmmm, the decision seems to be between likely and occasional. I understand
likely to mean that the machine is likely to fall, meaning a pretty high probability. Certainly there is a real
chance it may fall, but if we are careful, there should be no problem. I am going to select Occasional as
the best option from the matrix."
Severity assessment. The following illustrates the thinking process that might occur in selecting the
severity portion of the risk assessment matrix for the machine falling risk:
Identify likely outcomes. "If the machine falls, it will crush whatever it lands on. Such an injury will
almost certainly be severe. Because of the height of the machine, it can easily fall on a person's head and
body with almost certain fatal results. There are also a variety of different crushing injuries, especially of
the feet, even if the machine falls only a short distance."
Identify the most likely outcomes. "Because of the weight of the machine, a severe injury is almost
certain. Because people are fairly agile and the falling machine gives a little warning that it is
falling, death is not likely."
Consider factors other than injuries. "We identified several equipment and facility items at risk. Most of
these we have guarded, but some are still vulnerable. If the machine falls nobody can do anything to
protect these items. It would take a couple of days at least to get us back in full production."
Refer to the matrix (see Figure 2.1A). "Let's see, any injury is likely to be severe, but a fatality is not
very probable. Property damage could be expensive and could cost us a lot of production time.
Considering both factors, I think that critical is the best choice."
Combine probability and severity in the matrix. The thinking process should be as follows:
"The probability category occasional is in the middle of the matrix (refer to the matrix below). I go down
until it meets the critical category coming from the left side. The result is a high rating. I notice that it is
among the lower high ratings but it is still high."
Figure 2.1A Risk Assessment Matrix
Probability (columns): A Frequent, B Likely, C Occasional, D Seldom, E Unlikely
Severity (rows): I Catastrophic, II Critical, III Moderate, IV Negligible
Risk Levels (cell values): Extremely High, High, Medium, Low
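The matrix lookup walked through above can be expressed as a small table in code. This is a sketch under a stated assumption: the individual cell values below follow a commonly published ORM matrix layout and are not taken from this figure, whose cells are not legible here. Only the Occasional/Critical cell (High) is confirmed by the worked example. The names `RISK_MATRIX` and `risk_level` are hypothetical.

```python
# Probability categories in column order A through E.
PROBABILITY = ["Frequent", "Likely", "Occasional", "Seldom", "Unlikely"]

# ASSUMED cell values (commonly published ORM layout); only
# Critical/Occasional = High is confirmed by the worked example.
RISK_MATRIX = {
    "Catastrophic": ["Extremely High", "Extremely High", "High", "High", "Medium"],
    "Critical":     ["Extremely High", "High", "High", "Medium", "Low"],
    "Moderate":     ["High", "Medium", "Medium", "Low", "Low"],
    "Negligible":   ["Medium", "Low", "Low", "Low", "Low"],
}


def risk_level(severity: str, probability: str) -> str:
    """Look up the risk level for a severity row and probability column."""
    if severity not in RISK_MATRIX:
        raise ValueError(f"Unknown severity: {severity}")
    if probability not in PROBABILITY:
        raise ValueError(f"Unknown probability: {probability}")
    return RISK_MATRIX[severity][PROBABILITY.index(probability)]


# The machine-moving example: Occasional probability, Critical severity.
print(risk_level("Critical", "Occasional"))
```

Encoding the matrix this way does nothing to reduce the subjectivity of choosing the severity and probability categories; it only makes the final lookup repeatable.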
Limitations and concerns with the use of the matrix. As you followed the scenario above, you may have
noted that there are some problems involved in using the matrix. These include the following:
Subjectivity. There are at least two dimensions of subjectivity involved in the use of the matrix. The first
is in the interpretation of the matrix categories. Your interpretation of the term "critical" may be quite
different from mine. The second is in the interpretation of the risk. If a few weeks ago I saw a machine
much like the one to be moved fall over and crush a person to death, I might have a greater tendency to
rate both the probability and severity higher than someone who did not have such an experience. If time
and resources permit, averaging the ratings of several personnel can reduce this variation.
Inconsistency. The subjectivity described above naturally leads to some inconsistency. A risk rated very
high in one organization may only have a high rating in another. This becomes a real problem if the two
risks are competing for a limited pot of risk control resources (as they always are). There will be real
motivation to inflate risk assessments to enhance competitiveness for limited resources.
3.0 RISK CONTROL OPTION ANALYSIS TOOLS, DETAILS, AND EXAMPLES
3.1 BASIC RISK CONTROL OPTIONS
Major risk control options and examples of each are as follows:
Reject a risk. We can and should refuse to take a risk if the overall costs of the risk exceed its benefits.
For example, a planner may review the risks associated with a particular operation or task. After
assessing all the advantages and evaluating the increased risk associated with it, even after application of
all available risk controls, the planner decides that the benefits do not outweigh the expected risk costs
and that it is better in the long run not to do the operation or task.
Avoiding risk altogether requires canceling or delaying the job or operation, but it is an option that is
rarely exercised due to operational importance. However, it may be possible to avoid specific risks: risks
associated with a night operation may be avoided by planning the operation for daytime; likewise,
thunderstorms can be avoided by changing the route of flight.
Delaying a risk. It may be possible to delay a risk. If there is no time deadline or other operational benefit
to speedy accomplishment of a risky task, then it is often desirable to delay the acceptance of the risk.
During the delay, the situation may change and the requirement to accept the risk may go away. During
the delay additional risk control options may become available for one reason or another (resources
become available, new technology becomes available, etc.) thereby reducing the overall risk.
Risk transference does not change probability or severity of the risk, but it may decrease the probability
or severity of the risk actually experienced by the individual or organization accomplishing the activity.
As a minimum, the risk to the original individual or organization is greatly decreased or eliminated
because the possible losses or costs are shifted to another entity.
Risk is commonly spread out by either increasing the exposure distance or by lengthening the time
between exposure events. Aircraft may be parked so that an explosion or fire in one aircraft will not
propagate to others. Risk may also be spread over a group of personnel by rotating the personnel involved
in a high-risk operation.
Compensate for a risk. We can create a redundant capability in certain special circumstances. Flight
control redundancy is an example of an engineering or design redundancy. Another example is to plan for
a backup, so that when a critical piece of equipment or other asset is damaged or destroyed we have
capabilities available to bring on line to continue the operation.
Risk can be reduced. The overall goal of risk management is to plan operations or design systems that do
not contain hazards and risks. However, the nature of most complex operations and systems makes it
impossible or impractical to design them completely risk-free. As hazard analyses are performed, hazards
will be identified that will require resolution. To be effective, risk management strategies must address the
components of risk: probability, severity, or exposure. A proven order of precedence for dealing with
risks and reducing the resulting risks is:
Plan or Design for Minimum Risk. From the first, plan the operation or design the system to eliminate risks.
Without hazards there is no probability, severity or exposure. If an identified risk cannot be eliminated,
reduce the associated risk to an acceptable level. For example, flight control components can be designed
so that they cannot be incorrectly connected during maintenance operations.
Incorporate Safety Devices. If identified hazards cannot be eliminated or their associated risk adequately
reduced by modifying the operation or system elements or their inputs, that risk should be reduced to an
acceptable level through the use of safety design features or devices. Safety devices can affect probability
and reduce severity: an automobile seat belt doesn't prevent a collision but reduces the severity of
injuries.
Provide Warning Devices. When planning, system design, and safety devices cannot effectively eliminate
identified hazards or adequately reduce associated risk, warning devices should be used to detect the
condition and alert personnel to the hazard. As an example, aircraft could be retrofitted with a low
altitude ground collision warning system to reduce controlled flight into the ground risks. Warning
signals and their application should be designed to minimize the probability of incorrect personnel
reaction to the signals and should be standardized. Flashing red lights or sirens are common warning
devices that most people understand.
Develop Procedures and Training. Where it is impractical to eliminate hazards through design selection or
adequately reduce the associated risk with safety and warning devices, procedures and training should be
used. A warning system by itself may not be effective without training or procedures required to respond
to the hazardous condition. The greater the human contribution to the functioning of the system or
involvement in the operational process, the greater the chance for variability. However, if the system is
well designed and the operation well planned, the only remaining risk reduction strategies may be
procedures and training. Emergency procedure training and disaster preparedness exercises improve
human response to hazardous situations.
In most cases it will not be possible to eliminate safety risk entirely, but it will be possible to significantly
reduce it. There are many risk reduction options available. Examples are included in the next section.
3.1.1 THE RISK CONTROL OPTIONS MATRIX
The sample risk control options matrix, illustrated at Figure 3.1.1A, is designed to develop a detailed and
comprehensive list of risk control options. These options are listed in priority order of preference, all
things being equal; therefore, start at the top and consider each option in turn. Add those controls that
appear suitable and practical to a list of potential options. Examples of control options for each are
suggested in Figure 3.1.1B. Many of the options may be applied at more than one level. For example, the
training option may be applied to operators, supervisors, more senior leaders, or staff personnel.
Figure 3.1.1A Sample Risk Control Options Matrix
OPTIONS                                           OPERATOR   LEADER   STAFF   MGR
ENGINEER (Energy Mgt)
Limit Energy
Substitute Safer Form
Prevent Buildup
Prevent Release
Provide Slow Release
Rechannel/separate In Time/Space
Provide Special Maint of Controls
GUARD
On Source
Barrier Between
On Human or Object
Raise Threshold (harden)
IMPROVE TASK DESIGN
Sequence of Events (Flow)
Timing (within tasks, between tasks)
Human-Machine Interface/Ergonomics
Simplify Tasks
Reduce Task Loads (physical, mental, emotional)
Backout Options
LIMIT EXPOSURE
Number of People or Items
Time
Iterations
SELECTION OF PERSONNEL
Mental Criteria
Emotional Criteria
Physical Criteria
Experience
TRAIN AND EDUCATE
Core Tasks (especially critical tasks)
Leader Tasks
Emergency/Contingency Tasks
Safety Tasks
Rehearsals
WARN
Signs/Color Coding
Audio/Visual Alarms
Briefings
MOTIVATE
Measurable Standards
Essential Accountability
Positive/negative Incentives
Competition
Demonstrations of Effects
REDUCE EFFECTS
Emergency Equipment
Rescue Capabilities
Emergency Medical Care
Emergency Procedures
Damage Control Procedures/Plans
Backups/Redundant Capabilities
REHABILITATE
Personnel
Facilities/equipment
Operational Capabilities
Figure 3.1.1B Example Risk Control Options Matrix
OPTIONS                                           SOME EXAMPLES
ENGINEER (Energy Mgt.)
Limit Energy                                      Lower voltages, small amount of explosives, reduce heights, and reduce speeds
Substitute Safer Form                             Use air power, less hazardous chemicals, more stable explosives/chemicals
Prevent Buildup                                   Use automatic cutoffs, blowout panels, limit momentum, governors
Prevent Release                                   Containment, double/triple containment
Provide Slow Release                              Use pressure relief valves, energy absorbing materials
Rechannel/separate in Time/Space                  Automatic processes, deviators, barriers, distance
Provide Special Maint of Controls                 Special procedures, special checks/audits
GUARD
On Source                                         Fire suppression systems, energy absorbing systems (crash walls, etc.)
Barrier between                                   Revetments, walls, distance
On Human or Object                                Personal protective equipment, energy absorbing materials
Raise Threshold (harden)                          Acclimatization, over-design, reinforcement, physical conditioning
IMPROVE TASK DESIGN
Sequence of Events (Flow)                         Put tough tasks first before fatigue, don't schedule several tough tasks in a row
Timing (within tasks, between tasks)              Allow sufficient time to perform, to practice. Allow adequate time between tasks
Man-Machine Interface/Ergonomics                  Assure equipment fits the people, and effective ergonomic design
Simplify Tasks                                    Provide job aids, reduce steps, provide tools like lifters, communications aids
Reduce Task Loads (physical, mental, emotional)   Set weight limits; automate mental calculations and some monitoring tasks. Avoid excessive stress, provide breaks, vacations, and spread risk among many
Backout Options                                   Establish points where process reversal is possible when hazard is detected
LIMIT EXPOSURE
Number of People or Items                         Only expose essential personnel & things
Time                                              Minimize the time of exposure - don't bring the explosives until the last minute
Iterations                                        Don't do it as often
SELECTION OF PERSONNEL
Mental Criteria                                   Essential basic intelligence, and essential skills and proficiency
Emotional Criteria                                Essential stability and maturity
Physical Criteria                                 Essential strength, motor skills, endurance, size
Experience                                        Demonstrated performance abilities
TRAIN AND EDUCATE
Core Tasks (especially critical tasks)            Define critical minimum abilities, train, test and score
Leader Tasks                                      Define essential leader tasks and standards, train, test and score
Emergency Contingency Tasks                       Define, assign, train, verify ability
Safety Tasks                                      Hazard identification, risk controls, maintenance of standards
Rehearsals                                        Validate processes, validate skills, verify interfaces
WARN
Signs/Color Coding                                Warning signs, instruction signs, traffic signs
Audio/Visual Alarms                               Bells, flares, flashing lights, klaxons, whistles
Briefings                                         Refresher warnings, demonstrate hazards, refresh training
MOTIVATE
Measurable Standards                              Define minimum acceptable risk controls, see that tasks are assigned
Essential Accountability                          Check performance at an essential level of frequency and detail
Positive/negative Incentives                      Meaningful individual & group rewards, punishment
Competition                                       Healthy individual and group competition on a fair basis
Demonstrations of Effects                         Graphic, dynamic, but tasteful demonstrations of effects of unsafe acts
REDUCE EFFECTS
Emergency Equipment                               Fire extinguishers, first aid materials, spill containment materials
Rescue Capabilities                               A rescue squad, rescue equipment, helicopter rescue
Emergency Medical Care                            Trained first aid personnel, medical facilities
Emergency Damage Control Procedures               Emergency responses for anticipated contingencies, coordinating agencies
Backups/Redundant Capabilities                    Alternate ways to continue the operation if primaries are lost
REHABILITATE
Personnel                                         Rehabilitation services restore confidence
Facilities/equipment                              Get key elements back in service
Operational Capabilities                          Focus on restoration of the operation
4.0 MAKE CONTROL DECISIONS TOOLS, DETAILS, AND EXAMPLES
Introduction. Making control decisions includes the basic options (reject, transfer, spread, etc.) as well as
a comprehensive list of risk reduction options generated through use of the risk control options matrix by
a decision-maker. The decision-making organization requires a procedure to establish, as a matter of
routine, who should make various levels of risk decisions. Finally, after the best available set of risk
controls is selected the decision-maker will make a final go/no-go decision.
Developing a decision-making process and system: Risk decision-making should be scrutinized in a risk
decision system.
This system will produce the following benefits:
• Promptly get decisions to the right decision-makers
• Create a trail of accountability
• Assure that risk decisions involving comparable levels of risk are generally made at comparable
levels of management
• Assure timely decisions
• Explicitly provide for the flexibility in the decision-making process required by the nature of
operations
A decision matrix is an important part of a good decision-making system. Decision matrices are normally
tied directly to the risk assessment process.
Selecting the best combination of risk controls: This process can be made as simple as intuitively
choosing what appears to be the best control or group of controls, or so complex they justify the use of
the most sophisticated decision-making tools available. For most risks involving moderate levels of risk
and relatively small investments in risk controls, the intuitive method is fully satisfactory. Guidelines for
intuitive decisions are:
Don't select control options to produce the lowest level of risk; select the combination yielding the most
operationally supportive level of risk. This means keeping in mind the need to take risks when those
appropriate risks are necessary for improved performance.
Be aware that some risk controls are incompatible. In some cases using risk control A will cancel the
effect of risk control B. Obviously using both A and B is wasting resources. For example, a fully
effective machine guard may make it completely unnecessary to use personal protective equipment such
as goggles and face shields. Using both will waste resources and impose a burden on operators.
Be aware that some risk controls reinforce each other. For example, a strong enforcement program to
discipline violators of safety rules will be complemented by a positive incentive program to reward safe
performance. The impact of the two coordinated together will usually be stronger than the sum of their
impacts.
Evaluate full costs versus full benefits. Try to evaluate all the benefits of a risk and evaluate them against
all of the costs of the risk control package. Traditionally, this comparison has been limited to comparisons
of the incident/accident costs versus the safety function costs.
When it is supportive, choose redundant risk controls to protect against risk in-depth.
Keep in mind the objective is not risk control, it is optimum risk control.
Selecting risk controls when risks are high and risk control costs are important - cost benefit assessment.
In these cases, the stakes are high enough to justify application of more formal decision-making
processes. All of the tools existing in the management science of decision-making apply to the process of
risk decision-making. Two of these tools should be used routinely and deserve space in this publication.
The first is cost benefit assessment, a simplified variation of cost benefit analysis. Cost benefit analysis is
a science in itself; however, it can be simplified sufficiently for routine use in risk management decision-making even at the lowest organizational levels. Some fiscal accuracy will be lost in this process of
simplification, but the result of the application will be a much better selection of risk controls than if the
procedures were not used. Budget personnel are usually trained in these procedures and can add value to
the application. The process involves the following steps:
Step 1. Measure the full, lifecycle costs of the risk controls to include all costs to all involved parties. For
example, a motorcycle helmet standard should account for the fact that each operator will need to pay for
a helmet.
Step 2. Develop the best possible estimate of the likely lifecycle benefits of the risk control package to
include any non-safety benefits expressed as a dollar estimate. For example, an ergonomics program can
be expected to produce significant productivity benefits in addition to a reduction in cumulative trauma
injuries.
Step 3. Let your budget experts fine-tune your efforts.
Step 4. Develop the cost benefit ratio. You are seeking the best possible benefit-to-cost ratio but at least 2
to 1.
Step 5. Fine-tune the risk control package to achieve an improved "bang for the buck". The example at
Figure 4.1A illustrates this process of fine-tuning applied to an ergonomics-training course (risk control).
Figure 4.1A Example Maximizing Bang for the Buck
Anyone can throw money at a problem. A manager finds the optimum level of resources producing
an optimum level of effectiveness, i.e. maximum bang for the buck. Consider an ergonomics-training
program involving training 400 supervisors from across the entire organization in a 4-hour
(3 hours training, 1-hour admin) ergonomics-training course that will cost $30,500 including
student time. Ergonomics losses have been averaging $300,000 per year and estimates are that the
risk control will reduce this loss by 10% or $30,000. On the basis of a cost benefit assessment over
the next year (ignoring any out year considerations), this risk control appears to have a one year
negative cost benefit ratio, i.e. $30,000 in benefit versus a $30,500 investment, a $500 loss.
Apparently it is not a sound investment on a one-year basis. This is particularly true when we
consider that most decision-makers will want the comfort of a 2 or 3 to 1 cost benefit ratio to ensure
a positive outcome. Can this project be turned into a winner?
We can make it a winner if we are able to access risk information concerning ergonomics injuries/illnesses
from loss control office data, risk management concepts, and a useful tool called "Pareto's Law".
Pareto's Law, as previously mentioned, essentially states that 80% of most problems can be found
in 20% of the exposure. For example, 80% of all traffic accidents might involve only 20% of the
driver population. We can use this law, guided by our injury/illness data, to turn the training
program into a solid winner. Here is what we might do.
Step 1. Let's assume that Pareto's Law applies to the distribution of ergonomics problems within
this organization. If so, then 80% of the ergonomics problem can be found in 20% of our
exposures. Our data can tell us which 20%. We can then target the 20% (80 students) of the
original 400 students that are accounting for 80% of our ergonomics costs ($240,000).
Step 2. Let's also assume that Pareto's Law applies to the importance of tasks that we intend to
teach in the training course. If the three hours of training included 10 tasks, let's assume that two of
those tasks (20%) will in fact account for 80% of the benefit of the course. Again our data should
be able to indicate this. Let's also assume that by good luck, each of these two tasks takes the same time
to teach as each of the other eight. We might now decide to teach only these two tasks, which will require
only 36 minutes (20% of 180 minutes). We will still retain 80% of the $240,000 target value or
$192,000.
Step 3. Since the training now only requires 36 minutes, we will modify our training procedure to
conduct the training in the workshops rather than in a classroom. This reduces our admin time from
1 hour (wash up, travel, get there well before it actually starts, and return to work) to 4 minutes.
Our total training time is now 40 minutes.
Summary. We are still targeting $192,000 of the original $300,000 annual loss but our cost factor
is now 80 employees for 40 minutes at $15/hour, with our teaching cost cut to 1/5th of the $6000
(80 students instead of 400) which is $1200. We still have our staff cost so the total cost of the
project is now $2500. We will still get the 10% reduction in the remaining $192,000 that we are
still targeting, which totals $19,200. Our cost benefit ratio is now a robust 7.68 to 1. If all goes
well with the initial training and we actually demonstrate a 20% loss reduction, we may choose to
expand the training to the next riskiest 20% of our 400 personnel which should also produce a very
positive return.
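The arithmetic in Figure 4.1A can be checked with a short sketch. The class and field names below are illustrative only, and the $500 staff cost is an assumption inferred from the stated totals ($30,500 and $2,500); the text does not give it directly.

```python
from dataclasses import dataclass


@dataclass
class TrainingPlan:
    """One version of the ergonomics-training risk control in Figure 4.1A."""
    students: int
    minutes_per_student: int
    teaching_cost: int          # instructor cost in dollars
    targeted_loss: int          # annual loss the training is aimed at, in dollars
    staff_cost: int = 500       # ASSUMPTION: inferred from the stated totals
    reduction_pct: int = 10     # expected loss reduction (10% in the text)
    hourly_rate: int = 15       # student time at $15/hour (from the text)

    def cost(self) -> float:
        # Student time + teaching cost + staff cost.
        student_time = self.students * self.minutes_per_student * self.hourly_rate / 60
        return student_time + self.teaching_cost + self.staff_cost

    def benefit(self) -> float:
        return self.targeted_loss * self.reduction_pct / 100

    def benefit_to_cost(self) -> float:
        return self.benefit() / self.cost()


# Original plan: 400 supervisors, 4 hours each, aimed at the full $300,000 loss.
original = TrainingPlan(students=400, minutes_per_student=240,
                        teaching_cost=6000, targeted_loss=300_000)

# Fine-tuned plan: 80 students, 40 minutes each, aimed at $192,000 of the loss.
targeted = TrainingPlan(students=80, minutes_per_student=40,
                        teaching_cost=1200, targeted_loss=192_000)

for name, plan in (("original", original), ("targeted", targeted)):
    print(f"{name}: cost ${plan.cost():,.0f}, benefit ${plan.benefit():,.0f}, "
          f"ratio {plan.benefit_to_cost():.2f} to 1")
```

Under these assumptions the original plan falls just short of break-even, while the fine-tuned plan clears the 2 to 1 threshold mentioned in Step 4.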
Selecting risk controls when risks are high and risk control costs are important - use of decision matrices.
An excellent tool for evaluating various risk control options is the decision matrix. On the vertical
dimension of the matrix we list the operation supportive characteristics we are looking for in risk controls.
Across the top of the matrix we list the various risk control options (individual options or packages of
options). Then we rank each control option on a scale of 1 (very low) to 10 (very high) in each of the
desirable characteristics. If we choose to, we can weight each desirable characteristic based on its
operational significance and calculate the weighted score (illustrated below). All things being the same,
the options with the higher scores are the stronger options. A generic illustration is provided at Figure
4.1B.
Figure 4.1B Sample Decision Matrix
RATING FACTOR                  WEIGHT*   RISK CONTROL OPTIONS/PACKAGES
                                         #1      #2      #3      #4      #5      #6
Low Cost                       5         9/45    6/30    4/20    5/25    8/40    8/40
Easy to implement              4         10/40   7/28    5/20    6/24    8/32    8/32
Positive Operator involvement  5         8/40    2/10    1/5     6/30    3/15    7/35
Consistent with Culture        3         10/30   2/6     9/27    6/18    6/18    6/18
Easy to integrate              3         9/27    5/15    6/18    7/21    6/18    5/15
Easy to measure                2         10/20   10/20   10/20   8/16    8/16    5/10
Low risk (sure to succeed)     3         9/27    9/27    10/30   2/6     4/12    5/15
TOTALS                                   229     136     140     140     151     165
* Weighting is optional and is designed to reflect the relative importance of the various
factors. Each cell shows the rating followed by the weighted score (rating x weight).
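The weighted scoring in Figure 4.1B can be reproduced with a few lines of code. This is a minimal sketch: the weights and ratings are taken from the figure, while the variable and function names are illustrative.

```python
# Weight of each rating factor (from Figure 4.1B).
WEIGHTS = {
    "Low Cost": 5,
    "Easy to implement": 4,
    "Positive Operator involvement": 5,
    "Consistent with Culture": 3,
    "Easy to integrate": 3,
    "Easy to measure": 2,
    "Low risk (sure to succeed)": 3,
}

# Ratings on the 1-10 scale for options #1 through #6 (from Figure 4.1B).
RATINGS = {
    "Low Cost": [9, 6, 4, 5, 8, 8],
    "Easy to implement": [10, 7, 5, 6, 8, 8],
    "Positive Operator involvement": [8, 2, 1, 6, 3, 7],
    "Consistent with Culture": [10, 2, 9, 6, 6, 6],
    "Easy to integrate": [9, 5, 6, 7, 6, 5],
    "Easy to measure": [10, 10, 10, 8, 8, 5],
    "Low risk (sure to succeed)": [9, 9, 10, 2, 4, 5],
}


def weighted_totals(weights, ratings):
    """Return the sum of rating x weight for each option, in column order."""
    option_count = len(next(iter(ratings.values())))
    totals = [0] * option_count
    for factor, weight in weights.items():
        for i, rating in enumerate(ratings[factor]):
            totals[i] += rating * weight
    return totals


totals = weighted_totals(WEIGHTS, RATINGS)
for number, total in enumerate(totals, start=1):
    print(f"Option #{number}: {total}")
best = max(range(len(totals)), key=totals.__getitem__) + 1
print(f"Highest weighted score: option #{best}")
```

Changing a weight and rerunning shows how sensitive the ranking is to the relative importance assigned to each factor.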
Summary. It is not unusual for a risk control package to cost hundreds of thousands of dollars and even
millions over time. Millions of dollars and critical operations may be at risk. The expenditure of several
tens of thousands of dollars to get the decision right is sound management practice and good risk
management.
5.0 RISK CONTROL IMPLEMENTATION TOOLS AND DETAILS
5.1 Introduction
Figure 5.1A summarizes a Risk Control Implementation model. It is based on accountability being an
essential element of risk management success. Organizations and individuals must be held accountable for
the risk decisions and actions that they take or the risk control motivation is minimized. The model
depicted at Figure 5.1A is the basis of positive accountability and strong risk control behavior.
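Figure 5.1A Implementation Model
[Flow diagram boxes: ID Key Tasks, Assign Key Tasks, Measure Performance, Reward Correct, Safe Behavior]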
5.2 Applying the model
The example below illustrates each step in the model applied to the sometimes-difficult task of assuring
that personnel consistently wear and use their protective clothing and equipment. The steps of the model
should be applied as follows:
5.2.1 Identify key tasks
This step, while obvious, is critical: the key tasks must be defined with enough accuracy that
effective accountability is justified. In our example regarding use of protective clothing and
equipment, it is essential to identify exactly when the use of such items is required. Is it when I enter the
door of a work area? When I approach a machine? How close? What about on the loading dock? Exactly
what items are to be worn? Is there any specific way that they should be worn? I can be wearing ear plugs
but incorrectly have them stuck in the outer ear, producing little or no noise reduction benefit. Does this
meet the requirement? The task needs to be defined with sufficient precision that personnel know what is
expected of them and that what is expected of them produces the risk control desired. It is also important
that the task be made as simple, pleasant, and trouble free as possible. In this way we significantly
increase the ease with which the rest of the process proceeds.
5.2.2 Assign key tasks
Personnel need to know clearly what is expected of them especially if they are going to be held
accountable for the task. This is normally not difficult. The task can be included in job descriptions,
operating instructions, or in the task procedures contained in manuals. It can be very effectively
embedded in training. In less structured situations, it can be a clear verbal order or directive. It is
important that the assignment of the task include the specifics of what is expected.
5.2.3 Measure performance
The task needs to include at least a basic level of measurement. It is important to note that measurement
does not need to include every time the behavior is displayed. It is often perfectly practical to sample
performance only once in a large number of actions, perhaps as few as one in several hundred actions as
long as the sample is a random example of routine behavior. Often the only one who needs to do the
measuring is the individual responsible for the behavior. In other situations, the supervisor or an outside
auditor may need to do the observing. Performance is compared to the standard, which should have been
communicated to the responsible individual. This step of the process is the rigorous application of the old
adage that "What is monitored (or measured) and checked gets done."
5.2.4 Reward correct behavior and correct inadequate behavior
The emphasis should clearly be on reinforcing correct behavior. Reinforcement means any action that
increases the likelihood that the person will display the desired behavior again. It can be as informal as a
pat on the back or as formal as a major award or cash incentive. Correcting inadequate behavior should
be done whenever inadequate behavior is observed. The special case of punishment should only be used
when all other means of producing the desired behavior have failed.
5.2.5 Risk control performance
If the steps outlined above have been accomplished correctly, the result will be consistent success in
controlling risk. Note that the difficulty and unpleasantness of the task will dictate the extent of the
rewards and corrective actions required. The harder the task for whatever reason, the more powerful the
rewards and corrective actions needed will be. It is important to make risk control tasks as uncomplicated
and pleasant as possible.
6.0 SUPERVISE AND REVIEW DETAILS AND EXAMPLES
Management involves moving a task or an organization toward a goal. To move toward a goal you must
have three things. You must have a goal, you must know where you are in relation to that goal, and you
must have a plan to reach it. An effective set of risk matrices provides two of the elements.
In regard to ORM, indicators should provide information concerning the success or lack of success of
controls intended to mitigate a risk. These indicators could focus on those key areas identified during the
assessment as being critical to minimizing a serious risk area. Additionally, matrices may be developed to
generically identify operations/areas where ORM efforts are needed.
The following is a representative set of risk measures that a maintenance shop leader could use to assess
the progress of the shop toward the goal of improving safety performance. Similar indicators could be
developed in the areas of environment, fire prevention, security, and other loss control areas.
The tool control effectiveness index. Establish key indicators of tool control program effectiveness
(percentage of tool checks completed, items found by QA, score on knowledge quiz regarding control
procedures, etc.). All that is needed is a sampling of data in one or more of these areas. If more than one
area is sampled, the scores can be weighted if desired and rolled up into a single tool control index by
averaging them. See Figure 6.1A for the example.
Figure 6.1A Example Tool Control Effectiveness Measurement
The percent of tool checks completed is 94%.
Items found by QA. Items were found in 2% of QA inspections (98% were to standard).
Tool control quiz score is 88%.
If all items are weighted equally (94+98+88 divided by 3 = 93.3) then 93.3 is this quarter's
tool control safety index. Of course, in this index, high scores are desirable.
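The roll-up in Figure 6.1A is a weighted average of the sampled indicators. The sketch below reproduces it; the function name, dictionary keys, and the unequal weights in the second call are illustrative assumptions, not part of the handbook.

```python
def effectiveness_index(scores, weights=None):
    """Weighted average of indicator scores (each a percentage, higher is better).

    With no weights given, every indicator counts equally.
    """
    if weights is None:
        weights = {name: 1 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Indicator values from Figure 6.1A. QA findings in 2% of inspections
# are expressed as 98% of inspections to standard.
TOOL_CONTROL_SCORES = {
    "tool_checks_completed": 94,
    "qa_inspections_to_standard": 98,
    "control_quiz_score": 88,
}

equal_index = effectiveness_index(TOOL_CONTROL_SCORES)
print(f"Tool control safety index: {equal_index:.1f}")

# Hypothetical weighting that counts QA results twice as heavily.
weighted_index = effectiveness_index(
    TOOL_CONTROL_SCORES,
    weights={"tool_checks_completed": 1,
             "qa_inspections_to_standard": 2,
             "control_quiz_score": 1},
)
print(f"With QA weighted double: {weighted_index:.1f}")
```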
The protective clothing and equipment risk index. This index measures the effectiveness with which
shop personnel are using required protective clothing and equipment. Data are collected by making spot
observations periodically during the workday. Data are recorded on a check sheet and are rolled up
monthly. The index is the percent safe observations of the total number of observations made as
illustrated at Figure 6.1B.
Figure 6.1B Example Safety Observation Measurement
The emergency procedures index. This index measures the readiness of the shop to respond to various
emergencies such as fires, injuries, and hazmat releases. It is made up of a compilation of indicators as
shown at Figure 6.1C A high score is desirable.
Figure 6.1C Example Emergency Procedures Measurement
The quality assurance score. This score measures a defined set of maintenance indicators tailored to the
particular type of aircraft serviced. Quality Assurance (QA) personnel record deviations in these target
areas as a percentage of total observations made. The specific types of deviations are noted. The score is
the percentage of positive observations with a high score being desirable. Secondary scores could be
developed for each type of deviation if desired.
The overall index. Any combination of the indicators previously mentioned, along with others as desired,
can be rolled up into an overall index for the maintenance facility as illustrated at Figure 6.1D.
Scores on emergency procedure quizzes
Percentage of emergency equipment on hand and fully operational
Scores on emergency response drills indicating speed, correct procedures, and other
effectiveness indicators
TOTAL OBSERVATIONS: 27 SAFE OBSERVATIONS: 21
The protective clothing and equipment safety index is 78 (21 divided by 27 = 78%).
In this index high scores are desirable
FAA System Safety Handbook, Appendix F
December 30, 2000
F-56
Figure 6.1D Example Overall Measurement
Once the data has been collected and analyzed, the results need to be provided to the unit. With this
information the unit will be able to concentrate their efforts on those areas where improvement would
produce the greatest gain.
Summary. It is not difficult to set up useful and effective measures of operational risk, particularly once
the key risks have been identified during a risk assessment. Additionally, the workload associated with
such indicators can be minimized by using data already collected and by collecting the data as an
integrated routine aspect of operational processes.
Tool control safety index: 93.3
Protective clothing and equipment safety index: 78.0
Emergency procedures index: 88.4
Quality Assurance Score: 97.9
TOTAL: 357.6
OR AVERAGE: 89.4
This index is the overall safety index for the maintenance facility. The goal is to push toward
100% or a maximum score of 400. This index would be used in our accountability procedures
to measure performance and establish the basis for rewards or corrective action.
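The roll-up shown in the example can be reproduced with a few lines of code. This is only a sketch: the function name `overall_index` is an assumption, and equal weighting of the four indices is inferred from the example's total and average rather than stated as a rule.

```python
# Index values taken from the example above.
indices = {
    "Tool control safety index": 93.3,
    "Protective clothing and equipment safety index": 78.0,
    "Emergency procedures index": 88.4,
    "Quality Assurance Score": 97.9,
}


def overall_index(scores):
    """Return (total, average) for a set of equally weighted index scores."""
    if not scores:
        raise ValueError("at least one index score is required")
    total = sum(scores.values())
    return total, total / len(scores)


total, average = overall_index(indices)
print(f"TOTAL: {total:.1f}  AVERAGE: {average:.1f}")  # TOTAL: 357.6  AVERAGE: 89.4
```

If some indicators matter more than others for a particular facility, a weighted average could be substituted; the handbook example does not specify weights.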
Order 8040.4
Distribution: A-WXYZ-2; A-FOF-0 (Ltd)    Initiated by: ASY300
Appendix G
FAA ORDER 8040.4
ORDER
U.S. DEPARTMENT OF TRANSPORTATION
FEDERAL AVIATION ADMINISTRATION
8040.4
6/26/98
SUBJ: SAFETY RISK MANAGEMENT
1. PURPOSE. This order establishes the safety risk management policy and prescribes procedures for
implementing safety risk management as a decision making tool within the Federal Aviation Administration
(FAA). This order establishes the Safety Risk Management Committee.
2. DISTRIBUTION. This order is distributed to the division level in the Washington headquarters,
regions, and centers, with limited distribution to all field offices and facilities.
3. DEFINITIONS. Appendix 1, Definitions, contains definitions used in this order.
4. SCOPE. This order requires the application of a flexible but formalized safety risk management
process for all high-consequence decisions, except in situations deemed by the Administrator to be an
emergency. A high-consequence decision is one that either creates or could be reasonably estimated to
result in a statistical increase or decrease, as determined by the program office, in personal injuries and/or
loss of life and health, a change in property values, loss of or damage to property, costs or savings, or other
economic impacts valued at $100,000,000 or more per annum. The objective of this policy is to formalize a
common sense approach to risk management and safety risk analysis/assessment in FAA decisionmaking.
This order is not intended to interfere with regulatory processes and activities. Each program office will
interpret, establish, and execute the policy contained herein consistent with its role and responsibility. The
Safety Risk Management Committee will consist of technical personnel with risk assessment expertise and
be available for guidance across all FAA programs.
5. SAFETY RISK MANAGEMENT POLICY. The FAA shall use a formal, disciplined, and
documented decisionmaking process to address safety risks in relation to high-consequence decisions
impacting the complete product life cycle. The critical information resulting from a safety risk
management process can thereby be effectively communicated in an objective and unbiased manner to
decisionmakers, and from decisionmakers to the public. All decisionmaking authorities within the FAA
shall maintain safety risk management expertise appropriate to their operations, and shall perform and
document the safety risk management process prior to issuing the high-consequence decision. The choice
of methodologies to support risk management efforts remains the responsibility of each program office. The
decisionmaking authority shall determine the documentation format. The approach to safety risk
management is composed of the following steps:
a. Plan. A case-specific plan for risk analysis and risk assessment shall be predetermined in
adequate detail for appropriate review and agreement by the decisionmaking authority prior to commitment
of resources. The plan shall additionally describe criteria for acceptable risk.
b. Hazard Identification. The specific safety hazard or list of hazards to be addressed by the safety
risk management plan shall be explicitly identified to prevent ambiguity in subsequent analysis and
assessment.
c. Analysis. Both elements of risk (hazard severity and likelihood of occurrence) shall be
characterized. The inability to quantify and/or lack of historical data on a particular hazard does not
exclude the hazard from this requirement. If the seriousness of a hazard can be expected to increase over
the effective life of the decision, this should be noted. Additionally, both elements should be estimated for
each hazard being analyzed, even if historical and/or quantitative data is not available.
d. Assessment. The combined impact of the risk elements in paragraph 5c shall be compared to
acceptability criteria and the results provided for decisionmaking.
e. Decision. The risk management decision shall consider the risk assessment results conducted in
accordance with paragraph 5d. Risk assessment results may be used to compare and contrast alternative
options.
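The five steps above can be sketched in code. Everything specific in this sketch is an assumption made for illustration: the ordinal scales, the rule of multiplying the two scores, the acceptability threshold, and the two sample hazards. The order leaves the choice of methodology and acceptability criteria to each program office.

```python
from dataclasses import dataclass

# Hypothetical ordinal scales (not defined by the order).
SEVERITY = {"minor": 1, "major": 2, "hazardous": 3, "catastrophic": 4}
LIKELIHOOD = {"improbable": 1, "remote": 2, "probable": 3}


@dataclass
class Hazard:
    description: str  # step b: the hazard is explicitly identified
    severity: str     # step c: first element of risk
    likelihood: str   # step c: second element of risk


def assess(hazard: Hazard, max_acceptable_score: int) -> bool:
    """Step d: compare the combined risk elements to the plan's criterion."""
    score = SEVERITY[hazard.severity] * LIKELIHOOD[hazard.likelihood]
    return score <= max_acceptable_score


# Step a: the plan fixes the acceptability criterion before analysis begins.
plan_max_acceptable_score = 4

# Hypothetical hazards used only to exercise the sketch.
hazards = [
    Hazard("runway incursion during low visibility", "catastrophic", "remote"),
    Hazard("delayed weather advisory", "minor", "probable"),
]

# Step e: the decisionmaker receives the assessment result for each hazard.
for h in hazards:
    verdict = "acceptable" if assess(h, plan_max_acceptable_score) else "not acceptable"
    print(f"{h.description}: {verdict}")
```

A qualitative matrix lookup would serve equally well in place of the multiplication; the point is only that both risk elements are characterized and then compared against a criterion set in advance.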
6. PRINCIPLES FOR SAFETY RISK ASSESSMENT AND RISK CHARACTERIZATION. In
characterizing risk, one must comply with each of the following:
a. General. Safety risk assessments, to the maximum extent feasible:
(1) Are scientifically objective.
(2) Are unbiased.
(3) Include all relevant data available.
(4) Employ default or conservative assumptions only if situation-specific information is
not reasonably available. The basis of these assumptions must be clearly identified.
(5) Distinguish clearly as to what risks would be affected by the decision and what risks
would not.
(6) Are reasonably detailed and accurate.
(7) Relate to current risk or the risk resulting from not adopting the proposal being
considered.
(8) Allow for unknown and/or unquantifiable risks.
b. Principles. The principles to be applied when preparing safety risk assessments are:
(1) Each risk assessment should first analyze the two elements of risk: severity of the
hazard and likelihood of occurrence. Risk assessment is then performed by comparing the combined effect
of their characteristics to acceptable criteria as determined in the plan (paragraph 5a).
(2) A risk assessment may be qualitative and/or quantitative. To the maximum extent
practicable, these risk assessments will be quantitative.
(3) The selection of a risk assessment methodology should be flexible.
(4) Basic assumptions should be documented or, if only bounds can be estimated reliably,
the range encompassed should be described.
(5) Significant risk assessment assumptions, inferences, or models should:
(a) Describe any model used in the risk assessment and make explicit the
assumptions incorporated in the model.
(b) Identify any policy or value judgments.
(c) Explain the basis for choices.
(d) Indicate the extent that the model and the assumptions incorporated have been
validated by or conflict with empirical data.
(6) All safety risk assessments should include or summarize the information of paragraphs
6a(3) and 6a(4) as well as 6b(4) and 6b(5). This record should be maintained by the organization
performing the assessment in accordance with Order 1350.15B, Records Organization, Transfer, and
Destruction Standards.
7. ANALYSIS OF RISK REDUCTION BENEFITS AND COSTS. For each high-consequence
decision, the following tasks shall be performed:
a. Compare the results of a risk assessment for each risk-reduction alternative considered,
including no action, in order to rank each risk assessment for decisionmaking purposes. The assessment
will consider future conditions, e.g., increased traffic volume.
b. Assess the costs and the safety risk reduction or other benefits associated with implementation
of, and compliance with, an alternative under final consideration.
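A minimal sketch of the two tasks in paragraph 7 follows. The `Alternative` class, the numeric risk scores, and the cost figures are hypothetical values chosen for illustration; the order does not prescribe a scoring scheme or a cost-benefit formula.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    name: str
    risk_score: float   # hypothetical assessed risk; lower is better
    annual_cost: float  # hypothetical implementation cost in dollars


def rank_alternatives(alternatives):
    """Task a: order the alternatives, including no action, by assessed risk."""
    return sorted(alternatives, key=lambda alt: alt.risk_score)


def risk_reduction_per_dollar(alt, baseline):
    """Task b: relate risk reduction against the baseline to cost."""
    if alt.annual_cost == 0:
        return 0.0
    return (baseline.risk_score - alt.risk_score) / alt.annual_cost


no_action = Alternative("No action", 8.0, 0.0)
options = [
    no_action,
    Alternative("Option B", 3.0, 2_000_000.0),
    Alternative("Option C", 5.0, 500_000.0),
]

for alt in rank_alternatives(options):
    ratio = risk_reduction_per_dollar(alt, no_action)
    print(f"{alt.name}: risk {alt.risk_score}, reduction per dollar {ratio:.2e}")
```

In this made-up data the alternative with the lowest risk is not the one with the best risk reduction per dollar, which is the kind of trade-off the comparison is meant to surface for the decisionmaker.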
8. SUBSTITUTION RISKS. Safety risk assessments of proposed changes to high-consequence
decisions shall include a statement of substitution risks. Substitution risks shall be included in the risk
assessment documentation.
9. SAFETY RISK MANAGEMENT COMMITTEE. This order establishes the Safety Risk
Management Committee. Appendix 2, Safety Risk Management Committee, contains the committee
charter. The committee shall provide a service to any FAA organization for safety risk management
planning, as outlined in appendix 2, when requested by the responsible program office. It also meets
periodically (e.g., two to four times per year) to exchange risk management ideas and information. The
committee will provide advice and counsel to the Office of System Safety, the Assistant Administrator for
System Safety, and other management officials when requested.
Jane F. Garvey
Administrator
APPENDIX 1. DEFINITIONS.
1. COSTS. Direct and indirect costs to the United States Government, State, local, and tribal
governments, international trade impacts, and the private sector.
2. EMERGENCY. A circumstance that requires immediate action to be taken.
3. HAZARD. Condition, event, or circumstance that could lead to or contribute to an unplanned or
undesired event.
4. HAZARD IDENTIFICATION. Identification of a substance, activity, or condition as potentially
posing a risk to human health or safety.
5. HIGH-CONSEQUENCE DECISION. Decision that either creates or could be reasonably estimated to
result in a statistical increase or decrease in personal injuries and/or loss of life and health, a change in
property values, loss of or damage to property, costs or savings, or other economic impacts valued at
$100,000,000 or more per annum.
6. PRODUCT LIFE CYCLE. The entire sequence from precertification activities through those
associated with removal from service.
7. MISHAP. Unplanned event, or series of events, that results in death, injury, occupational illness, or
damage to or loss of equipment or property.
8. RISK. Expression of the impact of an undesired event in terms of event severity and event likelihood.
9. RISK ASSESSMENT.
a. Process of identifying hazards and quantifying or qualifying the degree of risk they pose for
exposed individuals, populations, or resources; and/or
b. Document containing the explanation of how the assessment process is applied to individual
activities or conditions.
10. RISK CHARACTERIZATION. Identification or evaluation of the two components of risk, i.e.,
undesired event severity and likelihood of occurrence.
11. RISK MANAGEMENT. Management activity ensuring that risk is identified and eliminated or
controlled within established program risk parameters.
12. SAFETY RISK. Expression of the probability and impact of an undesired event in terms of hazard
severity and hazard likelihood.
13. SUBSTITUTION RISK. Additional risk to human health or safety, to include property risk, from an
action designed to reduce some other risk(s).
APPENDIX 2. SAFETY RISK MANAGEMENT COMMITTEE
1. PURPOSE. The Safety Risk Management Committee provides a communication and support team to
supplement the overall risk analysis capability and efficiency of key FAA organizations.
2. RESPONSIBILITIES. The Committee supports FAA safety risk management activities. It provides
advice and guidance, upon request from responsible program offices, to help them fulfill their authority and
responsibility to incorporate safety risk management as a decisionmaking tool. It serves as an internal
vehicle for risk management process communication, for coordination of risk analysis methods, and for use
of common practices where appropriate. This includes, but is not limited to:
a. Continuing the internal exchange of risk management information among key FAA
organizations.
b. Fostering the exchange of risk management ideas and information with other government
agencies and industry to avoid duplication of effort.
c. Providing risk analysis/management advice and guidance.
d. Identifying and recommending needed enhancements to FAA risk analysis/management
capabilities and/or efficiencies upon request.
e. Maintaining a risk management resources directory that includes:
(1) FAA risk methodologies productively employed,
(2) Specific internal risk analysis/management expertise by methodology or tool and
organizational contact point(s), and
(3) A central contact point for resource identification assistance.
f. Encouraging the establishment of an international directory of aviation safety information
resources via the Internet.
g. Assisting in the identification of suitable risk analysis tools and initiating appropriate training in
the use of these tools.
3. COMPOSITION. The Safety Risk Management Committee is composed of safety and risk
management professionals representing all Associate/Assistant Administrators and the Offices of the Chief
Counsel, Civil Rights, Government and Industry Affairs, and Public Affairs. The Assistant Administrator
for System Safety will designate an individual to chair the committee. The chairperson is responsible for
providing written notice of all meetings to committee members and, in coordination with the executive
secretary, keeping minutes of the meetings.
4. ASSIGNMENTS. The Safety Risk Management Committee may form ad hoc working groups to
address specific issues when requested by the responsible program office. Composition of those working
groups will consist of member representatives from across the FAA. Working groups will be disbanded
upon completion of their task. The Office of System Safety shall provide the position of executive
secretary of the committee. The Office of System Safety shall also furnish other administrative support.
5. FUNDING. Resources for support staff and working group activities will be provided as determined
by the Assistant Administrator for System Safety. Unless otherwise stated, each member is responsible for
his/her own costs associated with committee membership.
Standard Practice for System Safety
APPENDIX H
MIL-STD-882D
NOT MEASUREMENT SENSITIVE
MIL-STD-882D
10 February 2000
SUPERSEDING
MIL-STD-882C
19 January 1993
DEPARTMENT OF DEFENSE
STANDARD PRACTICE FOR
SYSTEM SAFETY
AMSC N/A AREA SAFT
FOREWORD
1. This standard is approved for use by all Departments and Agencies within the
Department of Defense (DoD).
2. The DoD is committed to protecting: private and public personnel from accidental
death, injury, or occupational illness; weapon systems, equipment, material, and facilities from
accidental destruction or damage; and public property while executing its mission of national
defense. Within mission requirements, the DoD will also ensure that the quality of the
environment is protected to the maximum extent practical. The DoD has implemented
environmental, safety, and health efforts to meet these objectives. Integral to these efforts is the
use of a system safety approach to manage the risk of mishaps associated with DoD operations.
A key objective of the DoD system safety approach is to include mishap risk management,
consistent with mission requirements, in technology development by design for DoD systems,
subsystems, equipment, facilities, and their interfaces and operation. The DoD goal is zero
mishaps.
3. This standard practice addresses an approach (a standard practice normally identified
as system safety) useful in the management of environmental, safety, and health mishap risks
encountered in the development, test, production, use, and disposal of DoD systems, subsystems,
equipment, and facilities. The approach described herein conforms to the acquisition procedures
in DoD Regulation 5000.2-R and provides a consistent means of evaluating identified mishap
risks. Mishap risk must be identified, evaluated, and mitigated to a level acceptable (as defined
by the system user or customer) to the appropriate authority, and compliant with federal laws and
regulations, Executive Orders, treaties, and agreements. Program trade studies associated with
mitigating mishap risk must consider total life cycle cost in any decision. Residual mishap risk
associated with an individual system must be reported to and accepted by the appropriate
authority as defined in DoD Regulation 5000.2-R. When MIL-STD-882 is required in a
solicitation or contract and no specific references are included, then only those requirements
presented in section 4 are applicable.
4. This revision applies the tenets of acquisition reform to system safety in Government
procurement. A joint Government/Industrial process team oversaw this revision. The
Government Electronic and Information Technology Association (GEIA), G-48 committee on
system safety represented industry on the process action team. System safety information (e.g.,
system safety tasks, commonly used approaches, etc.) associated with previous versions of this
standard are in the Defense Acquisition Deskbook (see 6.8). This standard practice is no longer
the source for any safety-related data item descriptions (DIDs).
5. Address beneficial comments (recommendations, additions, and deletions) and any
pertinent information that may be of use in improving this document to: HQ Air Force Materiel
Command (SES), 4375 Chidlaw Road, Wright-Patterson AFB, OH 45433-5006. Submit them using the
Standardization Document Improvement Proposal (DD Form 1426) appearing at the end of this
document, or by letter or electronic mail.
CONTENTS
FOREWORD
1. SCOPE
1.1 Scope
2. APPLICABLE DOCUMENTS
3. DEFINITIONS
3.1 Acronyms used in this standard
3.2 Definitions
3.2.1 Acquisition program
3.2.2 Developer
3.2.3 Hazard
3.2.4 Hazardous material
3.2.5 Life cycle
3.2.6 Mishap
3.2.7 Mishap risk
3.2.8 Program manager
3.2.9 Residual mishap risk
3.2.10 Safety
3.2.11 Subsystem
3.2.12 System
3.2.13 System safety
3.2.14 System safety engineering
4. GENERAL REQUIREMENTS
4.1 Documentation of the system safety approach
4.2 Identification of hazards
4.3 Assessment of mishap risk
4.4 Identification of mishap risk mitigation measures
4.5 Reduction of mishap risk to an acceptable level
4.6 Verification of mishap risk reduction
4.7 Review of hazards and acceptance of residual mishap risk by the appropriate authority
4.8 Tracking of hazards and residual mishap risk
5. DETAILED REQUIREMENTS
6. NOTES
6.1 Intended use
6.2 Data requirements
6.3 Subject term (key words) listing
6.4 Definitions used in this standard
6.5 International standardization agreements
6.6 Explosive hazard classification and characteristic data
6.7 Use of system safety data in certification and other specialized safety approvals
6.8 DoD acquisition practices
6.9 Identification of changes
APPENDIXES
A Guidance for implementation of system safety efforts
CONCLUDING MATERIAL
TABLES
A-I. Suggested mishap severity categories
A-II. Suggested mishap probability levels
A-III. Example mishap risk assessment values
A-IV. Example mishap risk categories and mishap risk acceptance levels
1. SCOPE
1.1 Scope. This document outlines a standard practice for conducting system safety.
The system safety practice as defined herein conforms to the acquisition procedures in
DoD Regulation 5000.2-R and provides a consistent means of evaluating identified risks.
Mishap risk must be identified, evaluated, and mitigated to a level acceptable (as defined by the
system user or customer) to the appropriate authority and compliant with federal (and state where
applicable) laws and regulations, Executive Orders, treaties, and agreements. Program trade
studies associated with mitigating mishap risk must consider total life cycle cost in any decision.
When MIL-STD-882 is required in a solicitation or contract and no specific paragraphs of this
standard are identified, apply only those requirements presented in section 4.
2. APPLICABLE DOCUMENTS
Sections 3, 4, and 5 of this standard contain no applicable documents. This section does not
include documents cited in other sections of this standard or recommended for additional
information or as examples.
3. DEFINITIONS
3.1 Acronyms used in this standard. The acronyms used in this standard are defined as
follows:
a. AMSDL Acquisition Management System & Data Requirement List
b. ANSI American National Standards Institute
c. DID Data Item Description
d. DoD Department of Defense
e. ESH Environmental, Safety, and Health
f. GEIA Government Electronic & Information Technology Association
g. MAIS Major Automated Information System
h. MDAP Major Defense Acquisition Program
i. USAF United States Air Force
3.2 Definitions. Within this document, the following definitions apply (see 6.4):
3.2.1 Acquisition program. A directed, funded effort designed to provide a new,
improved, or continuing system in response to a validated operational need.
3.2.2 Developer. The individual or organization assigned responsibility for a
development effort. Developers can be either internal to the government or contractors.
3.2.3 Hazard. Any real or potential condition that can cause injury, illness, or death to
personnel; damage to or loss of a system, equipment or property; or damage to the environment.
3.2.4 Hazardous material. Any substance that, due to its chemical, physical, or
biological nature, causes safety, public health, or environmental concerns that would require an
elevated level of effort to manage.
3.2.5 Life cycle. All phases of the system's life including design, research, development,
test and evaluation, production, deployment (inventory), operations and support, and disposal.
3.2.6 Mishap. An unplanned event or series of events resulting in death, injury,
occupational illness, damage to or loss of equipment or property, or damage to the environment.
3.2.7 Mishap risk. An expression of the impact and possibility of a mishap in terms of
potential mishap severity and probability of occurrence.
3.2.8 Program Manager (PM). A government official who is responsible for managing
an acquisition program. Also, a general term of reference to those organizations directed by
individual managers, exercising authority over the planning, direction, and control of tasks and
associated functions essential for support of designated systems. This term will normally be
used in lieu of any other titles, e.g., system support manager, weapon program manager, system
manager, and project manager.
3.2.9 Residual mishap risk. The remaining mishap risk that exists after all mitigation
techniques have been implemented or exhausted, in accordance with the system safety design
order of precedence (see 4.4).
3.2.10 Safety. Freedom from those conditions that can cause death, injury, occupational
illness, damage to or loss of equipment or property, or damage to the environment.
3.2.11 Subsystem. A grouping of items satisfying a logical group of functions within a
particular system.
3.2.12 System. An integrated composite of people, products, and processes that provide
a capability to satisfy a stated need or objective.
3.2.13 System safety. The application of engineering and management principles,
criteria, and techniques to achieve acceptable mishap risk, within the constraints of operational
effectiveness and suitability, time, and cost, throughout all phases of the system life cycle.
3.2.14 System safety engineering. An engineering discipline that employs specialized
professional knowledge and skills in applying scientific and engineering principles, criteria, and
techniques to identify and eliminate hazards, in order to reduce the associated mishap risk.
4. GENERAL REQUIREMENTS
This section defines the system safety requirements to perform throughout the life cycle for any
system, new development, upgrade, modification, resolution of deficiencies, or technology
development. When properly applied, these requirements should ensure the identification and
understanding of all known hazards and their associated risks, and the elimination or reduction of
mishap risk to acceptable levels. The objective of system safety is to achieve acceptable mishap risk
through a systematic approach of hazard analysis, risk assessment, and risk management. This
document delineates the minimum mandatory requirements for an acceptable system safety
program for any DoD system. When MIL-STD-882 is required in a solicitation or contract, but
no specific references are included, then only the requirements in this section are applicable.
System safety requirements consist of the following:
4.1 Documentation of the system safety approach. Document the developer's and
program manager's approved system safety engineering approach. This documentation shall:
a. Describe the program's implementation using the requirements herein. Include
identification of each hazard analysis and mishap risk assessment process used.
b. Include information on system safety integration into the overall program structure.
c. Define how hazards and residual mishap risk are communicated to and accepted by the
appropriate risk acceptance authority (see 4.7) and how hazards and residual mishap risk will be
tracked (see 4.8).
4.2 Identification of hazards. Identify hazards through a systematic hazard analysis
process encompassing detailed analysis of system hardware and software, the environment (in
which the system will exist), and the intended use or application. Consider and use historical
hazard and mishap data, including lessons learned from other systems. Identification of hazards
is a responsibility of all program members. During hazard identification, consider hazards that
could occur over the system life cycle.
4.3 Assessment of mishap risk. Assess the severity and probability of the mishap risk
associated with each identified hazard, i.e., determine the potential negative impact of the hazard
on personnel, facilities, equipment, operations, the public, and the environment, as well as on the
system itself. The tables in Appendix A are to be used unless otherwise specified.
4.4 Identification of mishap risk mitigation measures. Identify potential mishap risk
mitigation alternatives and the expected effectiveness of each alternative or method. Mishap risk
mitigation is an iterative process that culminates when the residual mishap risk has been reduced
to a level acceptable to the appropriate authority. The system safety design order of precedence
for mitigating identified hazards is:
a. Eliminate hazards through design selection. If unable to eliminate an identified
hazard, reduce the associated mishap risk to an acceptable level through design selection.
b. Incorporate safety devices. If unable to eliminate the hazard through design selection,
reduce the mishap risk to an acceptable level using protective safety features or devices.
c. Provide warning devices. If safety devices do not adequately lower the mishap risk of
the hazard, include a detection and warning system to alert personnel to the particular hazard.
d. Develop procedures and training. Where it is impractical to eliminate hazards through
design selection or to reduce the associated risk to an acceptable level with safety and warning
devices, incorporate special procedures and training. Procedures may include the use of personal
protective equipment. For hazards assigned Catastrophic or Critical mishap severity categories,
avoid using warning, caution, or other written advisory as the only risk reduction method.
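The order of precedence in 4.4 a through d can be sketched as an ordered enumeration in which the first option that achieves acceptable risk is chosen. The names `Mitigation`, `select_mitigation`, and `advisory_alone_ok` are hypothetical, as is the idea of supplying feasibility as a simple true/false table; the standard describes the precedence but not an algorithm.

```python
from enum import IntEnum
from typing import Dict, Optional


class Mitigation(IntEnum):
    """Design order of precedence from 4.4; a lower value means higher precedence."""
    DESIGN_SELECTION = 1
    SAFETY_DEVICES = 2
    WARNING_DEVICES = 3
    PROCEDURES_AND_TRAINING = 4


# Severity categories named in 4.4 d for which a written advisory
# should not be the only risk reduction method.
ADVISORY_ALONE_DISCOURAGED = {"Catastrophic", "Critical"}


def select_mitigation(achieves_acceptable_risk: Dict[Mitigation, bool]) -> Optional[Mitigation]:
    """Return the highest-precedence option that reduces risk to an acceptable level."""
    for option in sorted(Mitigation):
        if achieves_acceptable_risk.get(option, False):
            return option
    return None  # no option suffices; the iteration in 4.4 continues


def advisory_alone_ok(severity: str) -> bool:
    """Check the 4.4 d caution about relying only on warnings or written advisories."""
    return severity not in ADVISORY_ALONE_DISCOURAGED


# Hypothetical feasibility table for one hazard.
feasibility = {
    Mitigation.DESIGN_SELECTION: False,
    Mitigation.SAFETY_DEVICES: False,
    Mitigation.WARNING_DEVICES: True,
    Mitigation.PROCEDURES_AND_TRAINING: True,
}
choice = select_mitigation(feasibility)
print(choice.name)  # prints WARNING_DEVICES
```

In practice several measures are often combined; the sketch only shows which measure should be considered first.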
4.5 Reduction of mishap risk to an acceptable level. Reduce the mishap risk through a
mitigation approach mutually agreed to by both the developer and the program manager.
Communicate residual mishap risk and hazards to the associated test effort for verification.
4.6 Verification of mishap risk reduction. Verify the mishap risk reduction and
mitigation through appropriate analysis, testing, or inspection. Document the determined
residual mishap risk. Report all new hazards identified during testing to the program manager
and the developer.
4.7 Review of hazards and acceptance of residual mishap risk by the appropriate
authority. Notify the program manager of identified hazards and residual mishap risk. Unless
otherwise specified, the suggested tables A-I through A-III of the appendix will be used to rank
residual risk. The program manager shall ensure that remaining hazards and residual mishap risk
are reviewed and accepted by the appropriate risk acceptance authority (ref. table A-IV). The
appropriate risk acceptance authority will include the system user in the mishap risk review. The
appropriate risk acceptance authority shall formally acknowledge and document acceptance of
hazards and residual mishap risk.
4.8 Tracking of hazards, their closures, and residual mishap risk. Track hazards, their
closure actions, and the residual mishap risk. Maintain a tracking system that includes hazards,
their closure actions, and residual mishap risk throughout the system life cycle. The program
manager shall keep the system user advised of the hazards and residual mishap risk.
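A tracking system of the kind described in 4.8 could be sketched as a small record store. This is an illustrative sketch under stated assumptions: the class names `HazardRecord` and `HazardLog`, their fields, and the free-text residual risk value are all hypothetical, and a real program would define its own format.

```python
from dataclasses import dataclass, field


@dataclass
class HazardRecord:
    """One tracked hazard (field names are illustrative assumptions)."""
    hazard_id: str
    description: str
    closure_actions: list = field(default_factory=list)
    residual_risk: str = "not yet assessed"
    closed: bool = False


class HazardLog:
    """Keeps hazards, closure actions, and residual mishap risk together."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.hazard_id] = record

    def record_action(self, hazard_id, action, residual_risk):
        # Append the closure action and update the residual risk it leaves.
        rec = self._records[hazard_id]
        rec.closure_actions.append(action)
        rec.residual_risk = residual_risk

    def close(self, hazard_id):
        # Closed hazards stay in the log for the rest of the life cycle.
        self._records[hazard_id].closed = True

    def user_report(self):
        """Hazards and residual risk to pass on to the system user."""
        return [(r.hazard_id, r.description, r.residual_risk)
                for r in self._records.values()]
```

Closed records are retained rather than deleted, since the residual mishap risk remains of interest throughout the system life cycle.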
5. DETAILED REQUIREMENTS
Program managers shall identify in the solicitation and system specification any specific system
safety engineering requirements including risk assessment and acceptance, unique classifications
and certifications (see 6.6 and 6.7), or any mishap reduction needs unique to their program.
Additional information for developing program-specific requirements is located in Appendix A.
6. NOTES
(This section contains information of a general or explanatory nature that may be helpful, but is
not mandatory.)
6.1 Intended use. This standard establishes a common basis for expectations of a
properly executed system safety effort.
6.2 Data requirements. Hazard analysis data may be obtained from contracted sources
by citing DI-MISC-80508, Technical Report - Study/Services. When it is necessary to obtain
data, list the applicable Data Item Descriptions (DIDs) on the Contract Data Requirements List
(DD Form 1423), except where the DoD Federal Acquisition Regulation Supplement exempts
the requirement for a DD Form 1423. The developer and the program manager are encouraged
to negotiate access to internal development data when hard copies are not necessary. They are
also encouraged to request that any safety plan the contractor is required to provide be
submitted with the proposal. It is further requested that any of the data items listed below
be condensed into the statement of work and the resulting data delivered in one
general-type scientific report.
Current DIDs that may be applicable to a system safety effort (check DoD 5010.12-L,
Acquisition Management Systems and Data Requirements Control List (AMSDL), for the most
current version before using) include:
DID Number       DID Title
DI-MISC-80043    Ammunition Data Card
DI-SAFT-80101    System Safety Hazard Analysis Report
DI-SAFT-80102    Safety Assessment Report
DI-SAFT-80103    Engineering Change Proposal System Safety Report
DI-SAFT-80104    Waiver or Deviation System Safety Report
DI-SAFT-80105    System Safety Program Progress Report
DI-SAFT-80106    Occupational Health Hazard Assessment
DI-SAFT-80184    Radiation Hazard Control Procedures
DI-MISC-80508    Technical Report - Study/Services
DI-SAFT-80931    Explosive Ordnance Disposal Data
DI-SAFT-81065    Safety Studies Report
DI-SAFT-81066    Safety Studies Plan
DI-ADMN-81250    Conference Minutes
DI-SAFT-81299    Explosive Hazard Classification Data
DI-SAFT-81300    Mishap Risk Assessment Report
DI-ILSS-81495    Failure Mode, Effects, Criticality Analysis Report
6.3 Subject term (key word) listing.
Environmental
Hazard
Mishap
Mishap probability levels
Mishap risk
Mishap severity categories
Occupational Health
Residual mishap risk
System safety engineering
6.4 Definitions used in this standard. The definitions at 3.2 may be different from
those used in other specialty areas. One must carefully check the specific definition of a term
in question for its area of origination before applying the approach described in this document.
6.5 International standardization agreements. Certain provisions of this standard are
the subject of international standardization agreements (AIR STD 20/23B, Safety Design
Requirements for Airborne Dispenser Weapons, and STANAG No. 3786, Safety Design
Requirements for Airborne Dispenser Weapons). When proposing amendment, revision, or
cancellation of this standard that might modify the international agreement concerned, the
preparing activity will take appropriate action through international standardization channels,
including departmental standardization offices, to change the agreement or make other
appropriate accommodations.
6.6 Explosive hazard classification and characteristic data. Any new or modified item of
munitions or of an explosive nature that will be transported to or stored at a DoD installation or
facility must first obtain an interim or final explosive hazard classification. The system safety
effort should provide the data necessary for the program manager to obtain the necessary
classification(s). These data should include identification of safety hazards involved in handling,
shipping, and storage related to production, use, and disposal of the item.
6.7 Use of system safety data in certification and other specialized safety approvals.
Hazard analyses are often required for many related certifications and specialized reviews.
Examples of activities requiring data generated during a system safety effort include:
a. Federal Aviation Administration airworthiness certification of designs and modifications
b. DoD airworthiness determination
c. Nuclear and non-nuclear munitions certification
d. Flight readiness reviews
e. Flight test safety review board reviews
f. Nuclear Regulatory Commission licensing
g. Department of Energy certification
Special safety-related approval authorities include USAF Radioisotope Committee,
Weapon System Explosive Safety Review Board (Navy), Non-Nuclear Weapons and Explosives
Safety Board (NNWESB), Army Fuze Safety Review Board, Triservice Laser Safety Review
Board, and the DoD Explosive Safety Board. Acquisition agencies should ensure that
appropriate service safety agency approvals are obtained prior to use of new or modified
weapons systems in an operational or test environment.
6.8 DoD acquisition practices. Information on DoD acquisition practices is presented in
the Defense Acquisition Deskbook available from the Deskbook Joint Program Office, Wright-
Patterson Air Force Base, Ohio. Nothing in the referenced information is considered additive to
the requirements provided in this standard.
6.9 Identification of changes. Due to the extent of the changes, marginal notations are
not used in this revision to identify changes with respect to the previous issue.
APPENDIX A
GUIDANCE FOR IMPLEMENTATION OF
A SYSTEM SAFETY EFFORT
A.1 SCOPE
A.1.1 Scope. This appendix provides rationale and guidance to fit the needs of most
system safety efforts. It includes further explanation of the effort and activities available to meet
the requirements described in section 4 of this standard. This appendix is not a mandatory part
of this standard and is not to be included in solicitations by reference. However, program
managers may extract portions of this appendix for inclusion in requirement documents and
solicitations.
A.2 APPLICABLE DOCUMENTS
A.2.1 General. The documents listed in this section are referenced in sections A.3, A.4,
and A.5. This section does not include documents cited in other sections of this appendix or
recommended for additional information or as examples.
A.2.2 Government documents.
A.2.2.1 Specifications, standards, and handbooks. This section is not applicable to this
appendix.
A.2.2.2 Other Government documents, drawings, and publications. The following other
Government document forms a part of this document to the extent specified herein. Unless
otherwise specified, the issue is that cited in the solicitation.
DoD 5000.2-R Mandatory Procedures for Major Defense Acquisition
Programs (MDAPs) and Major Automated Information
System (MAIS) Acquisition Programs
(Copies of DoD 5000.2-R are available from the Washington Headquarters Services,
Directives and Records Branch (Directives Section), Washington, DC or from the DoD
Acquisition Deskbook).
A.2.3 Non-Government publications. This section is not applicable to this appendix.
A.2.4 Order of precedence. Since this appendix is not mandatory, in event of a conflict
between the text of this appendix and the reference cited herein, the text of the reference takes
precedence. Nothing in this appendix supersedes applicable laws and regulations unless a
specific exemption has been obtained.
A.3 DEFINITIONS
A.3.1 Acronyms used in this appendix. No additional acronyms are used in this
appendix.
A.3.2 Definitions. Additional definitions that apply to this appendix:
A.3.2.1 Development agreement. The formal documentation of the agreed-upon tasks
that the developer will execute for the program manager. For a commercial developer, this
agreement usually is in the form of a written contract.
A.3.2.2 Fail-safe. A design feature that ensures the system remains safe, or in the event
of a failure, causes the system to revert to a state that will not cause a mishap.
A.3.2.3 Health hazard assessment. The application of biomedical knowledge and
principles to identify and eliminate or control health hazards associated with systems in direct
support of the life-cycle management of materiel items.
A.3.2.4 Mishap probability. The aggregate probability of occurrence of the individual
events/hazards that might create a specific mishap.
A.3.2.5 Mishap probability levels. An arbitrary categorization that provides a
qualitative measure of the most reasonable likelihood of occurrence of a mishap resulting from
personnel error, environmental conditions, design inadequacies, procedural deficiencies, or
system, subsystem, or component failure or malfunction.
A.3.2.6 Mishap risk assessment. The process of characterizing hazards within risk areas
and critical technical processes, analyzing them for their potential mishap severity and
probabilities of occurrence, and prioritizing them for risk mitigation actions.
A.3.2.7 Mishap risk categories. An arbitrary categorization of mishap risk assessment
values often used to generate specific action such as mandatory reporting of certain hazards to
management for action, or formal acceptance of the associated mishap risk.
A.3.2.8 Mishap severity. An assessment of the consequences of the most reasonable
credible mishap that could be caused by a specific hazard.
A.3.2.9 Mishap severity category. An arbitrary categorization that provides a
qualitative measure of the most reasonable credible mishap resulting from personnel error,
environmental conditions, design inadequacies, procedural deficiencies, or system, subsystem, or
component failure or malfunction.
A.3.2.10 Safety critical. A term applied to any condition, event, operation, process, or
item whose proper recognition, control, performance, or tolerance is essential to safe system
operation and support (e.g., safety critical function, safety critical path, or safety critical
component).
A.3.2.11 System safety management. All plans and actions taken to identify, assess,
mitigate, and continuously track, control, and document environmental, safety, and health
mishap risks encountered in the development, test, acquisition, use, and disposal of DoD weapon
systems, subsystems, equipment, and facilities.
A.4 GENERAL REQUIREMENTS
A.4.1 General. System safety applies engineering and management principles, criteria,
and techniques to achieve acceptable mishap risk, within the constraints of operational
effectiveness, time, and cost, throughout all phases of the system life cycle. It draws upon
professional knowledge and specialized skills in the mathematical, physical, and scientific
disciplines, together with the principles and methods of engineering design and analysis, to
specify and evaluate the environmental, safety, and health mishap risk associated with a system.
Experience indicates that the degree of safety achieved in a system is directly dependent upon
the emphasis given. The program manager and the developer must apply this emphasis during
all phases of the system's life cycle. A safe design is a prerequisite for safe operations, with the
goal being to produce an inherently safe product that will have the minimum safety-imposed
operational restrictions.
A.4.1.1 System safety in environmental and health hazard management. DoD 5000.2-R
has directed the integration of environmental, safety, and health hazard management into the
systems engineering process. While environmental and health hazard management are normally
associated with the application of statutory direction and requirements, the management of
mishap risk associated with actual environmental and health hazards is directly addressed by the
system safety approach. Therefore, environmental and health hazards can be analyzed and
managed with the same tools as any other hazard, whether they affect equipment, the
environment, or personnel.
A.4.2 Purpose (see 1.1). All DoD program managers shall establish and execute
programs that manage the probability and severity of all hazards for their systems
(DoD 5000.2-R). Provision for system safety requirements and effort as defined by this standard
should be included in all applicable contracts negotiated by DoD. These contracts include those
negotiated within each DoD agency, by one DoD agency for another, and by DoD for other
Government agencies. In addition, each DoD in-house program will address system safety.
A.4.2.1 Solicitations and contracts. Apply the requirements of section 4 to acquisitions.
Incorporate MIL-STD-882 in the list of contractual compliance documents, and include the
potential of a developer to execute section 4 requirements as source selection evaluation criteria.
Developers are encouraged to submit with their proposal a preliminary plan that describes the
system safety effort required for the requested program. When directed by the program manager,
attach this preliminary plan to the contract or reference it within the statement of work so
that it becomes the basis for a contractual system safety program.
A.4.3 System safety planning. Before formally documenting the system safety approach,
the program manager, in concert with systems engineering and associated system safety
professionals, must determine what system safety effort is necessary to meet program and
regulatory requirements. This effort will be built around the requirements set forth in section 4
and includes developing a planned approach for safety task accomplishment, providing qualified
people to accomplish the tasks, establishing the authority for implementing the safety tasks
through all levels of management, and allocating appropriate resources to ensure that the safety
tasks are completed.
A.4.3.1 System safety planning subtasks. System safety planning subtasks should:
a. Establish specific safety performance requirements (see A.4.3.2) based on overall
program requirements and system user inputs.
b. Establish a system safety organization or function and the required lines of
communication with associated organizations (government and contractor). Establish interfaces
between system safety and other functional elements of the program, as well as with other safety
and engineering disciplines (such as nuclear, range, explosive, chemical, and biological).
Designate the organizational unit responsible for executing each safety task. Establish the
authority for resolution of identified hazards.
c. Establish system safety milestones and relate these to major program milestones,
program element responsibility, and required inputs and outputs.
d. Establish an incident alerting/notification, investigation, and reporting process, to
include notification of the program manager.
e. Establish an acceptable level of mishap risk, mishap probability and severity
thresholds, and documentation requirements (including but not limited to hazards and residual
mishap risk).
f. Establish an approach and methodology for reporting to the program manager the
following minimum information:
(1) Safety critical characteristics and features.
(2) Operating, maintenance, and overhaul safety requirements.
(3) Measures used to eliminate or mitigate hazards.
(4) Acquisition management of hazardous materials.
g. Establish the method for the formal acceptance and documenting of residual mishap
risks and the associated hazards.
h. Establish the method for communicating hazards, the associated risks, and residual
mishap risk to the system user.
i. Specify requirements for other specialized safety approvals (e.g., nuclear, range,
explosive, chemical, biological, electromagnetic radiation, and lasers) as necessary (reference 6.6
and 6.7).
A.4.3.2 Safety performance requirements. These are the general safety requirements
needed to meet the core program objectives. The more closely these requirements relate to a
given program, the more easily the designers can incorporate them into the system. In the
appropriate system specifications, incorporate the safety performance requirements that are
applicable, and the specific risk levels considered acceptable for the system. Acceptable risk
levels can be defined in terms of: a hazard category developed through a mishap risk assessment
matrix; an overall system mishap rate; demonstration of controls required to preclude
unacceptable conditions; satisfaction of specified standards and regulatory requirements; or other
suitable mishap risk assessment procedures. Listed below are examples of safety performance
statements.
a. Quantitative requirements. Quantitative requirements are usually expressed as a
failure or mishap rate, such as "The catastrophic system mishap rate shall not exceed
x.xx X 10^-y per operational hour."
b. Mishap risk requirements. Mishap risk requirements could be expressed as "No
hazards assigned a Catastrophic mishap severity are acceptable." Mishap risk requirements
could also be expressed as a level defined by a mishap risk assessment (see A.4.4.3.2.3), such as
"No Category 3 or higher mishap risks are acceptable."
c. Standardization requirements. Standardization requirements are expressed relative to
a known standard that is relevant to the system being developed. Examples include: "The system
will comply with the laws of the State of XXXXX and be operable on the highways of the State
of XXXXX" or "The system will be designed to meet ANSI Std XXX as a minimum."
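The quantitative and mishap risk statements above lend themselves to simple checks. The sketch below is illustrative only: the function names, the sample counts, and the 1E-6 limit are assumptions, since the standard leaves the actual figures (x.xx X 10^-y) to each program. Comparing an observed rate to a limit is also not a statistical demonstration of compliance.

```python
def mishap_rate(mishaps, operational_hours):
    """Observed mishaps per operational hour."""
    if operational_hours <= 0:
        raise ValueError("operational_hours must be positive")
    return mishaps / operational_hours


def meets_rate_requirement(mishaps, operational_hours, max_rate):
    """True if the observed rate does not exceed the specified maximum."""
    return mishap_rate(mishaps, operational_hours) <= max_rate


def meets_severity_requirement(hazard_severities, unacceptable=("Catastrophic",)):
    """True if no hazard is assigned an unacceptable severity category."""
    return not any(s in unacceptable for s in hazard_severities)


# Hypothetical figures: 1 mishap in 2,000,000 hours against a 1E-6 limit.
print(meets_rate_requirement(1, 2_000_000, 1e-6))          # True
print(meets_severity_requirement(["Critical", "Marginal"]))  # True
```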
A.4.3.3 Safety design requirements. The program manager, in concert with the chief
engineer and utilizing systems engineering and associated system safety professionals, should
establish specific safety design requirements for the overall system. The objective of safety
design requirements is to achieve acceptable mishap risk through a systematic application of
design guidance from standards, specifications, regulations, design handbooks, safety design
checklists, and other sources. Review these for safety design parameters and acceptance criteria
applicable to the system. Safety design requirements derived from the selected parameters, as
well as any associated acceptance criteria, are included in the system specification. Expand these
requirements and criteria for inclusion in the associated follow-on or lower level specifications.
See general safety system design requirements below.
a. Hazardous material use is minimized or eliminated, or associated mishap risks are
reduced, through design, including material selection or substitution. When using potentially
hazardous materials, select those materials that pose the least risk throughout the life cycle of the
system.
b. Hazardous substances, components, and operations are isolated from other activities,
areas, personnel, and incompatible materials.
c. Equipment is located so that access during operations, servicing, repair, or adjustment
minimizes personnel exposure to hazards (e.g., hazardous substances, high voltage,
electromagnetic radiation, and cutting and puncturing surfaces).
d. Protect power sources, controls, and critical components of redundant subsystems by
physical separation or shielding, or by other acceptable methods.
f. Consider safety devices that will minimize mishap risk (e.g., interlocks, redundancy,
fail safe design, system protection, fire suppression, and protective measures such as clothing,
equipment, devices, and procedures) for hazards that cannot be eliminated. Make provisions for
periodic functional checks of safety devices when applicable.
g. System disposal (including explosive ordnance disposal) and demilitarization are
considered in the design.
h. Implement warning signals to minimize the probability of incorrect personnel reaction
to those signals, and standardize within like types of systems.
i. Provide warning and cautionary notes in assembly, operation, and maintenance
instructions; and provide distinctive markings on hazardous components, equipment, and
facilities to ensure personnel and equipment protection when no alternate design approach can
eliminate a hazard. Use standard warning and cautionary notations where multiple applications
occur. Standardize notations in accordance with commonly accepted commercial practice or, if
none exists, normal military procedures. Do not use warning, caution, or other written advisory
as the only risk reduction method for hazards assigned to Catastrophic or Critical mishap severity
categories.
j. Safety critical tasks may require personnel proficiency; if so, the developer should
propose a proficiency certification process to be used.
k. Severity of injury or damage to equipment or the environment as a result of a mishap
is minimized.
l. Inadequate or overly restrictive requirements regarding safety are not included in the
system specification.
m. Acceptable risk is achieved in implementing new technology, materials, or designs in
an item's production, test, and operation. Changes to design, configuration, production, or
mission requirements (including any resulting system modifications and upgrades, retrofits,
insertions of new technologies or materials, or use of new production or test techniques) are
accomplished in a manner that maintains an acceptable level of mishap risk. Changes to the
environment in which the system operates are analyzed to identify and mitigate any resulting
hazards or changes in mishap risks.
A.4.3.3.1 Some program managers include the following conditions in their solicitation,
system specification, or contract as requirements for the system design. These condition
statements are used optionally as supplemental requirements based on specific program needs.
A.4.3.3.1.1 Unacceptable conditions. The following safety critical conditions are
considered unacceptable for development efforts. Positive action and verified implementation are
required to reduce the mishap risk associated with these situations to a level acceptable to the
program manager.
a. Single component failure, common mode failure, human error, or a design feature that
could cause a mishap of Catastrophic or Critical mishap severity categories.
b. Dual independent component failures, dual independent human errors, or a
combination of a component failure and a human error involving safety critical command and
control functions, which could cause a mishap of Catastrophic or Critical mishap severity
categories.
c. Generation of hazardous radiation or energy, when no provisions have been made to
protect personnel or sensitive subsystems from damage or adverse effects.
d. Packaging or handling procedures and characteristics that could cause a mishap for
which no controls have been provided to protect personnel or sensitive equipment.
e. Hazard categories that are specified as unacceptable in the development agreement.
A.4.3.3.1.2 Acceptable conditions. The following approaches are considered acceptable
for correcting unacceptable conditions and will require no further analysis once mitigating
actions are implemented and verified.
a. For non-safety critical command and control functions: a system design that requires
two or more independent human errors, or that requires two or more independent failures, or a
combination of independent failure and human error.
b. For safety critical command and control functions: a system design that requires at
least three independent failures, or three independent human errors, or a combination of three
independent failures and human errors.
c. System designs that positively prevent errors in assembly, installation, or connections
that could result in a mishap.
d. System designs that positively prevent damage propagation from one component to
another or prevent sufficient energy propagation to cause a mishap.
e. System design limitations on operation, interaction, or sequencing that preclude
occurrence of a mishap.
f. System designs that provide an approved safety factor, or a fixed design allowance that
limits, to an acceptable level, possibilities of structural failure or release of energy sufficient to
cause a mishap.
g. System designs that control energy build-up that could potentially cause a mishap
(e.g., fuses, relief valves, or electrical explosion proofing).
h. System designs where component failure can be temporarily tolerated because of
residual strength or alternate operating paths, so that operations can continue with a reduced but
acceptable safety margin.
i. System designs that positively alert the controlling personnel to a hazardous situation
where the capability for operator reaction has been provided.
j. System designs that limit or control the use of hazardous materials.
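Items a and b above amount to a count of the independent events a design must tolerate before a mishap can occur. A minimal sketch of that count follows. It assumes the analyst supplies the number of independent failures and independent human errors in the smallest combination of events that leads to the mishap; the function names are hypothetical.

```python
def required_independent_events(safety_critical):
    """Minimum independent failures and/or human errors needed to cause a
    mishap: three for safety critical command and control functions (item b),
    two for non-safety critical functions (item a)."""
    return 3 if safety_critical else 2


def is_acceptable_design(independent_failures, independent_human_errors,
                         safety_critical):
    """True if the smallest mishap-causing combination of independent events
    meets the threshold for the function's criticality."""
    total = independent_failures + independent_human_errors
    return total >= required_independent_events(safety_critical)


# One component failure plus one human error is enough for a mishap:
print(is_acceptable_design(1, 1, safety_critical=False))  # True
print(is_acceptable_design(1, 1, safety_critical=True))   # False
```

The check is only as good as the independence claim behind the counts; common mode failures would invalidate it.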
A.4.3.4 Elements of an effective system safety effort. Elements of an effective system
safety effort include:
a. Management is always aware of the mishap risks associated with the system, and
formally documents this awareness. Hazards associated with the system are identified, assessed,
tracked, monitored, and the associated risks are either eliminated or controlled to an acceptable
level throughout the life cycle. Identify and archive those actions taken to eliminate or reduce
mishap risk for tracking and lessons learned purposes.
b. Historical hazard and mishap data, including lessons learned from other systems, are
considered and used.
c. Environmental protection, safety, and occupational health, consistent with mission
requirements, are designed into the system in a timely, cost-effective manner. Inclusion of the
appropriate safety features is accomplished during the applicable phases of the system life cycle.
d. Mishap risk resulting from harmful environmental conditions (e.g., temperature,
pressure, noise, toxicity, acceleration, and vibration) and human error in system operation and
support is minimized.
e. System users are kept abreast of the safety of the system and included in the safety
decision process.
A.4.4 System safety engineering effort. As stated in section 4, a system safety
engineering effort consists of eight main requirements. The following paragraphs further
describe the efforts typically expected under each of the system safety requirements listed
in section 4.
A.4.4.1 Documentation of the system safety approach. The documentation of the system
safety approach should describe the planned tasks and activities of system safety management
and system engineering required to identify, evaluate, and eliminate or control hazards, or to
reduce the residual mishap risk to a level acceptable throughout the system life cycle. The
documentation should describe, as a minimum, the four elements of an effective system safety
effort: a planned approach for task accomplishment, qualified people to accomplish tasks, the
authority to implement tasks through all levels of management, and the appropriate commitment
of resources (both manning and funding) to ensure that safety tasks are completed. Specifically,
the documentation should:
a. Describe the scope of the overall system program and the related system safety effort.
Define system safety program milestones. Relate these to major program milestones, program
element responsibility, and required inputs and outputs.
b. Describe the safety tasks and activities of system safety management and engineering.
Describe the interrelationships between system safety and other functional elements of the
program. List the other program requirements and tasks applicable to system safety and
reference where they are specified or described. Include the organizational relationships
between other functional elements having responsibility for tasks with system safety impacts and
the system safety management and engineering organization including the review and approval
authority of those tasks.
c. Describe specific analysis techniques and formats to be used in qualitative or
quantitative assessments of hazards, their causes, and effects.
d. Describe the process through which management decisions will be made (for example,
timely notification of unacceptable risks, necessary action, incidents or malfunctions, waivers to
safety requirements, and program deviations). Include a description of how residual mishap risk
is formally accepted and this acceptance is documented.
e. Describe the mishap risk assessment procedures, including the mishap severity
categories, mishap probability levels, and the system safety design order of precedence that
should be followed to satisfy the safety requirements of the program. State any qualitative or
quantitative measures of safety to be used for mishap risk assessment including a description of
the acceptable and unacceptable risk levels (if applicable). Include system safety definitions that
modify, deviate from, or are in addition to those in this standard or generally accepted by the
system safety community (see Defense Acquisition Deskbook and System Safety Society's
System Safety Analysis Handbook) (see A.6.1).
f. Describe how resolution and action relative to system safety will be implemented at
the program management level possessing resolution authority.
g. Describe the verification (e.g., test, analysis, demonstration, or inspection)
requirements for ensuring that safety is adequately attained. Identify any certification
requirements for software, safety devices, or other special safety features (e.g., render safe and
emergency disposal procedures).
h. Describe the mishap or incident notification, investigation, and reporting process for
the program, including notification of the program manager.
i. Describe the approach for collecting and processing pertinent historical hazard,
mishap, and safety lessons learned data. Include a description of how a system hazard log is
developed and kept current (see A.4.4.8.1).
j. Describe how the user is kept abreast of residual mishap risk and the associated
hazards.
A.4.4.2 Identification of hazards. Identify hazards through a systematic hazard analysis
process encompassing detailed analysis of system hardware and software, the environment (in
which the system will exist), and the intended usage or application. Historical hazard and
mishap data, including lessons learned from other systems, are considered and used.
A.4.4.2.1 Approaches for identifying hazards. Numerous approaches have been
developed and used to identify system hazards. A key aspect of many of these approaches is
empowering the design engineer with the authority to design safe systems and the responsibility
to identify to program management the hazards associated with the design. Hazard identification
approaches often include using system users in the effort. Commonly used approaches for
identifying hazards can be found in the Defense Acquisition Deskbook and System Safety
Society's System Safety Analysis Handbook (see A.6.1).
A.4.4.3 Assessment of mishap risk. Assess the severity and probability of the mishap
risk associated with each identified hazard, i.e., determine the potential impact of the hazard on
personnel, facilities, equipment, operations, the public, or environment, as well as on the system
itself. Other factors, such as numbers of persons exposed, may also be used to assess risk.
A.4.4.3.1 Mishap risk assessment tools. To determine what actions to take to eliminate
or control identified hazards, a system of determining the level of mishap risk involved must be
developed. A good mishap risk assessment tool will enable decision makers to properly
understand the level of mishap risk involved, relative to what it will cost in schedule and dollars
to reduce that mishap risk to an acceptable level.
A.4.4.3.2 Tool development. The key to developing most mishap risk assessment tools
is the characterization of mishap risks by mishap severity and mishap probability. Since the
highest system safety design order of precedence is to eliminate hazards by design, a mishap risk
assessment procedure considering only mishap severity will generally suffice during the early
design phase to minimize the system's mishap risks (for example, just don't use hazardous or
toxic material in the design). When all hazards cannot be eliminated during the early design
phase, a mishap risk assessment procedure based upon the mishap probability as well as the
mishap severity provides a resultant mishap risk assessment. The assessment is used to establish
priorities for corrective action, resolution of identified hazards, and notification to management
of the mishap risks. The information provided here is a suggested tool and set of definitions that
can be used. Program managers can develop tools and definitions appropriate to their individual
programs.
A.4.4.3.2.1 Mishap severity. Mishap severity categories are defined to provide a
qualitative measure of the most reasonable credible mishap resulting from personnel error,
environmental conditions, design inadequacies, procedural deficiencies, or system, subsystem, or
component failure or malfunction. Suggested mishap severity categories are shown in Table A-I.
The dollar values shown in this table should be established on a system by system basis
depending on the size of the system being considered to reflect the level of concern.
TABLE A-I. Suggested mishap severity categories.
Description   Category  Environmental, Safety, and Health Result Criteria
Catastrophic  I         Could result in death, permanent total disability, loss
                        exceeding $1M, or irreversible severe environmental
                        damage that violates law or regulation.
Critical      II        Could result in permanent partial disability, injuries
                        or occupational illness that may result in
                        hospitalization of at least three personnel, loss
                        exceeding $200K but less than $1M, or reversible
                        environmental damage causing a violation of law or
                        regulation.
Marginal      III       Could result in injury or occupational illness
                        resulting in one or more lost work day(s), loss
                        exceeding $10K but less than $200K, or mitigatible
                        environmental damage without violation of law or
                        regulation where restoration activities can be
                        accomplished.
Negligible    IV        Could result in injury or illness not resulting in a lost
                        work day, loss exceeding $2K but less than $10K, or
                        minimal environmental damage not violating law or
                        regulation.
NOTE: These mishap severity categories provide guidance to a wide variety of programs.
However, adaptation to a particular program is generally required to provide a mutual
understanding between the program manager and the developer as to the meaning of the terms
used in the category definitions. Other risk assessment techniques may be used provided that
the user approves them.
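The dollar-loss criteria in Table A-I can be sketched as a simple threshold lookup. This is only an illustration under stated assumptions: the names Severity, LOSS_THRESHOLDS, and severity_from_loss are hypothetical, the thresholds are the suggested values from the table (a program would tailor them), and a loss exactly equal to a threshold is assumed to fall into the lower category because the table does not say. The other criteria (injury, environmental damage) are not modeled.

```python
from enum import Enum


class Severity(Enum):
    """Mishap severity categories as labeled in Table A-I."""
    CATASTROPHIC = "I"
    CRITICAL = "II"
    MARGINAL = "III"
    NEGLIGIBLE = "IV"


# Suggested dollar thresholds from Table A-I, highest first.
# Assumption: programs replace these with system-specific values.
LOSS_THRESHOLDS = [
    (1_000_000, Severity.CATASTROPHIC),
    (200_000, Severity.CRITICAL),
    (10_000, Severity.MARGINAL),
    (2_000, Severity.NEGLIGIBLE),
]


def severity_from_loss(loss_dollars):
    """Return the severity category for a dollar loss, or None at or below $2K.

    Assumption: a loss must strictly exceed a threshold to reach that category.
    """
    for threshold, category in LOSS_THRESHOLDS:
        if loss_dollars > threshold:
            return category
    return None
```

For example, severity_from_loss(250_000) yields Severity.CRITICAL. A fuller assessment would take the worst category across all of the table's criteria, not the dollar loss alone.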
A.4.4.3.2.2 Mishap probability. Mishap probability is the probability that a mishap
will occur during the planned life expectancy of the system. It can be described in terms of
potential occurrences per unit of time, events, population, items, or activity. Assigning a
quantitative mishap probability to a potential design or procedural hazard is generally not
possible early in the design process. At that stage, a qualitative mishap probability may be
derived from research, analysis, and evaluation of historical safety data from similar systems.
Supporting rationale for assigning a mishap probability is documented in hazard analysis
reports. Suggested qualitative mishap probability levels are shown in Table A-II.
TABLE A-II. Suggested mishap probability levels.
Description*  Level  Specific Individual Item                          Fleet or Inventory**
Frequent      A      Likely to occur often in the life of an item,     Continuously experienced.
                     with a probability of occurrence greater than
                     10^-1 in that life.
Probable      B      Will occur several times in the life of an        Will occur frequently.
                     item, with a probability of occurrence less
                     than 10^-1 but greater than 10^-2 in that life.
Occasional    C      Likely to occur some time in the life of an       Will occur several times.
                     item, with a probability of occurrence less
                     than 10^-2 but greater than 10^-3 in that life.
Remote        D      Unlikely but possible to occur in the life of     Unlikely, but can reasonably
                     an item, with a probability of occurrence less    be expected to occur.
                     than 10^-3 but greater than 10^-6 in that life.
Improbable    E      So unlikely, it can be assumed occurrence may     Unlikely to occur, but
                     not be experienced, with a probability of         possible.
                     occurrence less than 10^-6 in that life.
*Definitions of descriptive words may have to be modified based on quantity of items
involved.
**The expected size of the fleet or inventory should be defined prior to accomplishing an
assessment of the system.
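Where a quantitative probability estimate is available, the "Specific Individual Item" column of Table A-II can be sketched as a threshold function. The function name probability_level is hypothetical, and the handling of values exactly on a boundary (for example, exactly 10^-1) is an assumption, since the table only says "greater than" and "less than".

```python
def probability_level(p):
    """Map a probability of occurrence over an item's life to a Table A-II level.

    Assumption: a value exactly on a boundary is assigned the lower
    (less frequent) level, because the table leaves boundaries unspecified.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p > 1e-1:
        return "A"  # Frequent
    if p > 1e-2:
        return "B"  # Probable
    if p > 1e-3:
        return "C"  # Occasional
    if p > 1e-6:
        return "D"  # Remote
    return "E"      # Improbable
```

Early in design, when no quantitative estimate exists, the level would instead be assigned qualitatively from historical data, as the paragraph above describes.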
A.4.4.3.2.3 Mishap risk assessment. Mishap risk classification by mishap severity and
mishap probability can be performed by using a mishap risk assessment matrix. This
assessment allows one to assign a mishap risk assessment value to a hazard based on its mishap
severity and its mishap probability. This value is then often used to rank different hazards as to
their associated mishap risks. An example of a mishap risk assessment matrix is shown in
Table A-III.
TABLE A-III. Example mishap risk assessment values.
                              SEVERITY
PROBABILITY   Catastrophic  Critical  Marginal  Negligible
Frequent      1             3         7         13
Probable      2             5         9         16
Occasional    4             6         11        18
Remote        8             10        14        19
Improbable    12            15        17        20
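The example matrix in Table A-III can be expressed as a nested lookup keyed by probability and severity. The names RISK_MATRIX and risk_value are hypothetical; the values are copied from the example table, which a program may replace with its own.

```python
# Example mishap risk assessment values from Table A-III.
# Outer key: mishap probability level; inner key: mishap severity category.
RISK_MATRIX = {
    "Frequent":   {"Catastrophic": 1,  "Critical": 3,  "Marginal": 7,  "Negligible": 13},
    "Probable":   {"Catastrophic": 2,  "Critical": 5,  "Marginal": 9,  "Negligible": 16},
    "Occasional": {"Catastrophic": 4,  "Critical": 6,  "Marginal": 11, "Negligible": 18},
    "Remote":     {"Catastrophic": 8,  "Critical": 10, "Marginal": 14, "Negligible": 19},
    "Improbable": {"Catastrophic": 12, "Critical": 15, "Marginal": 17, "Negligible": 20},
}


def risk_value(probability, severity):
    """Return the mishap risk assessment value (1 is the highest risk)."""
    return RISK_MATRIX[probability][severity]
```

For instance, a Remote, Critical hazard receives the value 10. Because each cell holds a distinct value from 1 to 20, the values give a total ordering of the matrix cells.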
A.4.4.3.2.4 Mishap risk categories. Mishap risk assessment values are often used in
grouping individual hazards into mishap risk categories. Mishap risk categories are then used
to generate specific action such as mandatory reporting of certain hazards to management for
action or formal acceptance of the associated mishap risk. Table A-IV includes an example
listing of mishap risk categories and the associated assessment values. In the example, the
system management has determined that mishap risk assessment values 1 through 5 constitute
"High" risk while values 6 through 9 constitute "Serious" risk.
TABLE A-IV. Example mishap risk categories and mishap risk acceptance levels.
Mishap Risk        Mishap Risk  Mishap Risk Acceptance Level
Assessment Value   Category
1–5                High         Component Acquisition Executive
6–9                Serious      Program Executive Officer
10–17              Medium       Program Manager
18–20              Low          As directed
*Representative mishap risk acceptance levels are shown in the above table. Mishap risk
acceptance is discussed in paragraph A.4.4.7. The using organization must be consulted by the
corresponding levels of program management prior to mishap risk acceptance.
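The grouping in Table A-IV can be sketched as a range lookup from assessment value to risk category and representative acceptance level. The names RISK_CATEGORIES and risk_category are hypothetical, and the ranges are the example ones from the table rather than required values.

```python
# Example groupings from Table A-IV:
# (assessment values, mishap risk category, representative acceptance level).
RISK_CATEGORIES = [
    (range(1, 6), "High", "Component Acquisition Executive"),
    (range(6, 10), "Serious", "Program Executive Officer"),
    (range(10, 18), "Medium", "Program Manager"),
    (range(18, 21), "Low", "As directed"),
]


def risk_category(value):
    """Return (category, acceptance level) for a mishap risk assessment value."""
    for values, category, authority in RISK_CATEGORIES:
        if value in values:
            return category, authority
    raise ValueError(f"assessment value {value} is outside 1-20")
```

A value of 8, for example, maps to ("Serious", "Program Executive Officer").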
A.4.4.3.2.5 Mishap risk impact. The mishap risk impact is assessed, as necessary,
using other factors to discriminate between hazards having the same mishap risk value. One
might discriminate between hazards with the same mishap risk assessment value in terms of
mission capabilities, or social, economic, and political factors. Program management will
closely consult with the using organization on the decisions used to prioritize resulting actions.
A.4.4.3.3 Mishap risk assessment approaches. Commonly used approaches for assessing
mishap risk can be found in the Defense Acquisition Deskbook and System Safety Society's
System Safety Analysis Handbook (see A.6.1).
A.4.4.4 Identification of mishap risk mitigation measures. Identify potential mishap risk
mitigation alternatives and the expected effectiveness of each alternative or method. Mishap risk
mitigation is an iterative process that culminates when the residual mishap risk has been reduced
to a level acceptable to the appropriate authority.
A.4.4.4.1 Prioritize hazards for corrective action. Hazards should be prioritized so that
corrective action efforts can be focused on the most serious hazards first. A categorization of
hazards may be conducted according to the mishap risk potential they present.
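One way to picture this prioritization is a sort on the mishap risk assessment value, with a secondary key standing in for the "other factors" of A.4.4.3.2.5 when two hazards share a value. This is a minimal sketch: the Hazard class, the impact_rank field, and the prioritize function are hypothetical and not defined by this standard.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    risk_value: int       # mishap risk assessment value; 1 is the highest risk
    impact_rank: int = 0  # hypothetical tie-breaker; lower means more urgent


def prioritize(hazards):
    """Order hazards so the most serious are addressed first.

    Ties on risk_value are broken by impact_rank, an assumed stand-in for
    mission, social, economic, or political factors.
    """
    return sorted(hazards, key=lambda h: (h.risk_value, h.impact_rank))
```

In practice the tie-breaking judgment is made by program management in consultation with the using organization; the numeric rank here only records the outcome of that judgment.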
A.4.4.4.2 System safety design order of precedence (see 4.4). The ultimate goal of a
system safety program is to design systems that contain no hazards. However, since the nature
of most complex systems makes it impossible or impractical to design them completely hazard-free,
a successful system safety program often provides a system design where there exist no
hazards resulting in an unacceptable level of mishap risk. As hazard analyses are performed,
hazards will be identified that will require resolution. The system safety design order of
precedence defines the order to be followed for satisfying system safety requirements and
reducing risks. The alternatives for eliminating the specific hazard or controlling its associated
risk are evaluated so that an acceptable method for mishap risk reduction can be agreed to.
A.4.4.5 Reduction of mishap risk to an acceptable level. Reduce the system mishap risk
through a mitigation approach mutually agreed to by the developer, program manager and the
using organization.
A.4.4.5.1 Communication with associated test efforts. Residual mishap risk and
associated hazards must be communicated to the system test efforts for verification.
A.4.4.6 Verification of mishap risk reduction. Verify the mishap risk reduction and
mitigation through appropriate analysis, testing, or inspection. Document the determined
residual mishap risk. The program manager must ensure that the selected mitigation approaches
will result in the expected residual mishap risk. To provide this assurance, the system test effort
should verify the performance of the mitigation actions. New hazards identified during testing
must be reported to the program manager and the developer.
A.4.4.6.1 Testing for a safe design. Tests and demonstrations must be defined to
validate selected safety features of the system. Test or demonstrate safety critical equipment and
procedures to determine the mishap severity or to establish the margin of safety of the design.
Consider induced or simulated failures to demonstrate the failure mode and acceptability of
safety critical equipment. When it cannot be analytically determined whether the corrective
action taken will adequately control a hazard, conduct safety tests to evaluate the effectiveness of
the controls. Where costs for safety testing would be prohibitive, safety characteristics or
procedures may be verified by engineering analyses, analogy, laboratory test, functional
mockups, or subscale/model simulation. Integrate testing of safety systems into appropriate
system test and demonstration plans to the maximum extent possible.
A.4.4.6.2 Conducting safe testing. The program manager must ensure that test teams are
familiar with mishap risks of the system. Test plans, procedures, and test results for all tests
including design verification, operational evaluation, production acceptance, and shelf-life
validation should be reviewed to ensure that:
a. Safety is adequately demonstrated.
b. The testing will be conducted in a safe manner.
c. All additional hazards introduced by testing procedures, instrumentation, test
hardware, and test environment are properly identified and controlled.
A.4.4.6.3 Communication of new hazards identified during testing. Testing
organizations must ensure that hazards and safety discrepancies discovered during testing are
communicated to the program manager and the developer.
A.4.4.7 Review and acceptance of residual mishap risk by the appropriate authority.
Notify the program manager of identified hazards and residual mishap risk. For long duration
programs, incremental or periodic reporting should be used.
A.4.4.7.1 Residual mishap risk. The mishap risk that remains after all planned mishap
risk management measures have been implemented is considered residual mishap risk. Residual
mishap risk is documented along with the reason(s) for incomplete mitigation.
A.4.4.7.2 Residual mishap risk management. The program manager must know what
residual mishap risk exists in the system being acquired. For significant mishap risks, the
program manager is required to elevate reporting of residual mishap risk to higher levels of
appropriate authority (such as the Program Executive Officer or Component Acquisition
Executive) for action or acceptance. The program manager is encouraged to apply additional
resources or other remedies to help the developer satisfactorily resolve hazards providing
significant mishap risk. Table A-IV includes an example of a mishap risk acceptance level
matrix based on the mishap risk assessment value and mishap risk category.
A.4.4.7.3 Residual mishap risk acceptance. The program manager is responsible for
formally documenting the acceptance of the residual mishap risk of the system by the appropriate
authority. The program manager should update this residual mishap risk and the associated
hazards to reflect changes/modifications in the system or its use. The program manager and
using organization should jointly determine the updated residual mishap risk prior to acceptance
of the risk and system hazards by the risk acceptance authority, and should document the
agreement between the user and the risk acceptance authority.
A.4.4.8 Tracking hazards and residual mishap risk. Track hazards, their closures, and
residual mishap risk. A tracking system for hazards, their closures, and residual mishap risk
must be maintained throughout the system life cycle. The program manager must keep the
system user apprised of system hazards and residual mishap risk.
A.4.4.8.1 Process for tracking of hazards and residual mishap risk. Each system must
have a current log of identified hazards and residual mishap risk, including an assessment of the
residual mishap risk (see A.4.4.7). As changes are integrated into the system, this log is updated
to incorporate added or changed hazards and the associated residual mishap risk. The
Government must formally acknowledge acceptance of system hazards and residual mishap risk.
Users will be kept informed of hazards and residual mishap risk associated with their systems.
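The log described above could take many forms. The following is a minimal in-memory sketch, assuming hypothetical names (HazardRecord, HazardLog, upsert, accept, unaccepted) and one design assumption that this standard does not state: any change to a logged hazard clears its prior acceptance, so the updated residual mishap risk must be formally accepted again.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class HazardRecord:
    hazard_id: str
    description: str
    residual_risk_value: int
    accepted_by: Optional[str] = None  # authority that accepted the residual risk
    history: List[Tuple[date, str]] = field(default_factory=list)


class HazardLog:
    """Current log of identified hazards and residual mishap risk."""

    def __init__(self) -> None:
        self._records: Dict[str, HazardRecord] = {}

    def upsert(self, hazard_id, description, residual_risk_value, when, note):
        """Add a new hazard, or update one as changes are integrated."""
        rec = self._records.get(hazard_id)
        if rec is None:
            rec = HazardRecord(hazard_id, description, residual_risk_value)
            self._records[hazard_id] = rec
        else:
            rec.description = description
            rec.residual_risk_value = residual_risk_value
            rec.accepted_by = None  # assumption: a change requires re-acceptance
        rec.history.append((when, note))
        return rec

    def accept(self, hazard_id, authority, when):
        """Record formal acceptance of the residual mishap risk."""
        rec = self._records[hazard_id]
        rec.accepted_by = authority
        rec.history.append((when, f"Residual risk accepted by {authority}"))

    def unaccepted(self):
        """Return hazards whose residual risk has not been formally accepted."""
        return [r for r in self._records.values() if r.accepted_by is None]
```

The history list keeps a dated trail of each change and acceptance, which is one way to keep users informed of how a hazard's residual risk has evolved.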
A.4.4.8.1.1 Developer responsibilities for communications, acceptance, and tracking of
hazards and residual mishap risk. The developer (see 3.2.2) is responsible for communicating
information to the program manager on system hazards and residual mishap risk, including any
unusual consequences and costs associated with hazard mitigation. After attempting to eliminate
or mitigate system hazards, the developer will formally document and notify the program
manager of all hazards breaching thresholds set in the safety design criteria. At the same time,
the developer will also communicate the system residual mishap risk.
A.4.4.8.1.2 Program manager responsibilities for communications, acceptance, and
tracking of hazards and residual mishap risk. The program manager is responsible for
maintaining a log of all identified hazards and residual mishap risk for the system. The program
manager will communicate known hazards and associated risks of the system to all system
developers and users. As changes are integrated into the system, the program manager shall
update this log to incorporate added or changed hazards and the residual mishap risk identified
by the developer. The program manager is also responsible for informing system developers
about the program manager's expectations for handling of newly discovered hazards. The
program manager will evaluate new hazards and the resulting residual mishap risk, and either
recommend further action to mitigate the hazards, or formally document the acceptance of these
hazards and residual mishap risk. The program manager will evaluate the hazards and associated
residual mishap risk in close consultation and coordination with the ultimate end user, to assure
that the context of the user requirements, potential mission capability, and the operational
environment are adequately addressed. Copies of the documentation of the hazard and risk
acceptance will be provided to both the developer and the system user. Hazards for which the
program manager accepts responsibility for mitigation will also be included in the formal
documentation. For example, if the program manager decides to execute a special training
program to mitigate a potentially hazardous situation, this approach will be documented in the
formal response to the developer. Residual mishap risk and hazards must be communicated to
system test efforts for verification.
A.5 SPECIFIC REQUIREMENTS
A.5.1 Program manager responsibilities. The program manager must ensure that all
types of hazards are identified, evaluated, and mitigated to a level compliant with acquisition
management policy, federal (and state where applicable) laws and regulations, Executive Orders,
treaties, and agreements. The program manager should:
A.5.1.1 Establish, plan, organize, implement, and maintain an effective system safety
effort that is integrated into all life cycle phases.
A.5.1.2 Ensure that system safety planning is documented to provide all program
participants with visibility into how the system safety effort is to be conducted.
A.5.1.3 Establish definitive safety requirements for the procurement, development, and
sustainment of the system. The requirements should be set forth clearly in the appropriate
system specifications and contractual documents.
A.5.1.4 Provide historical safety data to developers.
A.5.1.5 Monitor the developer's system safety activities and review and approve
delivered data in a timely manner, if applicable, to ensure adequate performance and compliance
with safety requirements.
A.5.1.6 Ensure that the appropriate system specifications are updated to reflect results of
analyses, tests, and evaluations.
A.5.1.7 Evaluate new lessons learned for inclusion into appropriate databases and submit
recommendations to the responsible organization.
A.5.1.8 Establish system safety teams to assist the program manager in developing and
implementing a system safety effort.
A.5.1.9 Provide technical data on Government-furnished Equipment or Government-furnished
Property to enable the developer to accomplish the defined tasks.
A.5.1.10 Document acceptance of residual mishap risk and associated hazards.
A.5.1.11 Keep the system users apprised of system hazards and residual mishap risk.
A.5.1.12 Ensure the program meets the intent of the latest MIL-STD-882.
A.5.1.13 Ensure adequate resources are available to support the program system safety
effort.
A.5.1.14 Ensure system safety technical and managerial personnel are qualified and
certified for the job.
A.6 NOTES
A.6.1 DoD acquisition practices and safety analysis techniques. Information on DoD
acquisition practices and safety analysis techniques is available at the referenced Internet sites.
Nothing in the referenced information is considered binding or additive to the requirements
provided in this standard.
A.6.1.1 Defense Acquisition Deskbook. Wright-Patterson Air Force Base, Ohio:
Deskbook Joint Program Office.
A.6.1.2 System Safety Analysis Handbook. Unionville, VA: System Safety Society.
CONCLUDING MATERIAL
Custodians:            Preparing activity:
  Army - AV              Air Force - 40
  Navy - AS
  Air Force - 40       Project SAFT - 0038
Reviewing activities:
Army - AR, AT, CR, MI
Navy - EC, OS, SA, SH
Air Force - 10, 11, 13, 19
STANDARDIZATION DOCUMENT IMPROVEMENT PROPOSAL
INSTRUCTIONS
1. The preparing activity must complete blocks 1, 2, 3, and 8. In block 1, both the document number and revision letter
should be given.
2. The submitter of this form must complete blocks 4, 5, 6, and 7, and send to preparing activity.
3. The preparing activity must provide a reply within 30 days from receipt of the form.
NOTE: This form may not be used to request copies of documents, nor to request waivers, or clarification of
requirements on current contracts. Comments submitted on this form do not constitute or imply authorization to waive any
portion of the referenced document(s) or to amend contractual requirements.
I RECOMMEND A CHANGE:
1. DOCUMENT NUMBER
MIL-STD-882
2. DOCUMENT DATE (YYYYMMDD)
20000210
3. DOCUMENT TITLE
System Safety
4. NATURE OF CHANGE (Identify paragraph number and include proposed rewrite, if possible. Attach extra sheets as needed.)
5. REASON FOR RECOMMENDATION
6. SUBMITTER
a. NAME (Last, First, Middle Initial) b. ORGANIZATION
c. ADDRESS (Include zip code) d. TELEPHONE (Include Area Code)
(1) Commercial
(2) DSN
(if applicable)
7. DATE SUBMITTED
(YYYYMMDD)
8. PREPARING ACTIVITY
a. NAME
Headquarters, Air Force Materiel Command
System Safety Division
b. TELEPHONE (Include Area Code)
(1) Commercial (937) 257-6007
(2) DSN 787-6007
c. ADDRESS (Include Zip Code)
HQ AFMC/SES
4375 Chidlaw Road
Wright Patterson AFB, Ohio 45433-5006
IF YOU DO NOT RECEIVE A REPLY WITHIN 45 DAYS, CONTACT:
Defense Standardization Program Office (DLSC-LM)
8725 John J. Kingman Road, Suite 2533
Fort Belvoir, Virginia 22060-6621
Telephone 703 767-6888 DSN 427-6888
DD Form 1426, FEB 1999 (EG) PREVIOUS EDITION IS OBSOLETE. WHS/DIOR, Feb 99