What Is a Hazard Analysis?
A hazard analysis is a structured process for identifying conditions or events that could cause harm—to people, property, the environment, or mission success—and determining what the system must do to prevent or mitigate them. The output is not a report that gets filed. It’s a set of inputs to the requirements process: every identified hazard should eventually be addressed by at least one verifiable safety requirement.
That connection—from hazard to requirement—is where most programs succeed or fail. The analysis itself can be rigorous, the requirements can be well-formed, and the program can still ship an unsafe system if the two artifacts are managed in isolation and allowed to drift apart. Understanding hazard analysis means understanding both the methods and the traceability infrastructure that makes them operationally useful.
The Core Concept: Hazards, Causes, and Effects
Before comparing methods, it helps to be precise about terminology, because different standards use these terms inconsistently.
A hazard is a system state or condition that, combined with an initiating event or environmental factor, can lead to an accident. A hazard is not the accident itself—it’s the precondition. A pressurized fuel line near an ignition source is a hazard. An explosion is the accident.
A cause is what produces or enables the hazardous state. Causes can be hardware failures, software errors, human errors, environmental conditions, or design oversights.
An effect is the consequence of the accident if the hazard is not controlled. Effects are typically described in terms of severity—from negligible to catastrophic—and are used to prioritize which hazards demand the most rigorous mitigation.
A mitigation is any design feature, operational procedure, or constraint that reduces the probability or severity of the accident. Mitigations become, in practice, the basis for safety requirements.
Every hazard analysis method structures these four elements differently. The choice of method depends on the system type, the nature of likely failures, the maturity of the design, and the applicable standard (DO-178C, ISO 26262, IEC 61508, MIL-STD-882E, and others specify or prefer different methods).
Four Methods, Four Perspectives
FMEA: Bottom-Up Failure Analysis
Failure Mode and Effects Analysis (FMEA) starts at the component level and asks: for each component, what are its possible failure modes, and what effect does each failure mode have on the system?
The process is systematic and exhaustive. You enumerate every component, list every way it can fail (open circuit, short circuit, stuck closed, stuck open, degraded output, and so on), assign severity, occurrence, and detection ratings to each failure mode, and calculate a Risk Priority Number (RPN, the product of the three ratings) or equivalent ranking. High-RPN failure modes become candidates for design changes or explicit safety requirements.
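The ranking step can be sketched in a few lines. This is an illustrative model only: the component names, rating values, and the review threshold of 100 are assumptions, not values from any particular standard.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    component: str
    mode: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (certain detection) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: product of the three ratings.
        return self.severity * self.occurrence * self.detection

# Illustrative failure modes for a hypothetical fuel subsystem.
modes = [
    FailureMode("fuel valve", "stuck open", 9, 3, 4),
    FailureMode("fuel valve", "stuck closed", 6, 3, 2),
    FailureMode("pressure sensor", "degraded output", 7, 5, 6),
]

# Rank failure modes; high-RPN entries become requirement candidates.
for fm in sorted(modes, key=lambda m: m.rpn, reverse=True):
    flag = "REVIEW" if fm.rpn >= 100 else "ok"
    print(f"{fm.component}: {fm.mode}  RPN={fm.rpn}  {flag}")
```

In practice the ratings come from team consensus against calibrated scales, and the threshold for action is set by program policy rather than hard-coded.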
FMEA is well-suited to hardware-dominated systems with well-understood failure modes. It integrates naturally with reliability analysis (FMECA adds criticality analysis on top of FMEA) and maps cleanly to component-level requirements. Its weakness is that it handles interactions between components poorly. A system can pass FMEA with no high-severity single-point failures and still fail due to a combination of mid-severity failures that the component-by-component analysis never surfaces.
FTA: Top-Down Fault Logic
Fault Tree Analysis (FTA) inverts the direction. You start with an undesired top-level event—a hazardous system state—and decompose it into necessary and sufficient causes using Boolean logic (AND gates, OR gates, and their combinations). The result is a fault tree: a graphical model showing every combination of component failures and human errors that can produce the top-level event.
FTA is powerful for understanding accident causation at the system level and for identifying minimal cut sets—the smallest combinations of failures that lead to the top event. It’s a prerequisite or complement to probabilistic risk assessment (PRA) in nuclear, aerospace, and defense applications.
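The minimal-cut-set computation on a small tree can be sketched as follows. The tree structure and event names are illustrative assumptions; real tools handle much larger trees and compute cut-set probabilities as well.

```python
def cut_sets(node):
    """Return a set of frozensets: each is one minimal combination of
    basic events sufficient to cause the node's event."""
    if isinstance(node, str):            # basic event (leaf)
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(c) for c in children]
    if gate == "OR":                     # any child alone suffices
        result = set().union(*child_sets)
    else:                                # AND: combine one cut set per child
        result = {frozenset()}
        for cs in child_sets:
            result = {a | b for a in result for b in cs}
    # Keep only minimal sets (drop any proper superset of another).
    return {s for s in result if not any(t < s for t in result)}

# Top event: loss of cooling = pump fails OR (valve stuck AND backup fails)
tree = ("OR", ["pump_fail", ("AND", ["valve_stuck", "backup_fail"])])
print(cut_sets(tree))
# Minimal cut sets: {pump_fail} and {valve_stuck, backup_fail} — the
# single-point failure and the only two-failure combination.
```

The single-element cut set is the kind of finding that drives a redundancy requirement; the two-element set drives requirements on independence between the valve and its backup.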
The limitation of FTA is that it requires significant design maturity. You need to know what the system architecture looks like before you can construct a meaningful fault tree. Early in conceptual design, FTA is premature.
HAZOP: Process Deviation Analysis
Hazard and Operability Study (HAZOP) was developed for chemical process industries and remains the dominant method in oil and gas, pharmaceutical, and industrial automation. It applies a structured set of guide words—NO, MORE, LESS, AS WELL AS, PART OF, REVERSE, OTHER THAN—to process parameters (flow, temperature, pressure, composition) to generate deviations from design intent.
Each deviation is analyzed for its causes and consequences, and the team identifies whether existing safeguards are adequate or whether additional requirements are needed. HAZOP is a team-based method; it requires a facilitator, a scribe, and subject matter experts from operations, process engineering, and safety.
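The deviation-generation step is mechanical and easy to sketch: guide words crossed with process parameters. Which pairings are physically meaningful is a team judgment; the applicability exclusions below are illustrative assumptions.

```python
from itertools import product

GUIDE_WORDS = ["NO", "MORE", "LESS", "AS WELL AS",
               "PART OF", "REVERSE", "OTHER THAN"]
PARAMETERS = ["flow", "temperature", "pressure", "composition"]

# Not every pairing makes physical sense (e.g. REVERSE temperature);
# the team screens these out before the review sessions.
NOT_APPLICABLE = {("REVERSE", "temperature"), ("REVERSE", "pressure")}

deviations = [
    f"{gw} {param}"
    for gw, param in product(GUIDE_WORDS, PARAMETERS)
    if (gw, param) not in NOT_APPLICABLE
]
# Each remaining deviation is then worked by the team: causes,
# consequences, existing safeguards, and any new requirement.
print(len(deviations), "deviations to review")
```

The value of the method is not the list itself but the disciplined team walkthrough of every entry on it.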
HAZOP is thorough for piping-and-instrumentation-level analysis but does not scale naturally to software-intensive or cyber-physical systems where the failure modes are behavioral rather than parametric. Extensions like HAZOP for software exist but are less standardized.
STPA: Control-Theoretic Safety Analysis
Systems-Theoretic Process Analysis (STPA), developed at MIT by Nancy Leveson, takes a fundamentally different view of how accidents occur. Rather than modeling systems as collections of components that fail, STPA models systems as control structures—hierarchies of controllers, controlled processes, feedback loops, and actuators. Accidents are caused not by component failure alone but by inadequate control: a controller issues an unsafe control action, or a safe control action is not issued, or a control action is issued with wrong timing, duration, or order.
STPA identifies Unsafe Control Actions (UCAs) systematically and derives safety constraints directly from them. Those safety constraints become the basis for safety requirements. The method handles software, human operators, and organizational factors in a unified framework—something FMEA and FTA struggle to do.
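The UCA enumeration follows a fixed pattern: each control action is examined against the four ways control can be inadequate. A minimal sketch, where the control actions and the example constraint are illustrative assumptions:

```python
from itertools import product

# The four standard ways a control action can be unsafe (per STPA).
UCA_TYPES = [
    "not provided when needed",
    "provided in an unsafe context",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
]

# Hypothetical control actions from a controller in the control structure.
control_actions = ["open relief valve", "engage brake"]

# Enumerate candidate UCAs; the analysis team keeps those that can lead
# to a hazard and writes a safety constraint for each one kept.
ucas = [f"'{action}' {uca_type}"
        for action, uca_type in product(control_actions, UCA_TYPES)]
for uca in ucas:
    print("UCA candidate:", uca)

# A kept UCA inverts into a constraint, e.g.:
# "The controller shall provide 'open relief valve' whenever pressure
#  exceeds the safe limit."  (wording and values are placeholders)
```

Each retained UCA also gets context ("when needed" under what conditions?), and the inverted constraint is what flows into the requirements database.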
STPA is increasingly favored in autonomous systems, aerospace, automotive (particularly ADAS), and defense programs where software and human-machine interaction dominate the accident space. Its learning curve is steeper than FMEA's, and its outputs require some translation before they map cleanly into traditional requirement formats. But for complex, software-intensive systems, it produces more relevant safety requirements than any purely hardware-failure-oriented method.
The Relationship Between Hazard Analysis and Safety Requirements
A hazard analysis that does not produce traceable safety requirements has limited engineering value. It may satisfy an audit, but it does not improve the safety of the system.
The connection works like this:
- The hazard analysis identifies a hazard (or unsafe control action, or fault, depending on method).
- A mitigation strategy is determined—avoid, eliminate, reduce probability, reduce severity, detect and respond.
- The mitigation strategy becomes a safety requirement: a verifiable statement of what the system must do or must not do.
- That requirement is allocated to a subsystem or component.
- Design and verification activities close the loop by demonstrating compliance.
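The chain above can be expressed as linked records with a closure check. This is a minimal sketch; the field names, IDs, and requirement text are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class Requirement:
    req_id: str
    text: str
    allocated_to: str                                # subsystem or component
    verified_by: list = field(default_factory=list)  # test IDs

@dataclass
class Hazard:
    hazard_id: str
    description: str
    mitigations: list = field(default_factory=list)  # Requirement objects

    def is_closed(self) -> bool:
        """Closed only if at least one mitigation exists and every
        mitigation is both allocated and verified."""
        return bool(self.mitigations) and all(
            r.allocated_to and r.verified_by for r in self.mitigations
        )

r1 = Requirement("SR-012",
                 "The system shall isolate the fuel line within 100 ms "
                 "of detecting an ignition-source fault.",
                 allocated_to="fuel-control", verified_by=["TC-88"])
h1 = Hazard("HAZ-004", "Pressurized fuel line near ignition source", [r1])
print(h1.is_closed())  # True: mitigated, allocated, and verified
```

The point of the sketch is the closure predicate: a hazard with no mitigation, an unallocated requirement, or an unverified requirement all leave the chain open, which is exactly what a reconciliation review looks for.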
This chain is typically managed in a hazard tracking log or safety analysis report alongside a requirements management tool. In most organizations, those two artifacts live in different places—an Excel sheet or dedicated safety analysis tool for the hazard log, and IBM DOORS, Jama Connect, or Polarion for requirements. The gap between them is maintained manually, through document references, change control processes, and periodic reconciliation reviews.
Manual maintenance of this gap is where programs fail. A hazard is added late in the program, a mitigation is identified, and a requirement is drafted—but it never gets formally linked in the requirements database. A requirement is deleted or modified during a design change, and no one updates the hazard log to reflect that a mitigation is now partially or fully absent. The artifacts diverge. The safety case degrades silently.
How Modern Tools Implement the Hazard-to-Requirement Link
The underlying problem is representational: document-based and table-based requirements tools treat requirements as rows in a spreadsheet, linked by text references. Hazard analysis outputs are similarly structured. Connecting them requires a layer of manual bookkeeping that does not scale and does not survive schedule pressure.
Graph-based tools model both requirements and their relationships as nodes and edges in a connected structure. A hazard, a mitigation, a requirement, a test, and a design element can all be nodes; the relationships between them—“hazard X is mitigated by requirement Y,” “requirement Y is verified by test Z,” “test Z is allocated to subsystem W”—are explicit, typed edges. Queries and impact analyses run on the graph, not on text searches through flat tables.
This is the architectural difference that matters for hazard-to-requirement traceability. When a hazard is modified, a graph-based model can surface every requirement that addresses it and every test that verifies those requirements. When a requirement is deleted, the model can flag any hazards that no longer have complete mitigation coverage.
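Both queries described above fall out of a typed-edge representation almost for free. A minimal sketch using plain tuples, where the node IDs and edge types are illustrative assumptions:

```python
# Typed edges: (source, relationship, target).
edges = [
    ("HAZ-004", "mitigated_by", "SR-012"),
    ("HAZ-004", "mitigated_by", "SR-013"),
    ("SR-012", "verified_by", "TC-88"),
    ("SR-013", "verified_by", "TC-91"),
]

def targets(src, rel):
    return [t for s, r, t in edges if s == src and r == rel]

def impact_of_hazard_change(hazard):
    """Everything downstream of a modified hazard: the requirements
    that address it and the tests that verify those requirements."""
    reqs = targets(hazard, "mitigated_by")
    tests = [t for r in reqs for t in targets(r, "verified_by")]
    return reqs, tests

def uncovered_hazards(deleted_req):
    """Hazards left with no mitigation if deleted_req were removed."""
    remaining = [e for e in edges if e[2] != deleted_req]
    hazards = {s for s, r, _ in edges if r == "mitigated_by"}
    return {h for h in hazards
            if not any(s == h and r == "mitigated_by"
                       for s, r, _ in remaining)}

print(impact_of_hazard_change("HAZ-004"))  # (['SR-012', 'SR-013'], ['TC-88', 'TC-91'])
print(uncovered_hazards("SR-012"))         # set(): SR-013 still covers HAZ-004
```

Real tools add versioning, attributes on nodes and edges, and transitive queries, but the core mechanism is the same: impact analysis is graph traversal, not text search.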
Flow Engineering, built specifically for hardware and systems engineering teams, models this relationship natively. Hazard analysis outputs can be represented as nodes with explicit typed links to the safety requirements that address them. When the design evolves, the graph makes the impact visible immediately—which mitigations are affected, which requirements need review, which verification activities need to be updated. The hazard log and the requirement set are not separate documents maintained in parallel; they are different views of the same connected model.
This matters most during system evolution. Requirements churn is highest during PDR-to-CDR, exactly when safety analysis outputs are being incorporated. A tool that can propagate change impact through a hazard-to-requirement-to-verification chain in one operation reduces the audit burden and reduces the risk that a mitigation quietly disappears.
Practical Starting Points
If your program is early in conceptual design and requirements are not yet baselined, start with STPA. The method forces clarity about what the system is supposed to control and what control failures look like. Its outputs—safety constraints derived from unsafe control actions—translate directly into system-level requirements and are well-suited to AI-enabled and software-intensive systems.
If your program has a defined hardware architecture and you need component-level failure analysis for a reliability or safety case, use FMEA. Supplement it with FTA for system-level failure combinations if your standard requires probabilistic analysis.
If you are working on a process system with well-defined parameters and established P&IDs, HAZOP is the appropriate primary method.
Regardless of method, build the traceability link from hazard to requirement before the program reaches PDR. A hazard log that accumulates without being connected to a living requirement set will not improve at CDR—it will only be harder to reconcile.
The hazard analysis method is a choice. The link to requirements is not optional.