How Do You Write Requirements for a System That Needs to Fail Safely When You Don’t Yet Know All the Ways It Can Fail?

Start with the honest answer: you cannot fully write those requirements. Not at the beginning. Anyone who tells you otherwise is either working on a thoroughly understood system class or has confused documentation completeness with engineering completeness. The challenge at the frontier of safety engineering — autonomous vehicles, medical robotics, industrial autonomous systems, next-generation avionics — is precisely that the hazard space is not fully visible when requirements work must begin. You have partial knowledge, growing models, and a regulatory obligation to have a coherent safety argument by the time you ship.

That tension is not a process failure. It is the actual nature of safety-critical systems development, and the engineering discipline of functional safety exists specifically to manage it. The question is not how to avoid writing requirements before you know everything. The question is how to write requirements that are valid under current knowledge, structured to accept updates, and traceable in a way that doesn’t collapse when the hazard picture changes.

What Hazard Analysis Actually Does (and Doesn’t) Give You

Hazard analysis is the starting point because it converts abstract risk into specific, bounded obligations. But it is worth being precise about what different methods produce.

Preliminary Hazard Analysis (PHA) is the earliest-stage tool. You apply it before detailed design, using system-level functional descriptions and operational context. PHA identifies hazardous states and conditions at a coarse level — not the component failures that cause them, but the system-level situations that are unacceptable. A PHA for a surgical robot might identify “unintended instrument motion during incision” as a hazardous event, without yet knowing which software fault, sensor failure, or communication dropout could produce it. PHA gives you a hazard list and a preliminary severity and likelihood assessment. It does not give you complete requirements, but it gives you the right questions.

Hazard Analysis and Risk Assessment (HARA), as defined in ISO 26262, is the automotive-domain formalization of this process. HARA classifies each hazardous event by severity (S), exposure (E), and controllability (C), and from that classification assigns an Automotive Safety Integrity Level (ASIL), ranging from A (least stringent) to D (most stringent). A QM (Quality Management) result means that standard quality processes suffice and no ASIL is assigned. The output of HARA is not a list of requirements. It is a list of safety goals, each carrying an ASIL, each stating what the system must do or avoid at the functional level to prevent an unacceptable hazard.
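
As a rough illustration of how the S/E/C classification turns into an ASIL, the sketch below uses the widely cited “sum” shortcut for the ISO 26262-3 determination table: with classes S1 to S3, E1 to E4, and C1 to C3, index sums of 7, 8, 9, and 10 correspond to ASIL A, B, C, and D, and anything lower is QM. The function name determine_asil is an assumption made for this example, and the normative table in the standard remains the authority.

```python
# Illustrative helper (hypothetical name): ASIL lookup via the "sum" shortcut.
# Class indices: severity S0-S3, exposure E0-E4, controllability C0-C3.

def determine_asil(s: int, e: int, c: int) -> str:
    """Return 'QM', 'A', 'B', 'C', or 'D' for the given S/E/C class indices."""
    if not (0 <= s <= 3 and 0 <= e <= 4 and 0 <= c <= 3):
        raise ValueError("expected S0-S3, E0-E4, C0-C3")
    # A class of 0 on any axis means no ASIL is assigned.
    if 0 in (s, e, c):
        return "QM"
    # Sums of 7..10 map to A..D; lower sums fall back to QM.
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(s + e + c, "QM")


print(determine_asil(3, 4, 3))  # D
print(determine_asil(2, 3, 2))  # A
print(determine_asil(1, 2, 3))  # QM
```

Because exposure and controllability are often estimated early and revised later, keeping the lookup as a function rather than a hand-filled column makes it cheap to recompute the ASIL whenever a rating changes.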

That distinction matters. Safety goals like “the system shall not apply more than X Nm of torque to joint Y without operator confirmation” are functional-level obligations that predate any architectural decision. They are the top of the requirements hierarchy. Everything below them — functional safety requirements, technical safety requirements, hardware and software safety requirements — must be traceable to a safety goal. And each safety goal is traceable to a specific hazard identified in HARA.

This is the first traceability chain that must be established and maintained: hazard → safety goal → safety requirements hierarchy.
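
The chain can be checked mechanically once each item records its upstream links. The following is a minimal sketch under assumed names (Hazard, SafetyGoal, SafetyRequirement, find_trace_gaps, and all identifiers and texts in the sample data are hypothetical); it reports hazards with no safety goal, goals with no known hazard, and requirements that never reach a goal.

```python
from dataclasses import dataclass, field


@dataclass
class Hazard:
    id: str
    description: str


@dataclass
class SafetyGoal:
    id: str
    text: str
    hazard_ids: list = field(default_factory=list)


@dataclass
class SafetyRequirement:
    id: str
    text: str
    parent_ids: list = field(default_factory=list)  # goal or requirement ids


def find_trace_gaps(hazards, goals, requirements):
    """Return human-readable descriptions of broken trace links."""
    hazard_ids = {h.id for h in hazards}
    goal_ids = {g.id for g in goals}
    reqs_by_id = {r.id: r for r in requirements}
    covered = {hid for g in goals for hid in g.hazard_ids}

    def reaches_goal(req):
        # Walk parent links upward until a safety goal is found.
        stack, seen = [req.id], set()
        while stack:
            rid = stack.pop()
            if rid in seen:
                continue
            seen.add(rid)
            for pid in reqs_by_id[rid].parent_ids:
                if pid in goal_ids:
                    return True
                if pid in reqs_by_id:
                    stack.append(pid)
        return False

    gaps = [f"{h.id}: hazard has no safety goal"
            for h in hazards if h.id not in covered]
    gaps += [f"{g.id}: safety goal traces to no known hazard"
             for g in goals if not any(hid in hazard_ids for hid in g.hazard_ids)]
    gaps += [f"{r.id}: requirement does not trace to any safety goal"
             for r in requirements if not reaches_goal(r)]
    return gaps


# Sample data is invented for illustration only.
hazards = [Hazard("H1", "unintended instrument motion during incision"),
           Hazard("H2", "operator acts on a stale video frame")]
goals = [SafetyGoal("SG1", "limit joint torque without confirmation", ["H1"])]
requirements = [SafetyRequirement("FSR1", "monitor commanded torque", ["SG1"]),
                SafetyRequirement("TSR1", "torque monitor runs every cycle", ["FSR1"]),
                SafetyRequirement("TSR2", "log torque values", [])]

gaps = find_trace_gaps(hazards, goals, requirements)
for gap in gaps:
    print(gap)
# H2: hazard has no safety goal
# TSR2: requirement does not trace to any safety goal
```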

The Iterative Nature of Safety Requirements Development

Here is where the process becomes uncomfortable for teams accustomed to linear V-model thinking. The safety requirements hierarchy is not written top-down in sequence. It is built iteratively as the design matures and as analysis tools reveal failure modes that were not visible at the outset.

FMEA (Failure Mode and Effects Analysis) operates at the component or function level. For each element of the system, whether hardware component, software function, or interface, you systematically enumerate possible failure modes, their effects at the system level, and their severity. FMEA surfaces failure modes that feed back into the safety analysis. A failure mode that leads to an already identified hazardous state must be covered by a safety requirement tracing to the corresponding safety goal, and may show that the goal itself needs refinement. A failure mode whose system-level effect is not covered by any identified hazard means the PHA or HARA was incomplete: the hazard analysis must be updated, a new safety goal added where warranted, and the safety case revised.
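
One way to make that feedback loop concrete is to record, for each FMEA row, which identified hazard its system-level effect maps to. The sketch below is illustrative only: the FmeaRow fields, the 1 to 10 severity scale, the threshold of 7, and the sample rows are all assumptions, not part of any standard worksheet format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FmeaRow:
    element: str
    failure_mode: str
    system_effect: str
    severity: int                     # assumed scale: 1 (negligible) to 10 (catastrophic)
    hazard_id: Optional[str] = None   # hazard from PHA/HARA this effect maps to, if any


def uncovered_effects(rows, known_hazard_ids, min_severity=7):
    """Rows with a severe effect that maps to no identified hazard."""
    return [r for r in rows
            if r.severity >= min_severity and r.hazard_id not in known_hazard_ids]


rows = [
    FmeaRow("joint encoder", "stuck value", "unintended instrument motion", 9, "H1"),
    FmeaRow("camera link", "frame freeze", "operator acts on stale image", 8),
    FmeaRow("status LED", "stays off", "no visible power indication", 2),
]

for row in uncovered_effects(rows, known_hazard_ids={"H1"}):
    print(f"{row.element} / {row.failure_mode}: candidate for HARA update")
# camera link / frame freeze: candidate for HARA update
```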

FTA (Fault Tree Analysis) works top-down from an undesired top-level event — typically derived from your hazard list — and decomposes it into contributing causes using Boolean logic. FTA tells you the combinations of component failures or software faults that could produce a system-level hazard. This is where single-point failures, common-cause failures, and latent fault chains become visible. Each identified fault path must be addressed either by design (eliminating the path) or by a safety requirement that detects, tolerates, or mitigates the fault.
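
The Boolean decomposition can be sketched in a few lines. The example below represents a fault tree as nested (gate, children) tuples, expands it into cut sets, and keeps only the minimal ones; any minimal cut set of size one is a single-point failure. The tree shape and event names are invented for illustration, and a real FTA tool would also handle probabilities and common-cause groupings that this sketch ignores.

```python
from itertools import product


def cut_sets(node):
    """Expand a fault tree into cut sets (sets of basic events)."""
    if isinstance(node, str):            # basic event
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                     # any child alone causes the event
        return set().union(*child_sets)
    if gate == "AND":                    # one cut set from every child is needed
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree for the top event "unintended instrument motion".
tree = ("OR", [
    ("AND", ["encoder_fault", "watchdog_fault"]),
    "motor_driver_short",
    ("AND", ["encoder_fault", "motor_driver_short"]),
])

mcs = minimal_cut_sets(tree)
single_points = sorted(next(iter(s)) for s in mcs if len(s) == 1)
print(single_points)  # ['motor_driver_short']
```

Each minimal cut set is a fault path in the sense used above: it needs either a design change that removes it or a safety requirement that detects, tolerates, or mitigates it.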

HAZOP (Hazard and Operability Study) applies structured guideword-based analysis — more, less, none, reverse, other than — to process flows, data flows, or operational sequences. It is particularly effective for revealing hazards that arise not from component failures but from operational variability, boundary conditions, and unexpected interactions. HAZOP consistently surfaces failure modes that FMEA misses because FMEA focuses on component failures while HAZOP focuses on process deviations.
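
The guideword mechanism is simple enough to sketch: each guideword is crossed with each parameter of each flow to produce a deviation prompt for the review team. The flow and parameter names below are hypothetical, and the guideword list is just the subset named above; many of the generated prompts will be meaningless for a given parameter and are simply discarded in the session.

```python
from itertools import product

GUIDEWORDS = ["more", "less", "none", "reverse", "other than"]


def deviation_prompts(flows):
    """flows: mapping of flow name -> list of parameters to examine."""
    prompts = []
    for flow, params in flows.items():
        for param, guideword in product(params, GUIDEWORDS):
            prompts.append(f"{flow} / {param}: {guideword.upper()}")
    return prompts


# Hypothetical data flow for illustration.
flows = {"torque command (controller -> joint drive)": ["magnitude", "update rate"]}

prompts = deviation_prompts(flows)
print(len(prompts))   # 10
print(prompts[0])     # torque command (controller -> joint drive) / magnitude: MORE
```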

None of these methods is a one-time exercise. As the design evolves, as the architecture gets detailed, as interfaces are specified and subsystems are allocated, new failure modes emerge. Each one is a potential driver of new safety requirements or a revision to existing ones.

This is the core challenge: the safety requirements set is a living artifact, not a baseline that gets frozen early. And the safety argument — the structured claim that the system meets its safety goals — must remain coherent as that artifact evolves.

The Traceability Problem at Scale

Consider what “traceability” means in a real safety program. You have a set of identified hazards. Each hazard maps to one or more safety goals. Each safety goal is allocated to one or more functional safety requirements. Each functional safety requirement is refined into technical safety requirements at the system level, then allocated to hardware and software requirements. The software requirements drive verification tests. The hardware requirements drive qualification activities. Every one of those links is a claim in the safety argument.

Now a mid-program FMEA reveals a latent common-cause failure mode that was not captured in the original HARA. The analysis team updates the HARA, adds a new safety goal, and derives two new functional safety requirements. One of those requirements affects an interface that was already specified. The interface requirements must be updated. The software safety requirements that handle that interface must be reviewed. The test cases that verify those requirements must be extended.

In a document-based requirements environment — Word documents, spreadsheets, or legacy tools like IBM DOORS with requirements stored as text in a flat hierarchy — this ripple is invisible unless someone manually traces it. Engineers often don’t trace it fully, not because they are careless, but because the tooling makes it prohibitively laborious. The result is a safety case with silent gaps: requirements that were valid against the original hazard set but have not been updated to reflect the current one.

This is not a hypothetical failure mode. It is one of the most common findings in automotive, aerospace, and medical device safety audits. The hazard analysis and the requirements set have drifted apart.

How a Living Requirements Platform Changes This

A requirements platform built around graph-based traceability rather than document structure handles this differently. When a safety goal is updated or a new one added, the links to dependent requirements are visible immediately. Impact analysis is not a manual review — it is a query against a traceability graph. Which requirements trace to this safety goal? Which are now potentially affected by this change? Which test cases need review?
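
Those questions amount to a reachability search over the trace graph. The sketch below uses a plain dictionary and a breadth-first traversal; it is not any particular tool's API, and the identifiers (SG-2, FSR-7, TC-101, and so on) are invented for illustration.

```python
from collections import deque

# Edges point downstream: from an item to the items derived from,
# allocated from, or verifying it. All identifiers are hypothetical.
TRACE = {
    "SG-2": ["FSR-7", "FSR-8"],
    "FSR-7": ["TSR-21"],
    "FSR-8": ["IF-4"],
    "IF-4": ["SWR-55"],
    "TSR-21": ["TC-101"],
    "SWR-55": ["TC-130", "TC-131"],
}


def impacted(graph, changed):
    """Return every item reachable downstream of the changed item."""
    seen, queue, order = {changed}, deque([changed]), []
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = impacted(TRACE, "SG-2")
tests_to_review = [item for item in affected if item.startswith("TC-")]
print(tests_to_review)  # ['TC-101', 'TC-130', 'TC-131']
```

The result is a list of items that are potentially affected, not items known to be wrong; an engineer still has to review each one, but nothing is missed because someone forgot to follow a link.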

Flow Engineering is designed around exactly this model. Rather than treating requirements as text stored in a document hierarchy, it represents the entire requirements set as a connected model — nodes (requirements, hazards, safety goals, test cases, design elements) linked by typed relationships (derived-from, allocated-to, verified-by, mitigated-by). When the hazard analysis is updated, the downstream impact propagates through the graph. Engineers can see which parts of the safety case are potentially invalidated without running a manual impact assessment.

This matters most not at the beginning of a program, when the requirements set is small and the connections are manageable, but at the mid-to-late stages, when there are thousands of requirements and an even larger number of trace links between them. That is when safety teams are most vulnerable to traceability drift, and when the gap between “we have a document” and “we have a coherent safety argument” becomes consequential.

Flow Engineering also supports the iterative analysis cycle directly. As FMEA worksheets, FTA models, and HAZOP reports are developed, their outputs (identified failure modes, fault paths, process deviations) can be linked directly to the requirements they drive. The requirement change history is preserved. The reason a requirement was added, modified, or tightened is part of the record. That audit trail is a core part of the evidence behind the safety argument.

A deliberate trade-off in Flow Engineering’s design: it is not a full-function process safety analysis tool. It does not replace dedicated FMEA software or FTA modeling tools. It is the connective layer between those analyses and the requirements set — the place where the outputs of analysis become requirements, and where those requirements remain traceable as the analysis evolves.

Practical Starting Points for Safety Requirements Teams

The following sequence will not eliminate the uncertainty inherent in early-stage safety programs, but it structures the work so that uncertainty is managed rather than ignored.

Start with PHA before you have a design. Identify the hazardous states your system must never enter. Write them down as candidate safety goals even if they are rough. These are your top-level obligations. Everything downstream must trace to them.

Run HARA (or an equivalent severity/exposure/controllability classification) as soon as you have enough operational context. Even a first-pass HARA with partially estimated exposure and controllability ratings gives you an ASIL or integrity level that drives the rigor of subsequent analysis. It is better to start with a defensible estimate and update it than to wait for complete information that will never arrive on schedule.

Use FMEA and FTA in parallel with architectural definition, not after it. The value of FMEA is highest when it can still influence the design. An FMEA run after the architecture is frozen is mostly documentation. An FMEA run during architecture definition can eliminate single-point failures before they become embedded.

Reserve HAZOP for interfaces, operational sequences, and data flows. HAZOP is expensive and slow when applied to everything. It pays the most when applied to the system interfaces and operational modes where FMEA is weakest — the places where failure arises from interaction and context rather than component fault.

Set explicit re-analysis triggers. Define in your safety plan the conditions that require the hazard analysis to be revisited: architectural changes above a defined scope, new use cases, new failure mode evidence from field data, significant changes to operational context. Do not rely on engineering judgment alone to determine when re-analysis is needed.
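
Triggers are easier to enforce when they are written as explicit rules evaluated against each change record rather than left as prose in the safety plan. The sketch below is one possible shape; the ChangeRecord fields, the component-count threshold, and the trigger names are assumptions chosen to mirror the conditions listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    id: str
    components_touched: int
    adds_use_case: bool = False
    new_field_failure_evidence: bool = False
    changes_operational_context: bool = False


# Hypothetical scope threshold; a real value belongs in the safety plan.
MAX_COMPONENTS_WITHOUT_REANALYSIS = 2

TRIGGERS = {
    "architectural change above defined scope":
        lambda c: c.components_touched > MAX_COMPONENTS_WITHOUT_REANALYSIS,
    "new use case": lambda c: c.adds_use_case,
    "new failure mode evidence from field data":
        lambda c: c.new_field_failure_evidence,
    "change to operational context": lambda c: c.changes_operational_context,
}


def fired_triggers(change):
    """Names of all re-analysis triggers that this change record fires."""
    return [name for name, rule in TRIGGERS.items() if rule(change)]


change = ChangeRecord("CR-12", components_touched=3, adds_use_case=True)
print(fired_triggers(change))
# ['architectural change above defined scope', 'new use case']
```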

Maintain traceability as a live artifact, not a release artifact. The safety case is not valid at baseline and then questionable until the next release. If the requirements platform cannot show you current traceability against the current hazard analysis, the gap between your documentation and your engineering reality is already growing.

The Honest Summary

Writing safety requirements for a system whose failure modes you don’t fully know is not a problem that better process alone solves. Uncertainty is the actual condition. The discipline is in building a requirements structure that is honest about what is known, traceable to the current hazard analysis, and designed to accept updates without losing coherence.

The methods — PHA, HARA, FMEA, FTA, HAZOP — are not a sequence to complete and file. They are an ongoing investigation. Each one surfaces failure modes that drive requirements. The requirements are only as good as the analysis behind them, and the analysis is never complete until the system is retired.

What you can control is whether the connection between analysis and requirements is visible, maintained, and auditable. That is the operational meaning of a coherent safety argument.
