What Is Fault Tree Analysis (FTA)? Top-Down Safety Reasoning for Complex Systems
When a safety-critical system fails catastrophically, the post-incident investigation almost always reveals the same pattern: not a single failure, but a combination of failures that individually seemed tolerable. Fault Tree Analysis (FTA) is the engineering discipline that finds those combinations before the incident.
FTA is a deductive analytical technique. You start with a specific undesired top-level event—loss of thrust, uncontrolled decompression, reactor SCRAM failure—and work backward, systematically asking: what combinations of failures could cause this? The result is a directed graph, the fault tree, that maps the logical pathways from root causes to catastrophe.
That framing—deductive, top-down, structured—distinguishes FTA from most other safety analysis methods and determines where it fits in a safety program.
The Core Mechanics: Gates, Events, and Logic
A fault tree is not a flowchart or a block diagram. It is a Boolean logic model expressed graphically. Two gate types carry almost all of the analytical weight.
AND Gates
An AND gate means all inputs must occur simultaneously for the output event to occur. In probabilistic terms, AND gates multiply failure probabilities together (assuming independence), which dramatically reduces the probability of the top event. This is the logic that makes redundancy work: if two independent channels each have a failure probability of 10⁻³ per flight hour, an AND gate connecting them yields 10⁻⁶ per flight hour—three orders of magnitude improvement.
AND gates represent architectural safety. When a fault tree is dominated by AND gates near the top, the system has genuine defense in depth.
OR Gates
An OR gate means any single input is sufficient to cause the output event. OR gates add failure probabilities (approximately, for small values), and they represent system fragility. A long OR gate feeding a high-severity top event is a red flag: any one of several failures can independently produce the catastrophe.
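The two gate rules can be sketched in a few lines of Python. This is a minimal illustration that assumes statistically independent inputs; the function names are hypothetical, and the three-input OR example uses made-up probabilities.

```python
from math import prod

def and_gate(probabilities):
    """Probability that all independent input events occur."""
    return prod(probabilities)

def or_gate(probabilities):
    """Exact probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

def or_gate_rare_event(probabilities):
    """Rare-event approximation: the plain sum, which is an upper bound."""
    return sum(probabilities)

# Two redundant channels at 1e-3 each (the figure used in the AND gate text).
redundant = and_gate([1e-3, 1e-3])

# Three independent single failures at 1e-3 each (hypothetical values).
exact = or_gate([1e-3, 1e-3, 1e-3])
approx = or_gate_rare_event([1e-3, 1e-3, 1e-3])

print(f"AND of two channels: {redundant:.3e}")
print(f"OR exact: {exact:.6e}  OR approximate: {approx:.6e}")
```

For small input probabilities the sum and the exact OR value differ only in the higher-order terms, which is why the additive shortcut is commonly used; it always errs on the conservative side.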
Real fault trees contain both gate types in nested combinations. The analysis identifies minimal cut sets: the smallest combinations of basic events that are sufficient to cause the top event, such that removing any one event from the set means the top event no longer occurs. A minimal cut set of size one (a single failure) is a single point of failure. Finding and eliminating single-point failures in catastrophic failure modes is one of FTA’s core practical outputs.
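One way to see how cut sets fall out of the gate structure is a small recursive expansion. The sketch below is an assumption-laden illustration, not a production algorithm: the tree encoding (nested tuples), the event names, and the function names are all hypothetical, and the brute-force expansion grows quickly on large trees.

```python
from itertools import product

def cut_sets(node):
    """Return every cut set of a tree as a set of frozensets of event names."""
    if isinstance(node, str):          # a basic event is its own cut set
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                   # any child's cut set is sufficient
        return set().union(*child_sets)
    if gate == "AND":                  # need one cut set from every child
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate type: {gate}")

def minimal_cut_sets(node):
    """Discard any cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}

# Hypothetical tree: the top event occurs if the valve jams, or both channels
# fail, or channel A fails together with (channel B failing or the valve jamming).
tree = ("OR", [
    "valve_jams",
    ("AND", ["channel_a_fails", "channel_b_fails"]),
    ("AND", ["channel_a_fails", ("OR", ["channel_b_fails", "valve_jams"])]),
])

mcs = minimal_cut_sets(tree)
single_points = sorted(next(iter(s)) for s in mcs if len(s) == 1)
print(mcs)
print("single points of failure:", single_points)
```

In this example the non-minimal set containing both `channel_a_fails` and `valve_jams` is dropped, because `valve_jams` alone already causes the top event.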
Quantitative and Qualitative FTA
FTA can be run qualitatively—identifying the structure of failure pathways and their cut sets—or quantitatively, where each basic event is assigned a failure rate and the top-event probability is computed. Quantitative FTA requires validated failure rate data, typically from sources like MIL-HDBK-217, FIDES, or component-specific databases. Qualitative FTA is often sufficient for early design phases and for identifying architectural vulnerabilities regardless of precise failure rates.
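As a rough sketch of the quantitative step, the example below converts failure rates into probabilities over an exposure time and then combines minimal cut sets. It assumes constant failure rates, independent basic events, and the rare-event approximation; the rates and event names are hypothetical and are not taken from any handbook.

```python
from math import expm1, prod

def failure_probability(rate_per_hour, exposure_hours):
    """P(failure within the exposure time), assuming a constant failure rate."""
    # 1 - exp(-rate * t), computed with expm1 to keep precision for tiny rates.
    return -expm1(-rate_per_hour * exposure_hours)

def top_event_probability(min_cut_sets, event_probability):
    """Rare-event approximation: sum over cut sets of the product of members."""
    return sum(prod(event_probability[e] for e in cs) for cs in min_cut_sets)

# Hypothetical failure rates per hour for three basic events.
rates = {"channel_a_fails": 1e-4, "channel_b_fails": 1e-4, "valve_jams": 1e-7}
exposure_hours = 1.0
probabilities = {
    name: failure_probability(rate, exposure_hours) for name, rate in rates.items()
}

# Minimal cut sets as they might come out of a qualitative analysis.
min_cut_sets = [{"valve_jams"}, {"channel_a_fails", "channel_b_fails"}]
p_top = top_event_probability(min_cut_sets, probabilities)
print(f"estimated top-event probability: {p_top:.3e}")
```

Note how the single-event cut set dominates the result even though its failure rate is a thousand times lower than either channel's; this is the quantitative face of a single point of failure.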
FTA vs. FMEA: Different Questions, Different Answers
Failure Mode and Effects Analysis (FMEA) is frequently mentioned alongside FTA, and engineers sometimes treat them as alternatives. They are not.
FMEA is inductive and bottom-up. You start at a component, ask “what are its failure modes?”, and trace upward to determine what effects each failure mode has on the system. FMEA is exhaustive at the component level—it surfaces every failure mode of every component and characterizes its severity and detectability.
FTA is deductive and top-down. You start with a specific, named hazard at the system level and ask “what could cause this?” FTA does not try to characterize every failure mode; it focuses analytical effort on the causal pathways to specific undesired outcomes.
This difference has practical consequences:
- FMEA will find failure modes that FTA never considers, including benign ones that simply don’t contribute to catastrophic outcomes.
- FTA will find dangerous combinations of failures that FMEA misses entirely, because FMEA typically analyzes one failure at a time.
- For a system with common-cause failures—a power supply shared by two supposedly redundant channels—FTA makes the dependency visible in the tree structure. FMEA may not, depending on the scope of the analysis.
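The shared power supply case in the last point can be put into numbers. The following is a minimal sketch with hypothetical probabilities; it assumes the shared element and the two channels fail independently of one another.

```python
def redundant_pair_failure(p_channel_a, p_channel_b, p_shared=0.0):
    """P(both channels lost) when both depend on one shared element.

    Either the shared element fails, taking out both channels at once, or it
    survives and the two channels fail independently.
    """
    return p_shared + (1.0 - p_shared) * p_channel_a * p_channel_b

# Independence assumed: the figure a naive redundancy argument would quote.
naive = redundant_pair_failure(1e-3, 1e-3)

# Same channels, but fed by one supply with a hypothetical 1e-5 failure probability.
with_shared_supply = redundant_pair_failure(1e-3, 1e-3, p_shared=1e-5)

print(f"naive: {naive:.3e}")
print(f"with shared supply: {with_shared_supply:.3e}")
print(f"ratio: {with_shared_supply / naive:.1f}")
```

Even a supply that is a hundred times more reliable than either channel ends up dominating the result, because it appears as a cut set of size one.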
In a mature safety program, both analyses are performed. FMEA feeds failure rate data and single-failure severity into FTA. FTA provides the system-level context that tells you which FMEA line items actually matter.
When FTA Is Required
FTA is not optional in several regulatory and standards domains.
Aviation: ARP4754A and AC 25.1309
ARP4754A, the SAE guideline for civil aircraft and system development, establishes a safety assessment process in which FTA is used to follow up the Functional Hazard Assessment (FHA) and to demonstrate compliance with quantitative safety objectives. The methods themselves, FTA included, are described in its companion document, ARP4761. For catastrophic failure conditions, the accepted objective is a probability on the order of 10⁻⁹ per flight hour or less. FTA, combined with Common Cause Analysis, is the accepted method for demonstrating that a system architecture can meet that target.
AC 25.1309, the FAA’s advisory circular on system design and analysis, describes an acceptable means of showing compliance with 14 CFR 25.1309 (Equipment, systems, and installations) and similarly calls for quantitative probability analysis for catastrophic and hazardous failure conditions.
Functional Safety: IEC 61508 and Derived Standards
IEC 61508 and the domain-specific standards derived from it (including ISO 26262 for automotive and IEC 61511 for the process industry) require integrity-level assessment, expressed as a Safety Integrity Level (SIL) in IEC 61508 and IEC 61511 and as an Automotive Safety Integrity Level (ASIL) in ISO 26262. That assessment involves both hardware fault tolerance analysis and quantitative reliability modeling. FTA is one of the explicitly named techniques for SIL verification at higher integrity levels. For SIL 3 and SIL 4 functions, assessors and regulators expect rigorous deductive analysis, not just FMEA.
ISO 26262 specifically references FTA in its safety analysis requirements for ASIL C and ASIL D items, particularly for establishing the absence of single-point faults and the adequacy of redundancy.
Nuclear: 10 CFR 50 and NUREG Standards
The nuclear industry was among the earliest and heaviest adopters of FTA. The technique was developed at Bell Telephone Laboratories and first applied to the Minuteman missile program in the early 1960s, and the nuclear industry adopted it quickly. The NRC’s Fault Tree Handbook (NUREG-0492) documents the method, and the NRC’s Probabilistic Risk Assessment (PRA) methodology uses fault trees, alongside event trees, as a primary tool for quantifying core damage frequency and large early release frequency. NRC review of new plant designs includes the plant PRA, which requires defensible fault tree models.
What FTA Produces
A completed fault tree analysis delivers several distinct outputs, each with downstream uses.
The fault tree model itself. The logical structure of failure pathways, expressed in AND/OR gate notation, providing a navigable map of system vulnerability.
Minimal cut sets. The complete list of minimal failure combinations sufficient to cause the top event. Cut sets of size one (single-point failures) receive immediate design attention. Cut sets of size two are examined for common-cause vulnerabilities.
Top-event probability. The quantitative estimate of how likely the top event is, given the assigned failure rates of basic events. This value is compared against the regulatory or program-defined safety objective.
Common-cause findings. FTA exposes shared dependencies that undermine independent redundancy: common power, common software, common environment. These findings feed Common Cause Analysis (CCA), which is typically run as a companion analysis.
Safety requirements. This is the output that gets the least attention in textbooks and the most in practice. Every architectural decision that appears in the fault tree—that redundancy requirement, that independence requirement, that detection and isolation requirement—is a safety requirement. If it is not captured as a traceable requirement, it can be quietly removed during design evolution without anyone realizing the tree has been invalidated.
From FTA Findings to Safety Requirements: The Traceability Problem
Here is the gap that causes safety programs to fail in practice: FTA is performed, cut sets are identified, architectural decisions are made—and then the analysis sits in a PDF that the design team never opens again. The fault tree was built against an early architecture. The architecture changes. The tree is never updated. At certification, the mismatch surfaces and costs months.
The solution is to treat every safety-critical constraint in the fault tree as a formal requirement, assigned an identifier, allocated to a design element, and verified by a defined method.
This is harder than it sounds, because FTA findings do not come out as natural language requirements. A cut set that says “Loss of Channel A AND Loss of Channel B” implies an independence requirement, a redundancy requirement, and possibly a monitoring requirement—but none of those are explicit in the tree. Converting fault tree structure into requirements requires deliberate translation, and those requirements need to be traceable forward to design decisions and backward to the safety objective that motivated them.
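The translation step can be made concrete with a small data structure that keeps each draft requirement tied to the cut set it came from. This is only a sketch: the identifier scheme, the requirement wording, and the two translation rules are illustrative assumptions, and real requirements would need engineering review before use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SafetyRequirement:
    req_id: str
    kind: str                  # "single_point" or "independence"
    text: str
    source_cut_set: frozenset  # backward trace to the FTA finding

def derive_requirements(min_cut_sets, prefix="SR"):
    """Emit one draft requirement per minimal cut set (illustrative rules only)."""
    requirements = []
    # Smallest cut sets first, so single points of failure get the lowest ids.
    ordered = sorted(min_cut_sets, key=lambda cs: (len(cs), sorted(cs)))
    for index, cut_set in enumerate(ordered, start=1):
        events = sorted(cut_set)
        if len(events) == 1:
            kind = "single_point"
            text = f"No single occurrence of '{events[0]}' shall cause the top event."
        else:
            kind = "independence"
            text = ("The following failures shall have no common cause: "
                    + ", ".join(events) + ".")
        requirements.append(
            SafetyRequirement(f"{prefix}-{index:03d}", kind, text, frozenset(cut_set))
        )
    return requirements

# Hypothetical cut sets, including the Channel A / Channel B example.
drafts = derive_requirements([
    {"loss_of_channel_a", "loss_of_channel_b"},
    {"valve_jams"},
])
for requirement in drafts:
    print(requirement.req_id, requirement.kind, "|", requirement.text)
```

Storing `source_cut_set` on the requirement is the design choice that matters here: it is what lets a later change to the tree be traced to the requirements it invalidates.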
How Structured Platforms Handle FTA Traceability
Manual approaches such as spreadsheet requirements traceability matrices (RTMs) and linked Word documents break down quickly on systems with hundreds of cut sets and thousands of derived requirements. The requirements exist, but their relationship to the underlying safety analysis is opaque, and change management is manual and error-prone.
Modern requirements platforms built on graph-based data models handle this substantially better. Each requirement is a node. The FTA finding that generated it, the design element that implements it, and the verification test that closes it are nodes too, connected to it by edges. A change to any node propagates visibly through the graph: you can see immediately which requirements are affected when an architecture changes and which test cases need re-evaluation.
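The graph idea itself is simple enough to sketch generically. The example below is not any particular product's data model or API; the artifact identifiers, the link direction, and the function names are hypothetical, chosen only to show forward and backward impact queries.

```python
from collections import deque

# Hypothetical trace links: each key points at the artifacts derived from it.
trace_links = {
    "HAZ-01 loss of braking": ["CUT-07 channel A AND channel B"],
    "CUT-07 channel A AND channel B": ["SR-012 channel independence"],
    "SR-012 channel independence": [
        "DES-3.4 separate power feeds",
        "TEST-88 power isolation test",
    ],
    "DES-3.4 separate power feeds": ["TEST-88 power isolation test"],
}

def reachable_from(start, links):
    """Breadth-first walk returning every artifact reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

def invert(links):
    """Flip every edge so the same walk can trace back toward the hazard."""
    inverted = {}
    for parent, children in links.items():
        for child in children:
            inverted.setdefault(child, []).append(parent)
    return inverted

changed = "DES-3.4 separate power feeds"
needs_reverification = reachable_from(changed, trace_links)
justification_chain = reachable_from(changed, invert(trace_links))
print("re-evaluate:", sorted(needs_reverification))
print("traces back to:", sorted(justification_chain))
```

Walking the links forward answers "what must be re-verified if this changes?", and walking them backward answers "which hazard justifies this element?".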
Flow Engineering is built on this graph-based model and is designed specifically for hardware and systems engineering programs where FTA and FMEA findings must be first-class traceable artifacts. In Flow Engineering, a safety requirement derived from a fault tree cut set can be linked directly to the specific cut set, the functional hazard assessment entry that established the safety objective, the design specification element that implements the mitigation, and the verification evidence that closes it. That chain—hazard to requirement to design to test—is navigable in both directions.
This bidirectional traceability matters at certification. When a certification authority asks “how do you know your architecture eliminates single-point failures in this failure mode?”, the answer is not “we have a fault tree somewhere”—it is a navigable chain from the safety objective to the architectural requirement to the design decision to the test result.
Flow Engineering also supports requirement status tracking across the program lifecycle, so teams can see which FTA-derived requirements are unallocated, which are allocated but unverified, and which have verification gaps as the program evolves. For programs managing hundreds of hazard-derived requirements across multiple subsystems, that visibility is the difference between a controlled safety case and a compliance scramble at the end.
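Independent of any particular tool, the status categories described above reduce to a simple classification. The sketch below is an assumption about how such a rollup could be modeled; the class, field names, and status labels are hypothetical, and "verified" here simply means at least one passing verification is recorded.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class TrackedRequirement:
    req_id: str
    allocated_to: list = field(default_factory=list)          # design elements
    passed_verifications: list = field(default_factory=list)  # evidence ids

def lifecycle_status(requirement):
    """Classify a requirement by how far it has progressed."""
    if not requirement.allocated_to:
        return "unallocated"
    if not requirement.passed_verifications:
        return "allocated_unverified"
    return "verified"

def status_summary(requirements):
    """Count requirements in each lifecycle status."""
    return Counter(lifecycle_status(r) for r in requirements)

# Hypothetical program snapshot.
program = [
    TrackedRequirement("SR-001"),
    TrackedRequirement("SR-002", allocated_to=["DES-3.4"]),
    TrackedRequirement("SR-003", allocated_to=["DES-3.5"],
                       passed_verifications=["TEST-88"]),
]
summary = status_summary(program)
print(dict(summary))
```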
Practical Starting Points
If your organization is starting an FTA effort or trying to connect an existing FTA to your requirements process, the following steps reflect current practice:
1. Define the top-level events precisely. Vague top events produce vague trees. “Loss of braking” is a defined event. “Braking system problems” is not. Every top event should have a defined severity level from your FHA before the tree is built.
2. Use your architecture model as the tree substrate. FTA that is disconnected from the actual system architecture produces cut sets that do not correspond to real failure pathways. The tree should reflect how the system is actually built, including power distribution, data buses, and software interfaces.
3. Translate cut sets into requirements immediately. Every cut set of size one calls for a requirement that eliminates or mitigates that single-point failure. Every cut set of size two involving a shared dependency calls for a common-cause isolation requirement. Do this translation before the architecture is frozen.
4. Assign every FTA-derived requirement an owner and a verification method. Unowned requirements drift. A requirement without a defined verification method will not be verified.
5. Version the fault tree with the architecture. When the architecture changes, the fault tree must change with it. Treat the fault tree as a living model, not a one-time study.
6. Connect your requirements platform to the safety analysis. If your requirements tool cannot link a requirement to its originating hazard analysis artifact, you have a traceability gap that will cost you at certification.
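Steps 4 and 5 in the list above lend themselves to an automated check. The following is a minimal sketch of such an audit, assuming that requirements carry an owner and a verification method and that the fault tree records the architecture version it was built against; all names and version strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DerivedRequirement:
    req_id: str
    owner: Optional[str]
    verification_method: Optional[str]  # e.g. "test", "analysis", "inspection"

def audit(requirements, fault_tree_arch_version, current_arch_version):
    """Return human-readable findings for missing owners, methods, or a stale tree."""
    findings = []
    if fault_tree_arch_version != current_arch_version:
        findings.append(
            f"fault tree built against architecture {fault_tree_arch_version}, "
            f"current architecture is {current_arch_version}"
        )
    for requirement in requirements:
        if not requirement.owner:
            findings.append(f"{requirement.req_id}: no owner")
        if not requirement.verification_method:
            findings.append(f"{requirement.req_id}: no verification method")
    return findings

# Hypothetical data: one complete requirement, two incomplete ones, a stale tree.
findings = audit(
    [
        DerivedRequirement("SR-001", "brake team", "test"),
        DerivedRequirement("SR-002", None, "analysis"),
        DerivedRequirement("SR-003", "power team", None),
    ],
    fault_tree_arch_version="v1.2",
    current_arch_version="v1.4",
)
for finding in findings:
    print(finding)
```

Running a check like this on every architecture baseline is one way to keep the fault tree from silently drifting away from the design.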
Honest Assessment
FTA is a powerful analytical technique with a 60-year track record across aviation, nuclear, automotive, and industrial safety. It does things that no bottom-up analysis can do: it finds the dangerous combinations, it quantifies system-level risk, and it makes architectural vulnerabilities visible before they become incidents.
Its limitations are real. FTA is only as good as the model it is built on. A fault tree built against an idealized architecture, maintained in a document separate from the living design, will diverge from reality and provide false assurance. The technique requires disciplined process integration—not just competent safety engineers, but a requirements infrastructure that keeps the FTA findings connected to design and verification throughout the program lifecycle.
The analysis is the beginning of the work, not the end.