What SOTIF Is, and What It Is Not
ISO 21448, commonly called SOTIF (Safety of the Intended Functionality), was published in 2022 to address a class of hazard that ISO 26262 was never designed to handle. ISO 26262 covers functional safety: what happens when an electrical or electronic component malfunctions, whether through a random hardware fault or a systematic fault that makes it deviate from its specification. The entire framework assumes a fault has occurred and asks whether the system detects, tolerates, or mitigates it.
SOTIF asks a different question: what if nothing fails?
Consider a camera-based automatic emergency braking system. All components operate within specification. The sensor data is transmitted without corruption. The perception algorithm runs correctly on the input it receives. And yet — in certain lighting conditions, at certain approach angles, with certain target profiles — the system fails to detect a pedestrian and does not brake. No fault occurred. The system did exactly what it was designed to do. The hazard arose from the performance boundary of the intended function.
This is the domain SOTIF governs. It applies wherever a system’s correct operation can still produce harm because the system’s sensors, algorithms, or actuators have performance limits that can be exceeded by real-world operating conditions. Autonomous driving and advanced driver assistance systems (ADAS) are the primary application domain, though the standard’s principles extend to any safety-relevant system with sensor-dependent decision logic.
Understanding this distinction is not academic. The two standards require different engineering activities, different verification strategies, and produce different artifacts. A team applying only ISO 26262 to an ADAS feature with machine-learning perception components is not demonstrating safety — it is demonstrating fault tolerance. Those are not the same thing.
The Four-Quadrant Model: Known, Unknown, Safe, Unsafe
SOTIF structures the engineering problem around four categories of scenarios, organized by two axes: whether the scenario is known or unknown to the development team, and whether it produces safe or unsafe system behavior.
Known safe scenarios are the situations you have analyzed, tested, and confirmed the system handles correctly. This is the space you are trying to maximize.
Known unsafe scenarios are situations you have identified where the system’s performance is insufficient — triggering conditions you know produce hazardous behavior. SOTIF requires that these be mitigated through design changes, operational constraints, or user interface interventions before release.
Unknown safe scenarios are situations you have not explicitly analyzed, but where system behavior happens to be acceptable. These exist and are not a problem in themselves, though they represent gaps in coverage confidence.
Unknown unsafe scenarios are the core risk. These are situations where the system will behave hazardously, and you do not know it yet. SOTIF’s primary engineering obligation is to systematically search for and collapse this space — to convert unknown-unsafe scenarios into known-unsafe scenarios (which can then be mitigated) or to demonstrate through structured argumentation that the residual unknown-unsafe space is acceptably small.
This framing has a concrete implication for validation strategy: you cannot simply demonstrate that the system passes a fixed test suite. You must demonstrate that you have conducted a credible, systematic search for hazardous scenarios you did not initially anticipate. That requires structured scenario generation methods — not just running the scenarios you thought of first.
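The two axes can be made concrete with a small sketch. It is illustrative only: the `Scenario` and `Quadrant` names are hypothetical, and the `hazardous` flag stands for ground truth that a real team cannot observe for scenarios it has not yet analyzed.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    KNOWN_SAFE = "known safe"
    KNOWN_UNSAFE = "known unsafe"
    UNKNOWN_SAFE = "unknown safe"
    UNKNOWN_UNSAFE = "unknown unsafe"


@dataclass(frozen=True)
class Scenario:
    name: str
    analyzed: bool   # has the team explicitly analyzed or tested it?
    hazardous: bool  # ground truth: does the system behave unsafely in it?


def classify(s: Scenario) -> Quadrant:
    """Place a scenario in one of the four quadrants."""
    if s.analyzed:
        return Quadrant.KNOWN_UNSAFE if s.hazardous else Quadrant.KNOWN_SAFE
    return Quadrant.UNKNOWN_UNSAFE if s.hazardous else Quadrant.UNKNOWN_SAFE


def discover(s: Scenario) -> Scenario:
    """Analysis makes a scenario known; it does not change whether it is hazardous."""
    return Scenario(s.name, analyzed=True, hazardous=s.hazardous)


glare = Scenario("low sun behind pedestrian", analyzed=False, hazardous=True)
print(classify(glare).value)            # unknown unsafe
print(classify(discover(glare)).value)  # known unsafe
```

The `discover` step is the point of the sketch: searching moves a scenario from the unknown row to the known row, and only a subsequent mitigation can move it from unsafe to safe.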
ODD Definition: The Engineering Foundation
The Operational Design Domain (ODD) is the set of operating conditions within which a system is designed to function safely. SOTIF makes ODD definition a foundational engineering activity because every triggering condition analysis is bounded by it.
An ODD is not a product description. “Works in urban environments” is not an ODD. A rigorous ODD specifies the environmental conditions (illumination range, precipitation type and intensity, road surface state), traffic conditions (vehicle density, relative velocity ranges, cut-in scenarios), infrastructure conditions (lane markings present/absent, signage types), and geographic constraints within which the system’s performance claims are valid.
The practical consequence: anything outside the ODD is not a SOTIF problem — it is an ODD boundary problem. The system is not intended to function there, and the engineering obligation shifts to ensuring the system recognizes it has exited its ODD and responds appropriately (typically by requesting driver takeover or imposing operating constraints).
Poorly specified ODDs generate cascading problems throughout the safety case. If the ODD boundary is vague, triggering condition analysis has no clear scope. If triggering conditions have no clear scope, the argument that you have covered the unsafe scenario space becomes unauditable. This is where many ADAS programs run into certification difficulty — not because the system performs badly, but because the safety argument cannot be reconstructed from the engineering artifacts.
A practical starting point: draft your ODD as a structured list of attributes with explicit ranges and boolean conditions, not prose paragraphs. Every attribute should be traceable to a corresponding sensor performance specification and, eventually, to validation data.
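As a minimal sketch of what "explicit ranges and boolean conditions" might look like in practice, the following models an ODD as data and checks observed conditions against it. The attribute names, units, and numeric limits are placeholders chosen for illustration, not values from the standard or from any real program.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Range:
    lo: float
    hi: float
    unit: str

    def contains(self, value: float) -> bool:
        return self.lo <= value <= self.hi


@dataclass
class ODD:
    ranges: dict[str, Range] = field(default_factory=dict)
    flags: dict[str, bool] = field(default_factory=dict)

    def violations(self, conditions: dict) -> list[str]:
        """Return the attributes that are outside the ODD or not observed at all."""
        out = []
        for name, rng in self.ranges.items():
            value = conditions.get(name)
            if value is None or not rng.contains(value):
                out.append(name)
        for name, required in self.flags.items():
            if conditions.get(name) is not required:
                out.append(name)
        return out


# Hypothetical ODD for an urban AEB feature (illustrative numbers only).
odd = ODD(
    ranges={
        "illumination_lux": Range(50, 100_000, "lx"),
        "rain_mm_per_h": Range(0, 10, "mm/h"),
        "ego_speed_kph": Range(0, 60, "km/h"),
    },
    flags={"lane_markings_present": True},
)

observed = {
    "illumination_lux": 5,
    "rain_mm_per_h": 2,
    "ego_speed_kph": 40,
    "lane_markings_present": True,
}
print(odd.violations(observed))  # ['illumination_lux']
```

Treating a missing observation as a violation is a deliberate, conservative choice in this sketch: if the system cannot confirm an attribute, it cannot claim to be inside its ODD.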
Triggering Condition Analysis
Within the ODD, triggering conditions are the specific combinations of circumstances that cause the system to behave unsafely. A triggering condition is not a fault — it is a situation where the system’s performance is insufficient for the operating environment.
SOTIF requires teams to identify triggering conditions systematically. The standard does not prescribe a single method, but the engineering community has converged on several techniques used in combination:
Functional insufficiency analysis examines each system function and asks: under what conditions does this function fail to produce safe output even when operating correctly? For perception functions, this includes sensor-specific degradation modes — lens flare patterns, radar ghost targets from metal structures, lidar performance in heavy rain — combined with scenario characteristics like target size, closing speed, and occlusion geometry.
Hazard analysis and risk assessment (adapted from ISO 26262’s HARA methodology) identifies the hazardous events that result from functional insufficiency and assesses their severity, controllability, and exposure. The resulting risk classification determines which triggering conditions need the most aggressive mitigation and the deepest validation coverage.
Scenario-based analysis uses structured scenario taxonomies to generate candidate triggering conditions systematically. The scenarios are often drawn from scenario catalogs or regulatory reference scenarios and described in a standard format such as ASAM OpenSCENARIO. The goal is to be more systematic than engineering intuition alone, which reliably undersamples rare-but-credible conditions.
Each identified triggering condition must be linked to: the functional insufficiency it exploits, the hazardous event it can produce, the mitigation measures applied (design changes, ODD restrictions, driver monitoring requirements), and the verification evidence demonstrating the mitigation is effective.
That last chain, from triggering condition to mitigation to evidence, is where the SOTIF safety case either holds together or falls apart. If those links live in disconnected spreadsheets, Word documents, and test management systems that do not reference each other, the safety case is effectively unarguable at audit. Reconstructing the chain manually under time pressure produces exactly the kind of gaps that regulatory reviewers flag.
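One way to keep the chain described above checkable is to hold each triggering condition as a single record that carries all four links. This is a sketch under assumed field names (`tc_id`, `functional_insufficiency`, and so on are hypothetical), not a schema prescribed by ISO 21448.

```python
from dataclasses import dataclass, field


@dataclass
class TriggeringCondition:
    tc_id: str
    description: str
    functional_insufficiency: str = ""
    hazardous_event: str = ""
    mitigations: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)  # e.g. test report identifiers

    def missing_links(self) -> list[str]:
        """Names of the chain links that are still empty for this condition."""
        links = {
            "functional_insufficiency": self.functional_insufficiency,
            "hazardous_event": self.hazardous_event,
            "mitigations": self.mitigations,
            "evidence": self.evidence,
        }
        return [name for name, value in links.items() if not value]


tc = TriggeringCondition(
    tc_id="TC-001",
    description="Low sun directly behind a pedestrian at dusk",
    functional_insufficiency="Camera contrast too low for pedestrian detection",
    hazardous_event="No braking for a pedestrian in the vehicle path",
    mitigations=["Restrict ODD to sun elevation above a validated threshold"],
)
print(tc.missing_links())  # ['evidence']
```

A record with a non-empty `missing_links()` result is exactly the kind of gap that is cheap to find continuously and expensive to find at audit.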
Validation Coverage: Demonstrating Sufficiency
The hardest question in SOTIF is: how do you know when you are done?
ISO 26262 has a relatively tractable answer: achieve the required diagnostic coverage metrics, demonstrate fault injection results, close the safety mechanism verification matrix. SOTIF’s answer is less algorithmic because the unknown-unsafe space has no defined boundary to close against.
The standard requires that validation demonstrate two things: first, that the known-unsafe scenario space has been addressed through design or operational measures; second, that the residual unknown-unsafe space is acceptably small, supported by a credible argument that the scenario search was systematic and thorough.
In practice, validation coverage for SOTIF involves:
Simulation-based scenario coverage using tools that can generate parametric variations of identified triggering conditions across the full ODD envelope. A single nominal test scenario is insufficient — teams need coverage across the parameter space of each triggering condition (approach velocity range, illumination levels, target size distributions) to support a coverage argument.
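A minimal sketch of parametric variation, assuming a simple full-factorial grid: each triggering condition is expanded into one concrete test case per combination of parameter values. The parameter names and values are illustrative, and a real program would likely use sampling or adaptive search rather than an exhaustive grid once the parameter count grows.

```python
from itertools import product


def sweep(parameters: dict[str, list]) -> list[dict]:
    """Expand parameter value lists into every concrete combination."""
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in product(*parameters.values())]


# Hypothetical parameter space for one pedestrian-crossing triggering condition.
cases = sweep({
    "approach_speed_kph": [20, 40, 60],
    "illumination_lux": [50, 500, 5000, 50_000],
    "target_height_m": [1.0, 1.7],
})
print(len(cases))  # 24
print(cases[0])    # {'approach_speed_kph': 20, 'illumination_lux': 50, 'target_height_m': 1.0}
```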
Real-world data collection targeted at triggering conditions that simulation cannot faithfully reproduce — particularly sensor-specific degradation modes where physics fidelity in simulation is uncertain.
Statistical validation approaches, often framed as confidence-bound arguments, that link accumulated test exposure to residual risk claims. These are technically demanding and the subject of ongoing standardization work, but regulators in some markets are beginning to expect them for higher-level autonomy claims.
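To show the shape of such an argument, here is a sketch of the simplest case: an upper bound on the hazardous-event rate after a given exposure with zero observed events. It assumes events follow a Poisson process with a constant rate and that the test exposure is representative of the ODD; both assumptions are strong, and the standard does not prescribe this particular method.

```python
import math


def rate_upper_bound(exposure: float, confidence: float = 0.95) -> float:
    """One-sided upper bound on the event rate after `exposure` units with zero events.

    With zero events, P(0 | rate, exposure) = exp(-rate * exposure); the bound is the
    rate at which that probability drops to 1 - confidence.
    """
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence) / exposure


def exposure_needed(target_rate: float, confidence: float = 0.95) -> float:
    """Event-free exposure required to claim the rate is below `target_rate`."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return -math.log(1.0 - confidence) / target_rate


# One million event-free kilometres supports a bound of about 3 events per million km.
print(rate_upper_bound(1_000_000))
```

The inverse function makes the practical difficulty visible: the event-free exposure required grows in direct proportion to how small the claimed rate is, which is why exposure-only arguments are usually combined with scenario-based evidence.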
Coverage metrics that track the proportion of identified triggering conditions with closed verification evidence, and the proportion of the ODD parameter space with positive safety evidence. These metrics need to be live engineering instruments, not end-of-project summaries.
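Both metrics reduce to simple ratios once the underlying data is structured. The sketch below assumes a hypothetical mapping from triggering-condition ID to evidence status, and a one-dimensional binning of a single ODD parameter; real coverage tracking would span several parameters at once.

```python
def closure_ratio(evidence_closed: dict[str, bool]) -> float:
    """Share of triggering conditions whose verification evidence is closed."""
    if not evidence_closed:
        return 0.0
    return sum(evidence_closed.values()) / len(evidence_closed)


def bin_coverage(bins: list[tuple[float, float]], passing_values: list[float]) -> float:
    """Share of half-open ODD parameter bins [lo, hi) containing at least one passing test."""
    if not bins:
        return 0.0
    covered = sum(
        1 for lo, hi in bins if any(lo <= v < hi for v in passing_values)
    )
    return covered / len(bins)


status = {"TC-001": True, "TC-002": False, "TC-003": True, "TC-004": True}
print(closure_ratio(status))  # 0.75

speed_bins = [(0, 20), (20, 40), (40, 60)]
print(bin_coverage(speed_bins, passing_values=[5.0, 45.0]))  # two of three bins covered
```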
How Modern Tools Structure SOTIF Coverage
The volume and interconnection of SOTIF artifacts — ODD attributes, triggering conditions, hazardous events, mitigations, verification evidence, coverage metrics — exceed what document-based requirements management tools handle well. The core problem is that SOTIF demands a graph of relationships, not a hierarchy of requirements.
IBM DOORS and DOORS Next were designed around document structure with link overlays. They can store SOTIF artifacts, but navigating the triggering-condition-to-evidence chain requires manual link traversal that does not scale to large scenario libraries. Jama Connect handles traceability better and has reasonable review workflow support, but its data model is still fundamentally item-and-link rather than a native graph. Polarion ALM offers more configurability and integrates with test management, which helps close the evidence loop, but configuration complexity is high and AI-assisted scenario generation is not a native capability.
Flow Engineering was built around a graph model rather than a document model, which maps naturally to the SOTIF scenario coverage problem. Teams can model ODD attributes as nodes, attach triggering conditions as derived nodes, link mitigations and verification activities directly, and query coverage gaps across the full connected structure. Rather than asking “which requirements lack a test,” the query becomes “which triggering conditions lack closed verification evidence, and which ODD parameter ranges do those conditions span” — a structurally different and more useful question.
Flow Engineering also brings AI-assisted analysis to scenario generation. Teams can describe a functional insufficiency and use the tool to surface candidate triggering conditions against a structured scenario taxonomy, then review and accept or reject them. This does not replace engineering judgment, but it systematically reduces the chance that a credible triggering condition class goes unexamined — directly attacking the unknown-unsafe space.
The traceability architecture matters at audit. When a certification authority asks for the safety argument supporting a specific ODD boundary claim, the answer should be a query result, not a document search. Flow Engineering’s connected model makes that argument navigable rather than reconstructed.
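The coverage-gap query described above can be sketched over a generic node-and-edge structure. This is not Flow Engineering's API or data model; the node types, relation names (`within`, `verified_by`), and identifiers are all hypothetical and serve only to show why the question is a graph traversal rather than a document search.

```python
def targets(edges: list[tuple[str, str, str]], src: str, relation: str) -> list[str]:
    """Destination nodes reachable from `src` over edges with the given relation."""
    return [dst for s, rel, dst in edges if s == src and rel == relation]


def open_conditions(nodes: dict[str, dict], edges: list[tuple[str, str, str]]) -> dict[str, list[str]]:
    """Map each triggering condition lacking closed evidence to the ODD attributes it spans."""
    result = {}
    for node_id, attrs in nodes.items():
        if attrs["type"] != "triggering_condition":
            continue
        evidence = targets(edges, node_id, "verified_by")
        if not any(nodes[e].get("status") == "closed" for e in evidence):
            result[node_id] = targets(edges, node_id, "within")
    return result


nodes = {
    "ODD-illum": {"type": "odd_attribute"},
    "ODD-rain": {"type": "odd_attribute"},
    "TC-001": {"type": "triggering_condition"},
    "TC-002": {"type": "triggering_condition"},
    "EV-10": {"type": "evidence", "status": "closed"},
    "EV-11": {"type": "evidence", "status": "open"},
}
edges = [
    ("TC-001", "within", "ODD-illum"),
    ("TC-001", "verified_by", "EV-10"),
    ("TC-002", "within", "ODD-illum"),
    ("TC-002", "within", "ODD-rain"),
    ("TC-002", "verified_by", "EV-11"),
]
print(open_conditions(nodes, edges))  # {'TC-002': ['ODD-illum', 'ODD-rain']}
```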
One deliberate scope choice worth naming: Flow Engineering is specialized for requirements and systems engineering workflows. It does not replace simulation platforms, test management systems, or data collection pipelines. Teams need to integrate it with those tools through defined interfaces, which requires some integration design work upfront.
Practical Starting Points
For a team beginning a SOTIF program, the order of operations matters:
- Define the ODD first, precisely. Every subsequent activity is scoped by it. A vague ODD means vague coverage arguments.
- Build the triggering condition library incrementally. Start with functional insufficiency analysis driven by your sensor technology’s known degradation modes. Supplement with scenario databases. Do not wait for the library to be complete before linking it to the system architecture.
- Establish the evidence chain early. The connection from triggering condition to mitigation to verification evidence needs to be the structural backbone of your safety case, not a retrospective assembly. Tools that support this natively reduce rework.
- Instrument coverage metrics as engineering artifacts. Percentage of triggering conditions with closed evidence is a sprint metric, not a release gate metric. Teams that track it continuously surface gaps earlier and with more time to address them.
- Plan for iteration. SOTIF is not a waterfall process. ODD refinements, new triggering conditions discovered in testing, and mitigation design changes will propagate through the scenario model. A tool that makes those updates traceable and auditable is worth the investment.
SOTIF is harder than ISO 26262 in one specific way: it requires demonstrating the sufficiency of a search, not just the completeness of a response. That is an argument about engineering process as much as engineering results. The teams that do it well treat scenario coverage as a living model — not a document they write once and file.