What Is SOTIF? ISO 21448 Explained for Systems Engineers

A camera-based pedestrian detection system works exactly as its engineers designed it. No hardware fault. No software bug. The ASIL decomposition is clean. ISO 26262 compliance is documented. Then, at dusk, in light rain, with a pedestrian in a dark jacket against a concrete median, the system fails to detect the pedestrian. No fault triggered. No diagnostic code. The system was just not good enough for that scenario.

This is the problem SOTIF was created to address.

Safety of the Intended Functionality (SOTIF), defined in ISO 21448, is the safety standard concerned with hazards that arise from insufficient performance of the intended functionality: not from random hardware failures or systematic faults, but from the functional limits of a system that is operating nominally. Published as a full international standard in 2022, after an earlier publicly available specification (ISO/PAS 21448) in 2019, it fills a gap that ISO 26262 was never designed to close.


The Gap ISO 26262 Doesn’t Fill

ISO 26262 is a mature, well-understood functional safety standard for road vehicles. It defines a process for managing risks from hardware failures and systematic software errors. Its core mechanisms (ASIL decomposition, diagnostic coverage, safety mechanisms) target faults: cases where the system deviates from its intended behavior.

SOTIF targets something different. It asks: what if the system does exactly what it is intended to do, but the intended behavior is insufficient to prevent harm in some scenarios?

The distinction matters more than it sounds. A forward collision warning system that never triggers a spurious alert and always responds correctly to its sensor inputs is ISO 26262-clean. But if its sensor fusion algorithm systematically underperforms in low-angle sun glare and fails to detect a stationary vehicle, you have a SOTIF problem: a hazardous scenario caused by the limits of the intended functionality, not by a fault.

For legacy automotive features—cruise control, electric power steering, automatic headlights—the gap between “works as intended” and “safe in all scenarios” is narrow and manageable. For perception-heavy systems powered by machine learning, computer vision, and sensor fusion, that gap can be enormous and poorly bounded.


Core Concepts in ISO 21448

The Four-Zone Model

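SOTIF organizes scenarios into four zones based on whether the scenario is known and whether it is safe. ISO 21448 itself calls these areas and numbers them differently (Area 1 known safe, Area 2 known unsafe, Area 3 unknown unsafe, Area 4 unknown safe); the zone numbers used in this article are as follows: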

  • Zone 1 — Known unsafe scenarios: The team knows the scenario exists, knows it is hazardous, and has not yet addressed it. These are the known gaps on day one of SOTIF analysis.
  • Zone 2 — Unknown unsafe scenarios: Hazardous scenarios that have not yet been identified. The most dangerous zone—you cannot mitigate what you cannot see.
  • Zone 3 — Known safe scenarios: Scenarios that have been analyzed and confirmed safe, either through design margins or validation evidence.
  • Zone 4 — Unknown safe scenarios: Scenarios not yet identified but which are, in fact, safe. These become Zone 3 as coverage expands.

The SOTIF process is fundamentally a campaign to shrink Zones 1 and 2. Zone 1 shrinks through design changes, operating condition restrictions, and verification evidence. Zone 2 shrinks through systematic scenario exploration—the harder problem.
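The zone model can be sketched as a two-flag classification. This is a minimal illustration using this article's zone numbering; the names `Scenario`, `classify`, and `discover` are hypothetical, and the `hazardous` flag is ground truth that a real team cannot observe for scenarios it has not yet identified.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Zone(Enum):
    KNOWN_UNSAFE = 1    # Zone 1 in this article's numbering
    UNKNOWN_UNSAFE = 2  # Zone 2
    KNOWN_SAFE = 3      # Zone 3
    UNKNOWN_SAFE = 4    # Zone 4


@dataclass(frozen=True)
class Scenario:
    name: str
    identified: bool  # has the team catalogued this scenario?
    hazardous: bool   # ground truth; not observable until the scenario is identified


def classify(s: Scenario) -> Zone:
    """Map the two flags onto one of the four zones."""
    if s.identified:
        return Zone.KNOWN_UNSAFE if s.hazardous else Zone.KNOWN_SAFE
    return Zone.UNKNOWN_UNSAFE if s.hazardous else Zone.UNKNOWN_SAFE


def discover(s: Scenario) -> Scenario:
    """Scenario exploration flips `identified`; it does not change the hazard."""
    return replace(s, identified=True)


glare = Scenario("low-angle sun glare, stationary vehicle", identified=False, hazardous=True)
print(classify(glare).name, "->", classify(discover(glare)).name)
```

Discovery alone only moves a scenario from Zone 2 to Zone 1. Moving it out of Zone 1 takes a design change, an operating restriction, or verification evidence, which this sketch does not model.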

Triggering Conditions and Hazardous Behavior

SOTIF introduces two key concepts that structure its hazard analysis:

Triggering conditions are specific situations in the operational environment that expose a functional insufficiency. Low-sun angle. Radar clutter from overhead gantries. Lane markings obscured by standing water. Triggering conditions are not failures—they are real-world situations that reveal the limits of a perception or control system.

Functional insufficiencies are the behavioral limitations that, when activated by a triggering condition, lead to hazardous behavior. A classifier that degrades below an acceptable performance threshold under a specific lighting condition is a functional insufficiency.

The SOTIF process requires teams to enumerate triggering conditions systematically, map them to functional insufficiencies, and assess whether the resulting behavior is hazardous and how likely that hazard is to occur in the intended operational domain.
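One way to picture that mapping is as a small set of linked records. The sketch below is an assumption about how a team might structure it, not a format defined by ISO 21448; the class names, the `open_items` helper, and every exposure figure are invented placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggeringCondition:
    name: str
    exposure: float  # assumed fraction of ODD operating time with this condition


@dataclass(frozen=True)
class FunctionalInsufficiency:
    name: str


@dataclass(frozen=True)
class Link:
    condition: TriggeringCondition
    insufficiency: FunctionalInsufficiency
    hazardous: bool  # outcome of the team's hazard assessment


def open_items(links, min_exposure):
    """Hazardous links common enough to prioritise, most frequent first."""
    hits = [l for l in links if l.hazardous and l.condition.exposure >= min_exposure]
    return sorted(hits, key=lambda l: l.condition.exposure, reverse=True)


# Placeholder data: the exposure values are illustrative, not measured.
sun = TriggeringCondition("low sun angle", 0.05)
gantry = TriggeringCondition("radar clutter from overhead gantries", 0.02)
water = TriggeringCondition("lane markings obscured by standing water", 0.001)

links = [
    Link(gantry, FunctionalInsufficiency("false radar targets"), hazardous=True),
    Link(sun, FunctionalInsufficiency("camera contrast loss"), hazardous=True),
    Link(water, FunctionalInsufficiency("lane model drift"), hazardous=True),
]

for link in open_items(links, min_exposure=0.01):
    print(link.condition.name, "->", link.insufficiency.name)
```

Keeping the condition, the insufficiency, and the assessment as separate records lets one triggering condition link to several insufficiencies without duplicating its exposure estimate.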

The Intended Operational Domain

SOTIF analysis is always bounded by the Operational Design Domain (ODD)—the environmental, geographic, and situational conditions under which a feature is intended to operate. A Level 2 highway assist system designed for dry, marked roads in daylight has a different SOTIF profile than a Level 4 urban robotaxi operating around the clock in all weather.

This is not a loophole. The ODD does not eliminate SOTIF obligations—it scopes them. If a vehicle encounters conditions outside its ODD without adequate transition mechanisms (warnings, driver handback requests, graceful degradation), the ODD restriction itself becomes a SOTIF issue.
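As a rough sketch of what "scoping plus a transition mechanism" could look like in software, the example below checks current conditions against a versioned ODD record and requests a driver handback on any violation. The fields, limits, and the `supervise` function are hypothetical; a production ODD would carry far more dimensions and a more graded response.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    REQUEST_HANDBACK = "request_handback"


@dataclass(frozen=True)
class Odd:
    version: str
    max_speed_kph: float
    allowed_weather: frozenset
    daylight_only: bool


@dataclass(frozen=True)
class Conditions:
    speed_kph: float
    weather: str
    is_daylight: bool


def odd_violations(odd, now):
    """Return the names of the ODD dimensions the current conditions violate."""
    violations = []
    if now.speed_kph > odd.max_speed_kph:
        violations.append("speed")
    if now.weather not in odd.allowed_weather:
        violations.append("weather")
    if odd.daylight_only and not now.is_daylight:
        violations.append("daylight")
    return violations


def supervise(odd, now):
    """Leaving the ODD must trigger a transition, not silent continued operation."""
    return Action.REQUEST_HANDBACK if odd_violations(odd, now) else Action.CONTINUE


# Illustrative highway-assist ODD: dry roads, daylight, assumed 130 km/h limit.
HIGHWAY_ASSIST = Odd("1.0.0", 130.0, frozenset({"dry"}), daylight_only=True)

print(supervise(HIGHWAY_ASSIST, Conditions(110.0, "rain", is_daylight=False)))
```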


The SOTIF Process in Practice

ISO 21448 defines a process that spans the development lifecycle, but its most distinctive phases involve systematic scenario analysis and validation.

Hazard Analysis for Functional Insufficiencies

Before any V&V activity, teams must identify functional insufficiencies through a structured analysis, typically by extending the existing ISO 26262 HARA (Hazard Analysis and Risk Assessment) to include performance-based hazards. This means asking not only “what happens if this function fails?” but also “what happens if this function works but performs at the boundary of acceptability?”

For AI-based perception systems, this includes analyzing training data coverage, known distribution shifts, corner cases in labeling, and architectural limitations that produce systematic errors rather than random ones.

Verification and Validation Strategy

SOTIF V&V is not reducible to pass/fail functional tests. The standard calls for a multi-method approach:

Scenario-based simulation: Parameterized scenario spaces are explored systematically—varying lighting, weather, road geometry, object characteristics, sensor placement—to identify triggering conditions and characterize functional insufficiency boundaries.
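A parameter sweep of this kind can be sketched in a few lines. The `detection_score` function below is a toy stand-in for a simulator plus perception stack, and the factor values and the 0.4 threshold are made-up numbers chosen only so the example runs; none of them come from the standard or from a real system.

```python
from itertools import product

# Discretised scenario parameters; each value maps to an assumed degradation factor.
LIGHTING = {"day": 1.0, "dusk": 0.6, "night": 0.3}
RAIN = {"none": 1.0, "light": 0.85, "heavy": 0.6}
CONTRAST = {"high": 1.0, "low": 0.7}  # pedestrian against background

THRESHOLD = 0.4  # assumed minimum acceptable detection score


def detection_score(lighting, rain, contrast):
    """Toy placeholder for running one scenario through simulation."""
    return LIGHTING[lighting] * RAIN[rain] * CONTRAST[contrast]


def sweep():
    """Visit every grid point and collect those that fall below the threshold."""
    failing = []
    for point in product(LIGHTING, RAIN, CONTRAST):
        if detection_score(*point) < THRESHOLD:
            failing.append(point)
    return failing


candidates = sweep()
print(len(candidates), "candidate triggering conditions")
print(("dusk", "light", "low") in candidates)
```

A full grid grows multiplicatively with every added parameter, so real campaigns usually replace exhaustive enumeration with sampling or search once the space has more than a handful of dimensions.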

Structured field testing: Targeted real-world test coverage of scenarios identified as high-risk through simulation and analysis. Not random miles. Planned coverage of Zone 1 scenarios.

Statistical argumentation: Because Zone 2 cannot be fully enumerated, SOTIF safety cases often require statistical arguments about residual risk—quantitative claims that the remaining unknown unsafe scenarios pose acceptably low risk given a defined ODD and exposure estimate.
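One common building block for such an argument is the zero-event bound: if hazardous events are assumed to follow a Poisson process and none are observed over a given exposure, the event rate has an upper confidence bound of -ln(1 - C) / exposure. The sketch below applies that textbook result. It is an illustration under those assumptions, not a method the standard is claimed to prescribe, and it is only meaningful if the test exposure is representative of the ODD.

```python
import math


def rate_upper_bound(exposure_hours, confidence=0.95):
    """Upper bound on the hazardous-event rate after zero events were observed."""
    return -math.log(1.0 - confidence) / exposure_hours


def required_exposure(target_rate_per_hour, confidence=0.95):
    """Event-free exposure needed to claim the rate is below the target."""
    return -math.log(1.0 - confidence) / target_rate_per_hour


# 1,000 event-free hours supports a bound of roughly 3 events per 1,000 hours.
print(rate_upper_bound(1_000))

# Demonstrating a hypothetical target of 1e-6 per hour needs about 3 million hours.
print(required_exposure(1e-6))
```

The second number is why field testing alone rarely closes a residual-risk argument: the required exposure scales inversely with the target rate, which pushes teams toward simulation and targeted scenario coverage.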

Coverage metrics: Unlike traditional software testing, where 100% branch coverage is a meaningful target, SOTIF requires coverage of scenario space—which is continuous, multi-dimensional, and never truly exhaustible. Coverage argumentation is a core competency teams need to develop.
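For a discretised scenario space, one simple and widely used proxy is pairwise coverage: the fraction of all two-parameter value combinations exercised by at least one test. The sketch below computes it. The parameter names and test list are illustrative, and pairwise coverage is offered here as one possible metric, not one the standard mandates.

```python
from itertools import combinations, product


def pairwise_coverage(space, tested):
    """Fraction of all (param=value, param=value) pairs hit by at least one test."""
    required = set()
    for a, b in combinations(sorted(space), 2):
        for va, vb in product(space[a], space[b]):
            required.add(((a, va), (b, vb)))

    covered = set()
    for test in tested:
        for a, b in combinations(sorted(test), 2):
            covered.add(((a, test[a]), (b, test[b])))

    return len(covered & required) / len(required)


space = {
    "lighting": ["day", "dusk", "night"],
    "rain": ["none", "light"],
    "contrast": ["high", "low"],
}

tested = [
    {"lighting": "day", "rain": "none", "contrast": "high"},
    {"lighting": "dusk", "rain": "light", "contrast": "low"},
]

print(pairwise_coverage(space, tested))  # 6 of 16 pairs covered
```

A metric like this says nothing about continuous parameters between the chosen bins, which is part of why coverage has to be argued rather than simply measured.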

Complementary Relationship with ISO 26262

The two standards are designed to work together, not compete. ISO 26262 handles:

  • Random hardware failures (semiconductor faults, connector degradation)
  • Systematic software errors introduced during development
  • Fault detection and fault tolerance mechanisms

SOTIF handles:

  • Insufficient performance under nominal operation
  • Hazards from triggering conditions in the environment
  • Edge case coverage and residual risk within a bounded ODD

A complete safety argument for an ADAS feature or autonomous driving function requires both standards to be applied. Gaps in either create safety arguments with holes that regulators, certification bodies, and incident investigators will find.


Why SOTIF Is Especially Hard for AI Systems

The SOTIF problem existed before machine learning—a traditional rule-based lane-keeping algorithm still had performance boundaries. But AI-based systems amplify every SOTIF challenge.

Non-determinism and distribution shift. Neural networks trained on a finite dataset generalize imperfectly. The triggering conditions that cause performance degradation are often not enumerable from first principles—they emerge from the gap between training distribution and deployment distribution. A model that performs at 99.5% mAP on a benchmark may underperform significantly on scenarios not represented in its training data.

Opaque failure modes. Rule-based systems fail in ways that engineers can usually trace. Neural networks fail in ways that can be structurally invisible—the system produces a confident, plausible-looking output that is wrong in a safety-relevant way, with no internal signal that anything unusual happened.

Difficulty of scenario space coverage. The number of distinct triggering conditions for a visual perception system running in urban environments is astronomically large. Systematic coverage requires parameterized simulation, adversarial testing, synthetic data generation, and statistical sampling strategies that most teams are still developing.

Requirement specification incompleteness. You cannot write an exhaustive requirements specification for a machine learning model’s performance across all possible inputs. SOTIF does not eliminate this problem, but it does force teams to be explicit about what they know and do not know.


Managing SOTIF Requirements and Scenario Traceability

SOTIF compliance produces a specific kind of engineering artifact burden: large numbers of scenarios, each linked to functional insufficiencies, triggering conditions, risk assessments, V&V methods, and residual risk arguments. The web of traceability between these artifacts is not optional—it is the safety case.

Managing this in a document-based tool or a spreadsheet is a category error. You end up with a static requirements traceability matrix (RTM) that captures a snapshot of a system that is constantly changing. Scenarios added during simulation campaigns need to trace back to requirements. Design changes need to propagate to affected hazard assessments. New triggering conditions need to be evaluated against existing V&V coverage.

This is where platforms built on graph-based, model-centric architectures have a meaningful structural advantage.

Flow Engineering (flowengineering.com) is designed specifically for hardware and systems engineering teams managing this kind of interconnected requirements and traceability problem. Its graph-based data model makes it natural to represent SOTIF artifacts—functional insufficiencies, triggering conditions, scenarios, hazard assessments, V&V results—as nodes with typed relationships rather than rows in a table or paragraphs in a document.

For SOTIF specifically, this means teams can trace a specific scenario (e.g., pedestrian at dusk in light rain) directly to the functional insufficiency it triggers (perception threshold degradation under low-contrast conditions), to the hazard it creates (failure to initiate emergency braking), to its risk assessment (including any ASIL assigned to the related hazard under ISO 26262), to the V&V method used to address it, and to the test evidence that closes the loop. When a simulation campaign identifies a new class of triggering condition, engineers can assess which existing requirements and test cases are affected across the whole system, without manually auditing document sets.
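To make the idea of typed relationships concrete, here is a generic sketch of a traceability graph with a breadth-first impact query. It does not represent Flow Engineering's data model or API; the node labels, relation names, and functions are all hypothetical.

```python
from collections import deque

# (source, relation, target) triples; labels are illustrative placeholders.
EDGES = [
    ("TC: dusk + light rain", "triggers", "FI: low-contrast detection degradation"),
    ("FI: low-contrast detection degradation", "leads_to", "HZ: no emergency braking"),
    ("HZ: no emergency braking", "addressed_by", "VV: dusk/rain simulation campaign"),
    ("VV: dusk/rain simulation campaign", "evidenced_by", "EV: test report (placeholder)"),
]


def build_graph(edges):
    """Adjacency map from each artifact to its outgoing (relation, target) pairs."""
    graph = {}
    for src, rel, dst in edges:
        graph.setdefault(src, []).append((rel, dst))
    return graph


def impacted(graph, start):
    """All artifacts reachable downstream of `start`, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for _rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


graph = build_graph(EDGES)
for artifact in impacted(graph, "FI: low-contrast detection degradation"):
    print(artifact)
```

The same traversal run from a changed requirement or a newly found triggering condition gives the list of hazard assessments and test evidence that need review, which is the query a static matrix makes expensive.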

Flow Engineering’s AI-native design means teams can also work interactively with scenario generation and coverage gap analysis in ways that document-based tools like IBM DOORS or Jama Connect were not built to support. Those tools have genuine strengths in regulated workflow management and change control; they were built for requirements in stable, well-specified systems. SOTIF’s scenario space is neither stable nor fully enumerable—which is precisely where a more dynamic, connected data model earns its place.


Practical Starting Points

If your team is beginning SOTIF work or formalizing an existing informal process, three priorities matter most:

1. Define your ODD precisely and treat it as a living artifact. Vague ODD definitions produce vague SOTIF arguments. The ODD should be a structured, versioned specification—geographic constraints, speed ranges, weather conditions, time of day, road types—that is linked to your hazard analysis.

2. Invest in scenario taxonomy early. Before you can manage Zone 2, you need a structured way to categorize scenarios by triggering condition type, functional domain, and risk level. This taxonomy becomes the skeleton of your coverage argument.

3. Build traceability infrastructure before you need it. The cost of retrofitting traceability into a mature SOTIF program is high. Start with a tool architecture that treats scenarios, requirements, and V&V evidence as connected artifacts—not separate documents.


Honest Assessment

SOTIF is a hard standard to apply rigorously. The core challenge—managing risk from scenarios you have not yet imagined—does not yield to checklists. The standard itself acknowledges this: the concept of residual risk from unknown unsafe scenarios requires teams to make probabilistic safety arguments under irreducible uncertainty.

But the alternative—shipping perception-dependent ADAS and autonomous functions without a systematic process for bounding unknown unsafe scenarios—is not an engineering option. It is a liability position.

ISO 21448 gives teams a framework. Applying it with rigor requires the right tooling, the right V&V strategy, and the discipline to treat scenario coverage as an engineering deliverable with the same status as code coverage or hardware reliability metrics.

The engineers who internalize that discipline early will write better safety cases, ship safer systems, and spend less time in post-incident reconstruction trying to explain why a scenario nobody anticipated turned out to matter.