What Is SOTIF? A Complete Guide to ISO 21448 and Safety of the Intended Functionality

Why correct behavior can still kill people

In May 2016, a Tesla Model S operating under Autopilot drove into the side of a tractor-trailer at highway speed. The camera failed to distinguish the white trailer body against a brightly lit sky, and the radar return was apparently discounted as an overhead structure. No sensor malfunctioned. No software crashed. The system functioned exactly as designed—and a man died.

This is the class of accident that ISO 21448, known as SOTIF (Safety of the Intended Functionality), was created to prevent. The standard was published in 2022 after years of technical debate inside ISO TC 22/SC 32, and it addresses a category of risk that the automotive industry’s primary functional safety standard, ISO 26262, was never intended to handle: hazards caused not by system failures but by the system working correctly under conditions its designers did not adequately anticipate or specify.

Understanding SOTIF requires first understanding exactly why the older framework was insufficient.


What ISO 26262 covers—and what it deliberately leaves out

ISO 26262 is a mature, well-understood standard for functional safety in road vehicles. Its core assumption is that dangerous behavior results from something going wrong: a microcontroller bit flip, a stuck relay, a software exception, a sensor that returns no data. The standard provides rigorous methods—FMEA, FTA, FMEDA—for identifying such failure modes and assigning Automotive Safety Integrity Levels (ASILs) that specify how thoroughly those failures must be prevented or detected.

The key word is failure. ISO 26262 defines a safety goal violation as something that happens because a component or system did not perform its intended function. The standard explicitly excludes hazards caused by the correct execution of an intended function that is itself insufficient for the situation.

For a conventional braking system or an electric power steering column, this exclusion is acceptable. The intended functionality of those systems is well-understood, physically bounded, and specified with enough determinism that a competent FMEA can enumerate credible failures.

For a forward collision warning system that must recognize pedestrians in rain, at night, at oblique angles, against complex backgrounds, at varying ranges—the exclusion is a chasm. The system can work perfectly according to its design specification and still fail to detect a child stepping off a curb. No fault occurred. The intended functionality was simply not good enough for that context.

SOTIF closes that gap.


The four-area model: the conceptual heart of ISO 21448

SOTIF introduces a two-dimensional framework that is conceptually simple but practically demanding. It partitions all possible system behavior and operating scenarios into four areas:

Area 1: Known safe behavior. Scenarios where the system performs adequately and this adequacy is verified. This is the validated operating envelope.

Area 2: Known unsafe behavior. Scenarios where the system is known to perform inadequately, causing or contributing to hazards. These are documented and addressed through design improvements, ODD restrictions, or warnings.

Area 3: Unknown unsafe behavior. Scenarios where the system performs inadequately, but the development organization does not yet know this. This is the dangerous region—it is where accidents happen.

Area 4: Unknown safe behavior. Scenarios where the system happens to perform safely, but this has not been verified. Benign but unaudited.

The engineering objective of SOTIF is to reduce Areas 2 and 3 to an acceptable level. Area 2 is addressed by fixing known problems or constraining the ODD. Area 3 is addressed through systematic scenario generation, simulation, and testing designed to surface unknown failure modes before deployment.
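The four-area partition is at bottom a two-axis classification: whether a scenario's behavior is known, and whether it is safe. A minimal sketch in Python (names are illustrative, not taken from the standard):

```python
from enum import Enum

class SotifArea(Enum):
    """The four SOTIF areas from ISO 21448."""
    KNOWN_SAFE = 1      # validated operating envelope
    KNOWN_UNSAFE = 2    # documented hazards awaiting resolution
    UNKNOWN_UNSAFE = 3  # the dangerous region
    UNKNOWN_SAFE = 4    # benign but unaudited

def classify(known: bool, safe: bool) -> SotifArea:
    """Map a scenario's (known, safe) status onto its SOTIF area."""
    if known:
        return SotifArea.KNOWN_SAFE if safe else SotifArea.KNOWN_UNSAFE
    return SotifArea.UNKNOWN_SAFE if safe else SotifArea.UNKNOWN_UNSAFE
```

In these terms, the SOTIF objective is to migrate scenarios out of areas 2 and 3: resolve or exclude known-unsafe scenarios, and convert unknown scenarios into known ones through testing.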

The word “acceptable” in that objective carries most of the practical difficulty. SOTIF does not define a specific threshold. It requires that residual risk be evaluated as acceptable using reasoned argument and supporting evidence—a posture that is familiar to anyone who has worked with safety cases under UK DEF STAN 00-056 or the ALARP principle, but that is uncomfortable for engineers trained on the more prescriptive requirements of 26262.


Why SOTIF was specifically created for ADAS and autonomous systems

SOTIF’s primary target domain is Level 1 through Level 5 driving automation, with particular emphasis on perception-dependent ADAS functions: automatic emergency braking, adaptive cruise control, lane keeping assistance, and traffic sign recognition.

These systems share three properties that create SOTIF-class risks:

Context sensitivity. The correct output depends on environmental factors—lighting, weather, road geometry, object classification—that cannot be fully enumerated at design time and that vary continuously during operation.

Statistical performance. Unlike a relay that either closes or does not, a neural-network-based pedestrian detector has a detection rate and a false positive rate that are functions of the input distribution. Specifying this behavior in classical “shall” requirements is genuinely difficult.

Human-automation interaction. Drivers using ADAS features develop mental models of what the system can do, sometimes correctly and sometimes not. A system can be misused in ways that cause accidents even when it operates within its design limits. SOTIF addresses this through analysis of reasonably foreseeable misuse, which has no direct analog in ISO 26262.

The standard also introduces the concept of triggering conditions: specific inputs or scenarios that cause a system with no faults to produce inadequate output. Identifying triggering conditions is the analytical core of the SOTIF hazard analysis, and it is substantially harder than traditional FMEA because it requires reasoning about what the system should do, not just what it might fail to do.


The SOTIF process flow

ISO 21448 organizes its process into four main phases, which interact iteratively rather than executing as a strict waterfall:

Phase 1: Specification of intended functionality and understanding of potential hazards. This begins with a function description precise enough to identify what constitutes adequate versus inadequate performance. It includes definition of performance metrics, operational context, and the initial enumeration of hazardous events that inadequate performance could cause.

Phase 2: Evaluation of known hazardous behavior (Area 2). Using the triggering conditions identified in Phase 1, engineers analyze known scenarios where the system underperforms. This involves simulation, hardware-in-the-loop testing, and scenario databases. The output is either design changes, ODD restrictions, or documented acceptance rationale.

Phase 3: Evaluation of unknown hazardous behavior (Area 3). This phase attempts to surface what is not yet known. Methods include coverage-driven simulation with adversarial scenario generation, real-world data collection, and formal techniques such as combinatorial testing. Statistical arguments about scenario coverage are central to demonstrating that Area 3 has been sufficiently reduced.

Phase 4: Evaluation of the complete system in operation. Post-deployment monitoring, field data analysis, and a defined process for incorporating field-discovered triggering conditions back into the development cycle.

These phases are supported by a SOTIF safety case: a structured argument that the residual risk from Areas 2 and 3 is acceptable. The safety case is the artifact regulators and customers examine; the process phases are the evidence-generating machinery that supports it.
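Phase 3's coverage-driven methods start from something as simple as enumerating combinations of discretized ODD parameters. A toy sketch (parameter names and values are invented for illustration):

```python
from itertools import product

# Hypothetical discretized ODD parameters for a pedestrian-AEB function.
odd_parameters = {
    "lighting":  ["day", "dusk", "night"],
    "weather":   ["clear", "rain", "fog"],
    "ped_speed": ["standing", "walking", "running"],
    "approach":  ["crossing_90deg", "oblique_45deg", "along_road"],
}

def generate_scenarios(params: dict) -> list:
    """Full cross product of parameter values: each combination is one
    candidate test scenario."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]

scenarios = generate_scenarios(odd_parameters)
print(len(scenarios))  # 3 * 3 * 3 * 3 = 81
```

Four parameters with three values each already yield 81 scenarios; real ODDs have dozens of parameters, which is why practical programs rely on pairwise covering arrays, importance sampling, and adversarial generation rather than full enumeration.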


The Operational Design Domain: the boundary that makes analysis tractable

A system exposed to all possible driving conditions globally, across all weather, all road types, all traffic configurations, all lighting conditions, presents an effectively unbounded scenario space. SOTIF hazard analysis would be computationally and evidentially intractable without some means of bounding that space.

The Operational Design Domain (ODD) provides that boundary. An ODD precisely specifies the conditions under which a given driving automation feature is designed to operate: geographic region, road type, speed range, weather conditions, time of day, presence of lane markings, and so on. SAE J3016, which defines the levels of driving automation, introduced the ODD concept; SOTIF uses it as the primary input to hazard analysis.

The relationship between ODD and SOTIF is tighter than it might appear. Every parameter in the ODD is a boundary on what triggering conditions are in scope. Exclude nighttime operation from the ODD, and night-specific detection failures move from Area 2 (known unsafe behavior requiring resolution) to outside-ODD operation (which requires separate handling and robust ODD exit mechanisms, but is not subject to the same SOTIF analysis).

This creates a powerful design lever—and a significant responsibility. ODD restrictions that exist primarily to make the safety case tractable, rather than because the system genuinely monitors and exits the ODD correctly, transfer risk to the driver and to the public without eliminating it. SOTIF requires that ODD transitions be handled safely and that the system’s behavior outside the ODD be explicitly considered.
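One way to honor that responsibility is to make the ODD machine-checkable: encode its parameters as bounded, measurable fields and evaluate current conditions against them at runtime. A sketch under assumed field names and thresholds (none of them prescribed by the standard):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Odd:
    """Illustrative ODD for a daylight, highway-only feature."""
    road_types: frozenset
    max_speed_kph: float
    min_visibility_m: float
    daylight_only: bool

@dataclass(frozen=True)
class Conditions:
    """Currently measured operating conditions."""
    road_type: str
    speed_kph: float
    visibility_m: float
    is_daylight: bool

def within_odd(odd: Odd, c: Conditions) -> bool:
    """True only if every ODD boundary is satisfied; any single violation
    means the feature must initiate a safe ODD exit (e.g. take-over)."""
    return (c.road_type in odd.road_types
            and c.speed_kph <= odd.max_speed_kph
            and c.visibility_m >= odd.min_visibility_m
            and (c.is_daylight or not odd.daylight_only))

highway_odd = Odd(frozenset({"divided_highway"}), 130.0, 200.0, True)
print(within_odd(highway_odd, Conditions("divided_highway", 110.0, 500.0, True)))  # True
```

The point of returning a single boolean is that any violated boundary must trigger the same safe ODD-exit behavior; per-parameter diagnostics can be layered on top for monitoring.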


The requirements challenge: specifying statistical and context-sensitive behavior

This is where SOTIF compliance most visibly strains traditional systems engineering practice.

A classical requirement reads: The braking system shall achieve full brake application within 150 ms of receiving a BRAKE command. It is deterministic, testable, and binary: the system either meets it or does not.

A SOTIF-appropriate requirement for a pedestrian automatic emergency braking function might need to express something like: The system shall achieve a detection rate of ≥ 97% for pedestrians crossing at 90 degrees at ranges between 15 and 40 meters under daylight conditions with visibility greater than 200 meters, with a false activation rate of no more than 0.1 events per 1,000 kilometers. That performance envelope must then be connected to the ODD parameters it depends on, to the hazard scenarios it is intended to prevent, and to the test suite that validates it.
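A statistical requirement like the 97% detection rate above also carries an evidential cost that is easy to underestimate. Assuming independent trials and the most optimistic outcome (zero observed misses), the exact binomial bound gives the minimum number of flawless pedestrian encounters needed before the one-sided lower confidence limit reaches the target; a quick sketch:

```python
import math

def min_trials_zero_failures(target_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that n successes in n independent trials puts the
    one-sided lower confidence bound on the true rate at target_rate.
    With zero failures, the bound solves target_rate ** n = 1 - confidence."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(target_rate))

print(min_trials_zero_failures(0.97))   # 99 flawless encounters
print(min_trials_zero_failures(0.999))  # 2995 -- higher targets explode quickly
```

And that is the best case: a single missed detection pushes the required sample size higher still, which is why statistical validation arguments lean so heavily on simulation and scenario coverage rather than road testing alone.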

Writing that requirement in a traditional document-based tool is possible. Tracing it to the ODD definition, linking it to the SOTIF hazard analysis, and then connecting it to test scenarios in a way that is auditable for a safety case—in a document-based tool, this becomes a manual cross-reference nightmare that breaks under change.

The challenge compounds when requirements are conditional: Under fog conditions (visibility < 50 m), the system shall issue a take-over request within 2 seconds of detecting degraded sensor performance. That requirement references an ODD boundary condition, a sensor performance metric, and a human-machine interaction specification. A change to any of those three things potentially invalidates the requirement.
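That coupling can be made concrete with a toy runtime monitor for the fog requirement (the 50 m and 2 s figures come from the requirement above; everything else here is hypothetical):

```python
FOG_VISIBILITY_M = 50.0    # ODD boundary condition from the requirement
TAKEOVER_DEADLINE_S = 2.0  # HMI timing bound from the requirement

def requirement_satisfied(visibility_m: float, sensor_degraded: bool,
                          elapsed_s: float, request_issued: bool) -> bool:
    """Check the conditional requirement at one instant.

    elapsed_s: time since degraded sensor performance was detected.
    The requirement binds only when visibility < 50 m AND the sensor is
    degraded; once the 2 s deadline passes, the take-over request must
    already have been issued."""
    condition_active = visibility_m < FOG_VISIBILITY_M and sensor_degraded
    if not condition_active or elapsed_s < TAKEOVER_DEADLINE_S:
        return True               # requirement not (yet) binding
    return request_issued         # deadline passed: request must be active
```

A change to the visibility boundary, to the definition of degraded sensor performance, or to the HMI timing bound forces a change to this monitor and to every test built on it.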

This is the structural problem SOTIF creates for requirements engineering, and it is a problem that the tool infrastructure at most automotive suppliers was not designed to handle.


How modern tools support SOTIF requirements work

The structured, interconnected nature of SOTIF requirements—ODD-bounded, scenario-linked, statistically expressed, traceable to safety cases—is exactly the kind of problem that benefits from graph-based requirements management rather than document-based approaches.

Flow Engineering (flowengineering.com) is built around this model. Rather than storing requirements as rows in a document, it represents them as nodes in a connected graph, with explicit relationship types: derived-from, bounded-by, validated-by, conflicts-with. An ODD specification is not a separate document that engineers must manually cross-reference against stakeholder requirements; it is a set of nodes to which performance requirements are directly linked.

For SOTIF work specifically, this architecture supports two things that are otherwise operationally painful. First, it makes ODD-bounded requirements first-class objects: a requirement can carry an explicit ODD scope tag, so that when the ODD is revised—as it will be repeatedly through development—the impact on dependent requirements is immediately visible. Second, its AI-native capabilities can assist in identifying when a proposed requirement is ambiguous about its ODD context or when two requirements may conflict given a specific ODD boundary condition, surfacing these issues before they become safety case problems during audit.
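The change-impact idea can be illustrated with a toy traceability graph in plain Python (a generic sketch of graph-based traceability, not Flow Engineering's actual data model; all identifiers are invented):

```python
# Edges: artifact -> artifacts it depends on. The relationship types
# (derived-from, bounded-by, validated-by) are collapsed into a single
# dependency relation for brevity.
depends_on = {
    "REQ-PED-DETECT":   ["ODD-DAYLIGHT", "ODD-RANGE-15-40M", "HAZ-PED-CROSSING"],
    "REQ-FOG-TAKEOVER": ["ODD-VISIBILITY-50M", "SENS-DEGRADATION-METRIC"],
    "TEST-SUITE-PED":   ["REQ-PED-DETECT"],
    "SAFETY-CASE-ARG-3": ["TEST-SUITE-PED", "REQ-FOG-TAKEOVER"],
}

def impacted_by(changed: str) -> set:
    """All artifacts transitively depending on a changed node."""
    hit, frontier = set(), {changed}
    while frontier:
        frontier = {node for node, deps in depends_on.items()
                    if any(d in frontier for d in deps)} - hit
        hit |= frontier
    return hit

print(sorted(impacted_by("ODD-DAYLIGHT")))
# ['REQ-PED-DETECT', 'SAFETY-CASE-ARG-3', 'TEST-SUITE-PED']
```

Revising the daylight ODD node surfaces every downstream requirement, test suite, and safety-case argument that needs review, which is exactly the query a document-based cross-reference table cannot answer reliably under change.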

Flow Engineering’s focus is on the front-end structured requirements and systems architecture work. Teams that need to connect SOTIF requirements directly into test management platforms or formal verification tools will need to evaluate integration paths to those downstream systems, which is a deliberate scope boundary rather than an oversight.

For SOTIF compliance specifically, the highest-leverage investment in tooling is usually at the requirements and hazard analysis level—which is where the most expensive rework happens when ODD definitions change or triggering conditions discovered in Phase 3 cascade back into Phase 1.


Practical starting points for SOTIF implementation

For engineering teams beginning SOTIF compliance work, four actions have disproportionate impact:

Define the ODD with engineering precision before beginning hazard analysis. Qualitative ODD descriptions (“highway driving”) are insufficient. Parameters must be bounded, measurable, and linked to sensor performance data. This work often surfaces capability gaps earlier than any other activity.

Establish performance metrics for intended functionality before writing requirements. SOTIF hazard analysis requires knowing what “adequate” means quantitatively. Teams that skip this step find themselves unable to determine whether identified triggering conditions are in Area 2 or Area 3.

Build the scenario database as a living artifact, not a test plan appendix. Triggering conditions discovered in Phase 3 must feed back into Phase 1 requirements. A scenario database that lives in a spreadsheet disconnected from requirements will not support this loop.

Treat the SOTIF safety case as a requirements management problem, not a documentation problem. The safety case is an argument whose premises are requirements, test results, and analysis. Managing that argument requires traceability infrastructure, not word processing.

SOTIF is demanding precisely because it forces engineering teams to be explicit about what their systems cannot do, and rigorous about the conditions under which those limitations are acceptable. That is uncomfortable. It is also, given the safety record of early ADAS deployment, exactly the right discipline.