Nuro: Navigating Autonomous Vehicle Regulation as a Last-Mile Delivery Pioneer

How the first federally exempted self-driving vehicle reshaped how the industry thinks about safety cases for machines that carry cargo, not people

When the National Highway Traffic Safety Administration (NHTSA) granted Nuro the first federal exemption from Federal Motor Vehicle Safety Standards (FMVSS) ever issued for a driverless vehicle in February 2020, the decision was widely covered as a business story: a startup winning regulatory permission to operate commercially. That framing missed the engineering significance entirely.

The FMVSS exemption was not a permission slip. It was a forced articulation of a new class of safety argument: one that says a vehicle can be demonstrably safer without the structural protections required for human-occupied vehicles, provided the system’s behavioral and operational boundaries are designed correctly. That argument had never been made before at the federal level, and making it required Nuro’s engineering and regulatory teams to construct a safety case from first principles. The result is one of the more instructive examples in recent autonomous systems history of what requirements engineering looks like when there is no prior art.

The Regulatory Starting Point: What FMVSS Actually Requires

FMVSS is a body of standards written for vehicles that carry people. The standards mandate occupant crash protection (airbags, seat belts, structural crush zones), field of vision requirements for human drivers, steering column collapse specifications, and dozens of other provisions whose design rationale is a human body sitting inside the vehicle.

Nuro’s R1 and R2 vehicles were designed to carry groceries, not people. No seats. No steering wheel. No occupant compartment. This meant that literal compliance with FMVSS was either impossible (some standards require components that physically cannot exist in a vehicle with no interior) or nonsensical (a crush zone that protects an empty cargo bay does nothing for human safety).

Nuro petitioned for exemption on the grounds that the vehicle’s operational constraints (low speeds, geofenced areas, no human occupants) created a materially different risk profile from that of a passenger vehicle. The safety argument was not “we are exempt from safety”; it was “the specific safety properties that matter for this vehicle are different from those that FMVSS encodes.”

NHTSA granted a two-year exemption for up to 5,000 vehicles. The grant letter was careful to note that the exemption was contingent on Nuro’s ongoing reporting obligations and that it did not constitute endorsement of Nuro’s safety approach. That caveat matters: regulatory permission and a validated safety case are not the same thing.

Reframing the Safety Problem: From Structure to Behavior

Traditional vehicle safety engineering is substantially a structural discipline. FMVSS reflects this: crashworthiness, occupant kinematics, restraint systems, and intrusion resistance are all properties of physical materials and geometry. You can validate them with physical tests. NCAP ratings exist because you can crash a car into a wall and measure the outcome.

For Nuro, the dominant safety questions are behavioral. The vehicle will interact with pedestrians, cyclists, and other vehicles. It will encounter novel road conditions, edge-case sensor scenarios, and unexpected actor behavior. When something goes wrong, the failure mode is overwhelmingly likely to be a perception error, a decision error, or an edge case outside the operational design domain — not a structural failure of the vehicle body.

This shifts the load-bearing center of the safety case from ISO 26262 (functional safety, which addresses systematic and random hardware failures) to ISO 21448, better known as SOTIF — Safety Of The Intended Functionality. SOTIF was designed precisely for this class of problem: hazards that arise not because something failed, but because the system is doing exactly what it was designed to do in a situation it was not designed to handle.

SOTIF analysis requires the engineering team to enumerate the triggering conditions that could lead the system to behave in an unsafe way, even when operating within its nominal parameters. For a perception system, this means asking: under what conditions does the sensor stack produce outputs that are technically within spec but semantically wrong? A lidar return that correctly reflects a flat surface but fails to resolve a low-contrast obstacle at a specific distance, under specific lighting. A camera classification model that correctly identifies most pedestrians but performs below threshold on children in unusual clothing.

These are not easy questions to write requirements for. They require behavioral specifications — descriptions of what the system must do, not merely what it must be made of.

The Operational Design Domain as a Requirements Boundary

Nuro’s safety architecture depends heavily on what the autonomous systems industry calls the Operational Design Domain, or ODD: the specific conditions within which the system is designed to function. For Nuro, the ODD is tightly constrained. Low speed (capped at 25 mph on the R2). Defined geographic areas. Specific weather exclusion conditions. Daylight and limited nighttime operation windows.

The ODD is not merely a marketing claim about where the vehicle operates. It is a requirements boundary. Every behavioral requirement in Nuro’s safety case is implicitly or explicitly conditioned on ODD membership. A requirement like “the vehicle shall yield to pedestrians entering the roadway” has a hidden precondition: the vehicle must be operating within its defined ODD for the sensing and decision systems that implement that behavior to be validated as reliable.

This creates a systems engineering challenge that is easy to underestimate: ODD boundary detection is itself a safety-critical function. The vehicle must be able to recognize when it is approaching the boundary of its operational envelope and initiate a safe stop before it operates outside validated conditions. This requires requirements for weather sensing, road condition assessment, GPS integrity monitoring, and behavioral fallback states — none of which appear in FMVSS, because FMVSS assumes a human driver will make those judgments.

Writing those requirements is not straightforward. The ODD boundary is not always a crisp Boolean condition. Fog density varies continuously. Road conditions degrade gradually. The engineering team must decide where on these continua the system crosses from “within ODD” to “outside ODD,” and those decisions must be traceable to a safety argument that a regulator can evaluate.
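One way to make a continuous condition tractable is to define explicit exit and re-entry thresholds with hysteresis, so that the "within ODD" decision is both deterministic and traceable. The sketch below is illustrative only: the state names, the use of a single visibility measurement, and every threshold value are assumptions made for this example, not a description of Nuro's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class OddState(Enum):
    WITHIN = "within_odd"
    MARGINAL = "marginal"    # still inside the ODD, but close to its edge
    OUTSIDE = "outside_odd"  # fallback (safe stop) behavior is required


@dataclass(frozen=True)
class VisibilityLimits:
    # Hypothetical thresholds in meters; real values would be derived
    # from, and traced to, the safety argument.
    exit_below_m: float = 100.0      # below this, the vehicle is outside the ODD
    marginal_below_m: float = 150.0  # below this, precautionary behavior begins
    reenter_above_m: float = 180.0   # visibility must recover past this to clear


def next_odd_state(current: OddState, visibility_m: float,
                   limits: VisibilityLimits = VisibilityLimits()) -> OddState:
    """Map a continuous visibility reading onto a discrete ODD state."""
    if visibility_m < limits.exit_below_m:
        return OddState.OUTSIDE
    if visibility_m >= limits.reenter_above_m:
        return OddState.WITHIN
    # Between the exit and re-entry thresholds, hold any degraded state
    # so the decision does not flap as fog density fluctuates.
    if current is OddState.OUTSIDE:
        return OddState.OUTSIDE
    if current is OddState.MARGINAL or visibility_m < limits.marginal_below_m:
        return OddState.MARGINAL
    return OddState.WITHIN


state = OddState.WITHIN
for reading in (200.0, 140.0, 160.0, 90.0, 170.0, 185.0):
    state = next_odd_state(state, reading)
    print(f"{reading:6.1f} m -> {state.value}")
```

The gap between the exit and re-entry thresholds is itself a design decision that needs a rationale: too narrow and the vehicle oscillates between states, too wide and it stays stopped long after conditions have recovered.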

SOTIF in Practice: The Behavioral Requirements Challenge

The specific difficulty SOTIF introduces for requirements engineering is that it demands specifications at a level of behavioral granularity that traditional hardware requirements processes were not designed to handle.

Consider a conventional automotive requirement: “The brake system shall bring the vehicle from 60 mph to 0 in no more than 150 feet under dry conditions.” This is testable. It has a measurable acceptance criterion. You run the test, you get a number, you compare it to the threshold.

A SOTIF-derived behavioral requirement might read: “The perception system shall detect and classify stationary obstacles of cross-section greater than 0.1 m² at a distance of no less than 30 meters under all valid ODD lighting and weather conditions.” This is also testable in principle — but the test space is enormous. “All valid ODD lighting conditions” spans a continuous range. Validation requires statistical arguments, scenario-based testing, and simulation coverage in addition to physical test runs.
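To give a sense of scale, the sketch below sizes a simple zero-failure demonstration test: how many consecutive successful detections are needed to support a claimed minimum detection rate at a given confidence level, repeated across a discretized ODD. It assumes independent, identically distributed trials, which real scenario testing rarely delivers, and the lighting and weather bins are hypothetical placeholders rather than anything drawn from an actual safety case.

```python
import math
from itertools import product


def zero_failure_trials(min_detection_rate: float, confidence: float) -> int:
    """Smallest n such that n failure-free trials support the claim
    P(detect) >= min_detection_rate at the given confidence level.

    Derivation: if the true rate were below the claim, the chance of n
    straight successes would be under rate**n, so require
    rate**n <= 1 - confidence.
    """
    if not (0.0 < min_detection_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(min_detection_rate))


# Hypothetical discretization of "all valid ODD lighting and weather conditions".
LIGHTING = ("daylight", "dusk", "night_streetlit")
WEATHER = ("clear", "overcast", "light_rain")


def validation_plan(min_detection_rate: float, confidence: float) -> dict:
    """Required failure-free trials for every lighting/weather bin."""
    n = zero_failure_trials(min_detection_rate, confidence)
    return {bin_: n for bin_ in product(LIGHTING, WEATHER)}


plan = validation_plan(min_detection_rate=0.999, confidence=0.95)
print(len(plan), "bins,", sum(plan.values()), "failure-free trials in total")
```

Under these assumptions, a 99.9% detection claim at 95% confidence needs 2,995 failure-free trials in each bin, and nine bins is a very coarse partition of a continuous condition space. That arithmetic is why simulation coverage and scenario-based arguments carry so much of the validation load.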

More challenging still are the requirements for edge case recognition and behavioral degradation. “The system shall recognize when its perception confidence is below threshold and initiate safe-stop behavior” requires defining what “perception confidence below threshold” means operationally — a definition that requires both a technical metric and a safety argument linking that metric to actual risk reduction.
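A minimal sketch of what an operational definition could look like follows. It treats "below threshold" as a run of consecutive low-confidence frames and latches the safe-stop request once triggered. The confidence scale, the threshold of 0.7, and the three-frame window are all hypothetical; in a real safety case each would need its own justification linking it to risk reduction.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceMonitor:
    """Requests a safe stop after sustained low perception confidence."""
    threshold: float = 0.7    # hypothetical minimum acceptable confidence
    max_low_frames: int = 3   # hypothetical tolerance for transient dips
    low_streak: int = 0
    safe_stop_latched: bool = False

    def update(self, confidence: float) -> bool:
        """Feed one frame's confidence; return True if a safe stop is requested."""
        if self.safe_stop_latched:
            return True  # latched: only an explicit reset procedure may clear it
        if confidence < self.threshold:
            self.low_streak += 1
        else:
            self.low_streak = 0
        if self.low_streak >= self.max_low_frames:
            self.safe_stop_latched = True
        return self.safe_stop_latched


monitor = ConfidenceMonitor()
for c in (0.9, 0.6, 0.65, 0.8, 0.5, 0.4, 0.3, 0.95):
    print(c, monitor.update(c))
```

Even this toy version exposes the questions the requirement has to answer: how long a dip is tolerable at the vehicle's maximum speed, and who or what is permitted to clear the latch.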

Nuro’s engineering teams have had to develop requirements disciplines that bridge hardware specification, behavioral specification, and statistical validation. This is not a solved problem in the industry. It is an area where the tooling, the processes, and frankly the regulatory frameworks are still catching up to the engineering reality.

What’s Actually Happening vs. What the Hype Suggests

The autonomous vehicle industry has spent a decade promising commercial-scale deployment timelines that were wrong by factors of three to five. Nuro has not been immune to this. The company scaled back operations and reduced headcount significantly in 2023, reflecting both the genuine difficulty of the technical problem and the capital intensity of operating a physical fleet.

The honest assessment is that Nuro’s engineering and regulatory work is real and substantive, but the path from “first federal exemption” to “profitable last-mile delivery network” involves a compounding series of systems engineering problems that each require the kind of first-principles work the FMVSS exemption demanded.

The regulatory environment has evolved but not resolved. NHTSA has published Automated Driving Systems (ADS) guidance and proposed the AV STEP program, but the United States still lacks a comprehensive federal framework for ADS certification. This means companies like Nuro are simultaneously doing engineering work and regulatory theory: constructing safety arguments in a framework that does not yet have agreed standards for what constitutes sufficient evidence.

That is not a criticism of Nuro specifically. It reflects the genuine state of the field. The industry is ahead of the standards bodies, and the standards bodies are ahead of the regulators. Everyone is working with incomplete maps.

Practical Implications for Systems Engineers

What does Nuro’s experience tell practicing systems engineers about AV development more broadly?

Behavioral requirements need first-class treatment. The instinct in hardware engineering is to decompose system requirements into component specifications as quickly as possible. For ADS development, behavioral requirements at the system level must be maintained, traced, and validated independently — not dissolved into component specs that lose the safety-relevant context.

The ODD is a systems engineering artifact, not a product decision. Where your system operates is a requirements boundary condition, not a market positioning choice. It needs to be specified with the same rigor as any other system boundary, and changes to it require full safety re-analysis.

SOTIF analysis generates requirements, not just tests. Many teams treat SOTIF as a V&V activity — something you do after the design is complete to check for edge cases. The more productive approach is to use SOTIF analysis upstream, during requirements development, to identify the behavioral specifications that the system must meet in order to bound the scenario space to a manageable risk level.

Traceability from safety argument to requirement to implementation is non-negotiable. When your safety case is behavioral rather than structural, you cannot validate it with a single physical test. The argument is distributed across thousands of requirements, test results, and simulation runs. Without traceable links between the safety argument and the evidence, the safety case is not a case — it is a collection of documents.
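The traceability point can be made concrete with a small graph check. The sketch below walks trace links from each safety goal down to verification evidence and reports any branch that ends without evidence. The ID scheme (SG, REQ, TEST, SIM prefixes) and the link data are invented for illustration; real tooling would read them from a requirements database.

```python
from collections import defaultdict

# Hypothetical trace links: (parent, child) means the parent is refined
# or verified by the child.
LINKS = [
    ("SG-1", "REQ-10"),
    ("SG-1", "REQ-11"),
    ("REQ-10", "TEST-100"),
    ("REQ-11", "SIM-200"),
    ("SG-2", "REQ-20"),  # REQ-20 has no verification evidence yet
]


def is_evidence(node: str) -> bool:
    """Assumed convention: test and simulation records count as evidence."""
    return node.startswith(("TEST-", "SIM-"))


def trace_gaps(links, goals):
    """Return {goal: [nodes where the trace ends without evidence]}."""
    children = defaultdict(list)
    for parent, child in links:
        children[parent].append(child)

    gaps = {}
    for goal in goals:
        stack, seen, dangling = [goal], set(), []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            kids = children.get(node, [])
            if not kids and not is_evidence(node):
                dangling.append(node)  # dead end that is not evidence
            stack.extend(kids)
        if dangling:
            gaps[goal] = sorted(dangling)
    return gaps


print(trace_gaps(LINKS, ["SG-1", "SG-2"]))
```

A check like this says nothing about whether the evidence is adequate, only whether the chain exists. That is still the precondition for a reviewer being able to evaluate the argument at all.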

Modern requirements tools that support graph-based traceability and AI-assisted specification are becoming meaningful here. Platforms like Flow Engineering, designed specifically for hardware and systems engineering teams working on complex multi-domain problems, are beginning to address the challenge of maintaining coherent traceability across large behavioral requirement sets — connecting safety goals to system functions to verification evidence in ways that legacy document-based tools struggle to support. The behavioral requirements challenge that Nuro exemplifies is exactly the problem domain where connected, model-aware tooling earns its cost.

Honest Assessment

Nuro’s regulatory achievement in 2020 was genuine. Constructing a novel safety case, persuading a federal regulator to accept it, and establishing precedent for an entire class of vehicles was real engineering and policy work, not a shortcut.

The harder truth is that a federal exemption is an entry ticket, not a finish line. The systems engineering required to actually deploy a safe autonomous delivery vehicle at scale — to write the behavioral requirements, bound the operational design domain, validate SOTIF scenarios, and maintain traceability across a system of this complexity — is ongoing, expensive, and unsolved at the industry level.

Nuro’s story is worth studying not because they have figured it out, but because they were first forced to confront directly what “figuring it out” actually requires. Every autonomous systems team developing a vehicle, robot, or aircraft with a safety case that depends on behavioral rather than structural arguments is working in the territory Nuro entered first. The map is still being drawn.
