The Undersea and Maritime Autonomy Sector’s Quiet Systems Engineering Revolution

Somewhere in the Pacific right now, an autonomous underwater vehicle is navigating without GPS, without a live data link to shore, and without any human in the loop. It is making decisions based on mission logic loaded before dive. If something goes wrong — a flooded sensor bay, an unexpected contact, a dead reckoning drift beyond tolerance — it either handles it or it doesn’t. There is no calling home.

That scenario used to describe a DARPA research project. Increasingly, it describes a production program with a delivery contract, an acceptance test procedure, and a program manager answering to a flag officer about schedule.

The maritime autonomy sector — spanning autonomous underwater vehicles (AUVs), uncrewed surface vessels (USVs), and the emerging class of extra-large uncrewed undersea vehicles (XLUUVs) — has crossed a threshold. The companies and programs that built the technology to work are now being asked to prove it will keep working, under what conditions, with what failure modes, and to whose standards. That is a systems engineering problem. Most of the sector is not yet equipped to solve it.

The Gap Between “It Works” and “We Can Prove It”

Maritime autonomy teams built their reputations on hardware ingenuity and software cleverness. The engineers who developed Bluefin’s lithium-ion energy management, REMUS’s hydrodynamic efficiency, or Saildrone’s wind-propelled wing design are genuinely world-class. The systems engineering discipline to document, trace, and verify their design decisions at program scale is a different competency, one that didn’t matter when you were running a dozen vehicles in research mode and matters enormously when you’re running a hundred in operational service.

The problem surfaces in a few specific ways.

Safety cases in communication-denied environments. An aviation safety case can assume the aircraft is always reachable. A ground vehicle safety case can assume a human can intervene. An AUV operating at 3,000 meters cannot make either assumption. The safety case must address every relevant failure mode with onboard responses only — and it must be demonstrable to a program office or classification society that every response was specified in requirements, designed to meet those requirements, and tested against them. That chain from requirement to design to verification is exactly what most maritime autonomy teams have never had to maintain formally.

Standards that are actively moving. The U.S. Navy’s framework for autonomous systems — primarily expressed through NAVAIR’s AI/ML airworthiness guidance, the Defense Acquisition Guidebook’s treatment of autonomous systems, and program-specific TEMPs — is being adapted for undersea use in real time. DNV’s Autonomous and Remotely Operated Ships (AROS) notation and its companion standards for underwater vehicles are in active revision cycles. IMO’s Maritime Autonomous Surface Ships (MASS) framework is approaching finalization. Programs that designed to last year’s draft may find themselves re-verifying against this year’s published standard.

Multi-domain system complexity. A USV isn’t a boat with software. It’s a platform that integrates hull, propulsion, power management, sensor payloads, communications, navigation, autonomy stack, and cybersecurity — each with its own requirements discipline, each with interfaces to every other. When a sensor payload from one vendor integrates with an autonomy stack from another on a hull from a third, the integration requirements live nowhere unless someone explicitly built a place for them to live. In document-based requirements management, that place is usually a large spreadsheet that is accurate once and wrong forever thereafter.

What Operational Standards Actually Demand

The Navy’s approach to AUV and XLUUV acquisition has tightened considerably since the Orca XLUUV program surfaced its early engineering challenges. Program offices are now requiring more structured Failure Mode and Effects Analysis (FMEA) deliverables, more explicit hazard analysis — drawing on MIL-STD-882E for system safety — and Requirements Verification Traceability Matrices (RVTMs) that can be audited against actual test results, not just planned test procedures.

MIL-STD-882E places specific obligations on programs developing systems with autonomous behaviors. Hazards must be identified, their severity and probability assessed, and mitigation measures traced back to specific design decisions. For an AUV, this means the requirement that governs what the vehicle does when it loses navigation lock must trace forward to the specific autonomy logic that implements the response, and forward again to the specific test that verifies the response behaves as specified. In a document-based system, that trace exists only if someone updates three separate documents in sequence. In practice, that update is often skipped.
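The requirement-to-design-to-test chain described above can be sketched as a small audit script. This is a minimal illustration and not any program's actual tooling: the `Requirement` class, the `audit_trace` function, every ID (`SYS-042`, `TP-113`, and so on), and the requirement wording and timing values are hypothetical. The point it makes is the one above: a trace is only useful if it is checked against recorded test results, not just planned procedures.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """One requirement with its forward links (all IDs are hypothetical)."""
    req_id: str
    text: str
    hazard_id: str                                   # hazard this requirement mitigates
    design_ids: list = field(default_factory=list)   # design elements implementing it
    test_ids: list = field(default_factory=list)     # tests intended to verify it


def audit_trace(requirements, test_results):
    """Return {req_id: [problems]} for every requirement with a broken chain.

    test_results maps test_id -> "pass" or "fail". A test that is planned
    but has not been executed is simply absent from the mapping.
    """
    findings = {}
    for req in requirements:
        problems = []
        if not req.design_ids:
            problems.append("no design element allocated")
        if not req.test_ids:
            problems.append("no verification test linked")
        for test_id in req.test_ids:
            result = test_results.get(test_id)
            if result is None:
                problems.append(f"{test_id} planned but not executed")
            elif result != "pass":
                problems.append(f"{test_id} result is {result}")
        if problems:
            findings[req.req_id] = problems
    return findings


# Illustrative data only: wording and numbers are assumptions, not real requirements.
requirements = [
    Requirement("SYS-042", "On loss of navigation lock, hold depth and begin timed ascent.",
                hazard_id="HAZ-007", design_ids=["AUT-NAVLOSS-01"], test_ids=["TP-113"]),
    Requirement("SYS-043", "On flooded sensor bay, isolate bay power within 2 s.",
                hazard_id="HAZ-011", design_ids=["PWR-ISO-03"], test_ids=["TP-120"]),
    Requirement("SYS-044", "On dead-reckoning drift beyond tolerance, abort mission.",
                hazard_id="HAZ-007", design_ids=[], test_ids=[]),
]
test_results = {"TP-113": "pass"}  # TP-120 is planned but has no recorded result

findings = audit_trace(requirements, test_results)
for req_id, problems in findings.items():
    print(req_id, "->", "; ".join(problems))
```

Run against this sample data, the audit leaves SYS-042 alone and flags the other two: one has a test that was planned but never executed, the other has no design allocation or test at all. Those are the gaps an auditable RVTM is meant to expose before acceptance testing does.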

DNV’s AROS framework, which covers both surface and underwater autonomous vessels in commercial maritime contexts, takes a similar posture. The 2025 revision introduced stronger language around “autonomy transparency” — the requirement that developers be able to explain, to a surveyor’s satisfaction, how the autonomous decision logic handles specified off-nominal conditions. This isn’t theoretical. Classification societies are now asking this question on actual survey visits, and programs that can’t answer it are getting conditional or deferred class notations.

The commercial offshore sector, specifically the oil and gas operators deploying AUVs for pipeline inspection and the offshore wind operators using USVs for array maintenance, faces both DNV requirements and the operators’ own corporate safety cases. For a BP or an Equinor, an autonomous vehicle represents a managed safety risk. Their systems engineering requirements aren’t softer than the Navy’s. In some respects they’re harder, because they involve demonstrating safety to a board-level risk committee that has no technical tolerance for “we think it’s fine.”

How Leading Programs Are Building Engineering Discipline

The teams making the most progress share a few common practices.

They start with system-level requirements before touching hardware. This sounds obvious. It is routinely violated. The discipline of writing a complete operational concept, deriving system requirements from mission-level hazard analysis, and allocating those requirements to subsystems before beginning detailed design is what separates programs that certify on schedule from programs that discover their requirement gaps during acceptance testing.

Saildrone’s approach to its WAVE-class vehicles is instructive. The team built explicit requirements models that trace from mission objectives through system requirements to subsystem specifications, with environmental stress conditions treated as first-class requirement drivers rather than footnotes. When DNV conducted a survey on a WAVE-class vessel, Saildrone’s engineering team could walk surveyors through requirement coverage at each level. That capability doesn’t appear overnight.

They treat the autonomy stack as a requirements-traced subsystem, not a separate engineering concern. The most common structural error in maritime autonomy programs is maintaining a hardware requirements baseline in one system and treating the software autonomy stack as something that gets verified separately — or not verified at all against system-level requirements. When the autonomy stack makes a decision that contradicts what the hardware requirements assumed, the gap is invisible until something fails in a way that embarrasses everyone.

The emerging best practice is to model the autonomy logic as a behavioral specification that is explicitly allocated from system-level requirements, then verified against those requirements through both simulation and physical test. This requires a requirements model that can represent behavioral requirements — not just shall-statements — and link them to both hardware and software implementations.
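One way to picture a behavioral requirement that is more than a shall-statement is a transition table in which every state and event pair names the requirement that governs it. The sketch below is an assumption-laden illustration: the states, events, requirement IDs, and the `unspecified_pairs` and `simulate` helpers are all invented here, and a real autonomy stack would be far richer. It shows two checks the paragraph above implies: whether the specification is complete, and which requirements a simulated scenario actually exercises.

```python
from itertools import product

# Hypothetical vehicle modes and events, chosen only for illustration.
STATES = {"TRANSIT", "SURVEY", "HOLD", "ABORT_ASCENT"}
EVENTS = {"NAV_LOCK_LOST", "NAV_LOCK_REGAINED", "LEAK_DETECTED", "WAYPOINT_REACHED"}
TERMINAL = {"ABORT_ASCENT"}  # no further transitions are specified once aborting

# (state, event) -> (next_state, governing requirement ID)
TRANSITIONS = {
    ("TRANSIT", "WAYPOINT_REACHED"): ("SURVEY", "SYS-010"),
    ("TRANSIT", "NAV_LOCK_LOST"): ("HOLD", "SYS-042"),
    ("TRANSIT", "LEAK_DETECTED"): ("ABORT_ASCENT", "SYS-043"),
    ("TRANSIT", "NAV_LOCK_REGAINED"): ("TRANSIT", "SYS-045"),
    ("SURVEY", "WAYPOINT_REACHED"): ("TRANSIT", "SYS-011"),
    ("SURVEY", "NAV_LOCK_LOST"): ("HOLD", "SYS-042"),
    ("SURVEY", "LEAK_DETECTED"): ("ABORT_ASCENT", "SYS-043"),
    ("SURVEY", "NAV_LOCK_REGAINED"): ("SURVEY", "SYS-045"),
    ("HOLD", "NAV_LOCK_REGAINED"): ("TRANSIT", "SYS-046"),
    ("HOLD", "LEAK_DETECTED"): ("ABORT_ASCENT", "SYS-043"),
    ("HOLD", "NAV_LOCK_LOST"): ("HOLD", "SYS-042"),
    # ("HOLD", "WAYPOINT_REACHED") is deliberately left unspecified.
}


def unspecified_pairs():
    """List every non-terminal (state, event) pair with no specified response."""
    return sorted(pair for pair in product(STATES - TERMINAL, EVENTS)
                  if pair not in TRANSITIONS)


def simulate(start, events):
    """Replay events from a start state; return the final state and the
    requirement IDs exercised. Raises KeyError on an unspecified pair."""
    state, exercised = start, []
    for event in events:
        if state in TERMINAL:
            break  # remaining events are ignored after an abort
        state, req_id = TRANSITIONS[(state, event)]
        exercised.append(req_id)
    return state, exercised


gaps = unspecified_pairs()
final_state, exercised = simulate(
    "TRANSIT",
    ["WAYPOINT_REACHED", "NAV_LOCK_LOST", "LEAK_DETECTED", "NAV_LOCK_REGAINED"],
)
print("Unspecified:", gaps)
print("Final state:", final_state, "| requirements exercised:", exercised)
```

Because each transition carries a requirement ID, the same table can serve as a completeness check on the specification and as a record of which requirements a given simulation or sea trial touched.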

They invest in interface requirements before integration. The integration phase of a multi-vendor maritime system is where projects go to be humbled. When a sonar payload from one vendor produces data in a format that the autonomy stack from another vendor wasn’t designed to ingest, the failure mode is expensive. Interface Requirements Documents (IRDs) that are traced to both sides of every interface — and kept current as both sides evolve — are the infrastructure that prevents this. Few programs invest in them adequately.
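A two-sided interface requirement can be made checkable by recording what the producer commits to send and what the consumer expects to receive, then comparing them. The sketch below assumes a very simple message description (field names mapped to units, plus a publish rate). The `InterfaceSide` class, the `check_interface` function, the vendor names, and all field values are hypothetical and stand in for whatever a real IRD would specify.

```python
from dataclasses import dataclass


@dataclass
class InterfaceSide:
    """What one side of an interface provides or expects (illustrative only)."""
    owner: str       # vendor or subsystem name
    message: str     # message or topic name
    fields: dict     # field name -> unit or encoding
    rate_hz: float   # publish rate (producer) or minimum needed rate (consumer)


def check_interface(producer, consumer):
    """Return a list of human-readable mismatches between the two sides."""
    issues = []
    for name, unit in consumer.fields.items():
        if name not in producer.fields:
            issues.append(
                f"{consumer.owner} expects '{name}' but {producer.owner} does not provide it")
        elif producer.fields[name] != unit:
            issues.append(
                f"'{name}': {producer.owner} sends {producer.fields[name]}, "
                f"{consumer.owner} expects {unit}")
    if producer.rate_hz < consumer.rate_hz:
        issues.append(
            f"rate: {producer.owner} publishes at {producer.rate_hz} Hz, "
            f"{consumer.owner} needs {consumer.rate_hz} Hz")
    return issues


# Hypothetical sonar payload and autonomy stack from two different vendors.
sonar = InterfaceSide(
    owner="SonarVendor", message="contact_report",
    fields={"range": "m", "bearing": "deg", "timestamp": "unix_ms"}, rate_hz=5.0)
autonomy = InterfaceSide(
    owner="AutonomyVendor", message="contact_report",
    fields={"range": "m", "bearing": "rad", "confidence": "0..1"}, rate_hz=10.0)

issues = check_interface(sonar, autonomy)
for issue in issues:
    print(issue)
```

The value of such a check comes from rerunning it whenever either side changes, which is the "kept current" part that spreadsheets tend to lose.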

The Tooling Question

The requirements management tooling question is live and contested in maritime autonomy. The sector is small enough that there’s no entrenched incumbent, and sophisticated enough that teams know what bad tooling costs them.

Legacy tools — IBM DOORS and DOORS Next — dominate in large defense programs where contractual data rights requirements specify the format. DOORS is genuinely capable for large-scale requirements databases, and its integration with IBM’s broader engineering environment is real. The friction is operational: DOORS workflows were designed for a world where requirements engineering is a dedicated role performed by specialists. Maritime autonomy teams are typically cross-functional engineers who need to maintain requirements as a living artifact, not manage a configuration-controlled database as a parallel activity. The learning curve and the administrative overhead both work against the culture.

Jama Connect and Polarion address some of that friction with more modern interfaces and stronger collaboration features. Both handle large-scale requirement sets well and have reasonable traceability support. The structural limitation both share is that they model requirements as documents — hierarchical text with links. For maritime systems where a single requirement may be allocated to multiple subsystems, interact with multiple interface requirements, and be verified by multiple test events with dependencies between them, a document model misrepresents the actual structure of the engineering problem.

This is where graph-based approaches show their advantage. When requirements, design decisions, interface specifications, test procedures, and verification results are all nodes in a connected model — rather than rows in parallel tables that humans link manually — the system can answer questions that document-based tools can’t. Which subsystem requirements would be affected if we changed the depth rating from 1,000 meters to 1,500 meters? What is the current verification coverage of all requirements that trace to the dive abort hazard? What’s unverified with three months left before delivery?
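The questions above reduce to graph traversals once requirements, interfaces, and tests are nodes with explicit links. The following sketch uses plain dictionaries and a breadth-first search. It is not a description of how any particular commercial tool works, and the node IDs, requirement texts, edges, and helper names are all assumptions made for illustration.

```python
from collections import deque

# node_id -> attributes (all IDs and texts are hypothetical)
NODES = {
    "HAZ-DIVE-ABORT": {"kind": "hazard"},
    "SYS-001": {"kind": "requirement", "text": "Rated operating depth 1,000 m"},
    "SYS-020": {"kind": "requirement", "text": "Abort dive on hull leak"},
    "SUB-HULL-01": {"kind": "requirement", "text": "Pressure hull collapse margin"},
    "SUB-BAL-04": {"kind": "requirement", "text": "Emergency ballast release"},
    "IF-PWR-02": {"kind": "interface", "text": "Ballast actuator power feed"},
    "TP-201": {"kind": "test", "result": "pass"},
    "TP-202": {"kind": "test", "result": None},   # not yet executed
    "TP-203": {"kind": "test", "result": "pass"},
}

# Directed edges, upstream -> downstream (derives, allocates, or is verified by).
EDGES = [
    ("HAZ-DIVE-ABORT", "SYS-020"),
    ("SYS-001", "SUB-HULL-01"),
    ("SYS-001", "SUB-BAL-04"),
    ("SYS-020", "SUB-BAL-04"),
    ("SUB-BAL-04", "IF-PWR-02"),
    ("SUB-HULL-01", "TP-201"),
    ("SUB-BAL-04", "TP-202"),
    ("SYS-020", "TP-203"),
]

CHILDREN = {}
for src, dst in EDGES:
    CHILDREN.setdefault(src, []).append(dst)


def downstream(start):
    """Impact analysis: every node reachable from start via breadth-first search."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in CHILDREN.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def is_verified(req_id):
    """A requirement counts as verified only if it has at least one directly
    linked test and every directly linked test has passed."""
    tests = [n for n in CHILDREN.get(req_id, []) if NODES[n]["kind"] == "test"]
    return bool(tests) and all(NODES[t]["result"] == "pass" for t in tests)


def unverified_for_hazard(hazard_id):
    """Requirements tracing to a hazard that are not yet verified."""
    reqs = [n for n in downstream(hazard_id) if NODES[n]["kind"] == "requirement"]
    return sorted(r for r in reqs if not is_verified(r))


print("Affected by a depth-rating change:", sorted(downstream("SYS-001")))
print("Unverified for dive abort hazard:", unverified_for_hazard("HAZ-DIVE-ABORT"))
```

With this sample data, changing the depth rating touches both subsystem requirements, the ballast power interface, and two tests, while the dive abort hazard still has one unverified requirement. The verification rule used here (all directly linked tests must pass) is a simplifying assumption. A real program would define its own closure criteria.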

Flow Engineering (flowengineering.com) has built its platform specifically around this connected-model approach, targeting hardware and systems engineering teams rather than document management teams. For maritime autonomy programs, the relevant capability is the ability to represent behavioral requirements, hardware requirements, interface requirements, and verification events in a single connected model — and to run coverage and gap analyses across that model as the program evolves. The platform’s AI-assisted requirements development also accelerates the early phase that most teams under-invest in: deriving system requirements from operational concepts and hazard analyses before design begins.

The deliberate trade-off is scope. Flow Engineering is focused on systems engineering and requirements traceability, not end-to-end ALM or MBSE. Programs that need tightly integrated change management, software lifecycle tracking, or full MBSE modeling with SysML would be looking at a broader toolchain — potentially Innoslate or Codebeamer for the full lifecycle, or Cameo/Capella for the MBSE layer alongside a dedicated requirements tool.

What’s Actually Happening vs. The Hype

The maritime autonomy sector generates a steady stream of announcements about AI-powered autonomy breakthroughs, fleet-scale deployments, and a revolution in naval warfare. Some of this is real. Much of it is marketing. The honest signal to watch is program behavior at the certification and acceptance phase.

Programs that have genuinely mature systems engineering are quiet about it. They don’t announce their requirements management practices. They just deliver on schedule and pass acceptance. Programs that haven’t built the discipline show up in news about schedule slips, delivery disputes, or “extended developmental testing” — language that translates to “we found things during testing that we should have found during requirements analysis.”

The XLUUV space — Boeing’s Orca and the emerging competitors — is the highest-visibility test case. These are billion-dollar programs with operational fleet commitments. The engineering challenge of a 51-foot autonomous submarine that must be maintained, operated, and fielded at scale without constant engineering support is a serious systems engineering problem. Whether the programs deliver on schedule will be a clear indicator of whether the sector has actually built the discipline to match its ambitions.

Where This Is Headed

The next two to three years will likely sort maritime autonomy programs into two groups: those that built engineering infrastructure during the development phase and those that tried to build it retroactively during delivery.

The teams that will succeed are the ones treating requirements management as a first-class engineering activity — not a documentation burden, not a compliance tax, but the structural foundation that makes it possible to reason about a complex system operating alone in a harsh environment with no one watching. That discipline is harder to build than the hardware. It’s less glamorous than the autonomy stack. It is the actual engineering revolution the sector needs.

The ocean doesn’t care how confident your team is. It will find every requirement you forgot to write.
