AI Is Now a Hardware Component. Systems Engineering Hasn’t Caught Up.

For most of the last decade, the AI-in-hardware conversation centered on edge inference: running a trained model on a microcontroller, compressing it to fit SRAM, hitting a power budget. That’s still real work. But the more consequential shift is happening at the systems level, and it hasn’t received proportionate attention.

AI components are no longer peripheral features bolted onto products that would otherwise function conventionally. They are load-bearing. In autonomous vehicles, advanced driver-assist systems make real-time actuation decisions. In medical imaging devices, neural networks generate the outputs on which clinicians act. In defense systems, AI-derived situational awareness feeds weapons release decisions. In industrial automation, vision models replace the mechanical go/no-go gauges that once provided deterministic pass/fail sorting.

When AI is load-bearing, it becomes a systems engineering problem. And systems engineering, as currently practiced at most organizations, is not ready for it.

What’s Actually Happening vs. the Hype

The industry narrative around “AI-powered hardware” tends toward two poles. Vendors promise seamless integration and transformative capability. Skeptics warn of black-box opacity and unverifiable safety. Both framings are operationally useless to a systems engineer sitting in a requirements review trying to figure out what to write in the spec for an onboard perception model.

What is actually happening, across programs in automotive, aerospace, industrial, and medical:

AI components are being integrated before their requirements are written. ML teams build models against datasets and benchmark metrics. Systems engineers receive a model—or a decision to use a model—after it exists, then attempt to reverse-engineer requirements from its known behavior. This is requirements development happening in the wrong direction.

Verification methods are lagging behind the architecture decisions that require them. Traditional hardware verification is deterministic: given input X, output Y, compare to spec. AI components produce probability distributions over outputs. Organizations are running deterministic verification processes against components that cannot be deterministically specified, then wondering why test coverage feels inadequate.

The SE-ML team interface is ungoverned at most companies. Systems engineers and ML engineers often report to different organizations, use different tools, speak different technical languages, and have different definitions of what constitutes a deliverable. The handoff points between them—where system-level requirements become model requirements, where model behavior is validated against system needs—are where programs accumulate the most undetected risk.

Regulatory bodies are beginning to require what programs haven’t been doing. The FDA’s AI/ML-based Software as a Medical Device guidance, the EU AI Act’s conformance obligations, and emerging aviation standards under EASA’s AI Roadmap all point in the same direction: documented requirements for AI components, traceable connections from system need to training artifact, and processes for managing model change over time. The compliance infrastructure that regulated programs built for deterministic components doesn’t transfer.

None of this is hype. These are operational problems on active programs.

Writing Requirements for AI Components: What Actually Changes

The core challenge is that requirements for AI components must accommodate statistical behavior without abandoning the precision that makes requirements useful.

A conventional sensor requirement might read: The sensor shall detect objects at ranges between 0.5m and 150m with a maximum range error of ±0.3m under all specified operating conditions. That’s deterministic and verifiable.

An equivalent requirement for a perception model cannot be written the same way. The model’s outputs vary with input distribution. Performance degrades gracefully (or sometimes not) as conditions drift from the training distribution. Failure modes are emergent rather than enumerated.

This doesn’t mean requirements can’t be written—it means they must be written differently. Emerging practice is converging on several patterns:

Performance envelope requirements. Rather than a single pass/fail threshold, specify acceptable performance over a defined input distribution, with conditions on training data coverage and distribution shift monitoring. Example: The object detection model shall achieve precision ≥ 0.94 and recall ≥ 0.91 on the operational design domain defined in [reference artifact], measured over a stratified validation set of not fewer than 50,000 labeled frames.
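A performance envelope requirement like the one above can be made machine-checkable. The sketch below is illustrative, not any particular tool's schema: the thresholds and minimum validation-set size come from the example requirement in the text, while the field names, function signature, and count values are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnvelopeRequirement:
    """Performance envelope for a perception model (illustrative fields).

    Thresholds mirror the example requirement: precision >= 0.94,
    recall >= 0.91, measured over >= 50,000 labeled frames."""
    min_precision: float
    min_recall: float
    min_validation_frames: int

def verify_envelope(req, true_pos, false_pos, false_neg, n_frames):
    """Check one stratified validation run against the envelope.

    Returns (passed, findings); findings lists every violated condition,
    so a failed run explains itself in the verification record."""
    findings = []
    if n_frames < req.min_validation_frames:
        findings.append(
            f"validation set too small: {n_frames} < {req.min_validation_frames}")
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    if precision < req.min_precision:
        findings.append(f"precision {precision:.3f} below {req.min_precision}")
    if recall < req.min_recall:
        findings.append(f"recall {recall:.3f} below {req.min_recall}")
    return (not findings), findings

req = EnvelopeRequirement(min_precision=0.94, min_recall=0.91,
                          min_validation_frames=50_000)
passed, findings = verify_envelope(req, true_pos=47_500, false_pos=2_100,
                                   false_neg=3_900, n_frames=53_500)
```

The point of the structure is that the requirement, the measurement, and the verdict live in one auditable artifact rather than in a spreadsheet cell.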

Behavioral boundary requirements. Define what the AI component must never do, even probabilistically. Rare but catastrophic failure modes—false negatives for object detection near a safety-critical zone, for instance—may warrant hard requirements rather than statistical ones. The system shall implement a hardware-enforced override that activates when [condition], regardless of model output. These are system-level requirements, but they exist because of model-level uncertainty.
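The override logic reduces to a small arbitration rule, which is part of why it can be a hard requirement rather than a statistical one. A minimal sketch, with hypothetical names (the interlock condition and signal names are assumptions, not from any specific system):

```python
def actuation_permitted(model_says_clear: bool,
                        zone_interlock_tripped: bool) -> bool:
    """Arbitrate between model output and a hard behavioral boundary.

    When the safety-zone interlock trips, actuation is denied
    unconditionally -- the model's output is never consulted. Only
    inside the boundary does the statistical component have authority."""
    if zone_interlock_tripped:   # hard requirement: overrides the model
        return False
    return model_says_clear      # statistical path: model has authority
```

In a real system this rule would be enforced in hardware or a separately verified supervisor, precisely so its correctness does not depend on the model's behavior.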

Operational design domain (ODD) requirements. The model’s scope of authority must be bounded in the requirements documentation, not just understood informally. Geographic constraints, sensor configuration requirements, environmental operating limits, and fallback activation conditions all belong in the spec.
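An ODD that lives in the spec can also live in the runtime as a checkable predicate, so fallback activation is driven by the documented bounds rather than informal understanding. A sketch under assumed fields (the specific limits and sensor-configuration identifier are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ODD:
    """Operational design domain as a checkable spec (illustrative fields)."""
    max_speed_mps: float        # environmental/kinematic operating limit
    min_ambient_lux: float      # lighting condition bound
    approved_sensor_config: str # required sensor configuration

def within_odd(odd: ODD, speed_mps: float,
               ambient_lux: float, sensor_config: str) -> bool:
    """True only when every ODD condition holds.

    A False result means the model is outside its scope of authority
    and the system should activate its fallback, not trust the model."""
    return (speed_mps <= odd.max_speed_mps
            and ambient_lux >= odd.min_ambient_lux
            and sensor_config == odd.approved_sensor_config)
```

Because the same `ODD` object can be referenced from the requirements baseline and the runtime check, drift between the documented domain and the enforced one becomes a detectable defect rather than a latent one.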

Data dependency traceability. This is perhaps the most novel requirement type: requirements that govern the training artifacts, not just the runtime behavior. Regulated programs increasingly need to document which data was used to train the model that was used in the product that was certified. That chain must be traceable. It cannot be reconstructed after the fact from memory.
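The traceability chain is mechanically simple to capture at training time: content-hash the datasets and the model artifact, and record the linkage alongside the requirements the model satisfies. The sketch below is a minimal illustration of the idea, not a certification format; the record fields and requirement IDs are invented for the example.

```python
import hashlib

def artifact_digest(payload: bytes) -> str:
    """Content hash identifying an artifact independent of its filename."""
    return hashlib.sha256(payload).hexdigest()

def trace_record(dataset_blobs, model_blob, requirement_ids):
    """Link a trained model to the exact datasets it was trained on
    and the requirements it is claimed to satisfy.

    Captured at training time and stored with the certification
    baseline -- this is the chain that cannot be reconstructed
    from memory after the fact."""
    return {
        "dataset_digests": sorted(artifact_digest(b) for b in dataset_blobs),
        "model_digest": artifact_digest(model_blob),
        "requirements": sorted(requirement_ids),
    }

record = trace_record(
    dataset_blobs=[b"frames_batch_v1", b"frames_batch_v2"],
    model_blob=b"perception_model_v7",
    requirement_ids=["SYS-041", "PERF-012"],
)
```

Real pipelines would hash files or dataset manifests rather than in-memory bytes, but the structure of the record is the same.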

The SE-ML Team Interface Problem

Most programs that are struggling with AI component integration aren’t struggling because the engineering is impossible. They’re struggling because two engineering disciplines with different practices are working toward the same artifact without a shared process for doing so.

Systems engineers work in requirements documents, block diagrams, interface control documents, and verification plans. Their tools are structured around managed baselines, change control, and audit trails. ML engineers work in notebooks, experiment tracking systems, model registries, and pipeline scripts. Their artifacts are loosely versioned, their process is iterative and exploratory by design.

Neither practice is wrong for its native domain. The problem is the interface.

When a systems engineer needs to specify model performance requirements, they need to understand the ML team’s training setup well enough to know what’s achievable and what the verification method will be. When an ML engineer updates a model, the systems engineer needs to know which requirements are potentially affected and whether re-verification is required. When a failure is observed in the field, both teams need to trace from the observed behavior back through the system architecture to identify the responsible component and its governing requirements.
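The change-notification workflow in particular is a graph problem: given an updated model, find every requirement and verification item that depends on it. A toy sketch of that traversal, with invented artifact IDs and a hand-built edge list standing in for whatever dependency store a program actually uses:

```python
from collections import deque

# artifact -> artifacts that depend on it (illustrative IDs).
# A model update propagates downstream through these edges.
DEPENDENTS = {
    "model:perception_v7": ["req:SYS-041", "req:PERF-012"],
    "req:PERF-012": ["verif:VT-103"],
    "req:SYS-041": ["verif:VT-088", "req:SAF-007"],
}

def impacted(artifact: str) -> set[str]:
    """Breadth-first walk of the dependency graph.

    Returns everything that may need review or re-verification
    after `artifact` changes -- the answer to 'which requirements
    are potentially affected by this model update?'"""
    seen, queue = set(), deque([artifact])
    while queue:
        for nxt in DEPENDENTS.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

The inverse walk, from an observed field failure back upstream to the responsible component and its governing requirements, is the same traversal over reversed edges.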

None of these workflows are supported by conventional requirements management tools, because those tools were designed for a world where components are deterministic and teams are homogeneous.

And they’re not well-supported by MLOps tooling either, which was designed to manage the ML development lifecycle, not the product requirements that govern it.

The gap is real, and it’s currently filled by a combination of manual processes, spreadsheets, and informal communication—all of which fail under program scale and regulatory scrutiny.

What’s Emerging

The tooling landscape is beginning to respond, though unevenly.

Legacy requirements management platforms—IBM DOORS, Polarion, Codebeamer—are adding AI-adjacent features: natural language processing for requirements analysis, integration hooks to external systems. These additions are genuine and useful for their intended purpose, which is managing large-scale requirements databases in organizations already invested in those platforms. What they don’t do is rethink the underlying model for how requirements relate to AI artifacts, because their architecture wasn’t built for it.

On the opposite end, some teams are attempting to manage AI component requirements entirely within MLOps platforms like MLflow or Weights & Biases, using custom metadata fields to capture requirements-adjacent information. This gives ML teams continuity with their existing workflow but produces requirements artifacts that don’t integrate with system-level engineering, can’t be formally baselined, and don’t support the traceability chains that compliance requires.

The more interesting development is the emergence of tools purpose-built for the AI-hardware systems engineering problem. Flow Engineering (flowengineering.com) is the most developed example of this category. It’s built around a graph-based requirements model—requirements as interconnected nodes with typed relationships—rather than the document metaphor that legacy tools use. This matters for AI-hardware programs because the relationships between system requirements, subsystem requirements, model performance specs, ODD definitions, training data constraints, and verification evidence are genuinely relational. Forcing them into a document hierarchy loses information.

Flow Engineering’s architecture also reflects an AI-native assumption: that requirements for AI components will need to be drafted, refined, and analyzed with AI assistance, and that the tool should support that workflow natively rather than as an add-on. For SE teams trying to write performance envelope requirements for models they didn’t build, having AI-assisted drafting grounded in the system context is operationally meaningful.

Flow Engineering's scope reflects deliberate focus rather than limitation: it's built for the AI-hardware systems engineering domain, not as a general-purpose ALM platform. Organizations running large programs that require the configuration management depth of DOORS or the integrated test management of Polarion will need to evaluate integration patterns rather than wholesale replacement. But for teams standing up new programs with AI components at their core, the architectural fit is direct.

Honest Assessment

The integration of AI into hardware products is not a trend that systems engineering can adapt to at the margins. It requires changes to how requirements are written, how verification is structured, how SE and ML teams interface, and what tools are used to manage the artifacts.

Most organizations are behind. The gap between what regulated programs need—traceable, auditable requirements for probabilistic components—and what their current processes produce is substantial and closing slowly.

What’s not behind is the engineering talent. Systems engineers are capable of writing meaningful requirements for AI components when they have patterns to follow and tools that match the problem. ML engineers are capable of producing requirements-traceable artifacts when the interface with the SE team is designed rather than accidental.

The programs getting this right are the ones treating the SE-ML interface as an engineering problem: defining the handoff artifacts, specifying the change notification process, and choosing tools that can represent the full requirement-to-model-to-verification chain without forcing it into an ill-fitting document structure.

That’s not a complicated prescription. It’s mostly a matter of treating AI components with the same systems engineering rigor applied to every other load-bearing component in the product. The methods need updating. The discipline doesn’t.