DVN interview series on “Key players in ADAS/AV and Dual Use”: Interview with Nuro’s VP & Head of Autonomy, Tilo Schwarz, about his plan and vision of “Autonomy for all.”

Interview by Dr. Juergen Dickmann, Senior Advisor DVN

Note: At DVN, we aim to keep the ADAS/AV community informed about sensing, architecture and applications. On 17–18 November, our DVN conference will take place in Stuttgart and, for the first time, host special sessions on Dual Use and on “The Road to Type Approval: Mastering End-to-End AI Systems”.

In the run-up to the DVN conference, we are presenting important players and sharing their results and perspectives in this field. The first in this series was our report on a preview drive with Wayve’s latest vehicle. In this episode, we would like to introduce one of the entrepreneurs in E2E driving stacks, Nuro.

DVN-Dickmann: Hi Tilo, thank you for taking the time. First question: What does Nuro offer, to whom, and what business model are you aiming for?

T. Schwarz (Nuro): At a high level, our strategy is to build an autonomous “driver” that abstracts away the vehicle platform. Our earlier deployments with delivery robots showed us that if you can solve driving for a small vehicle in complex urban environments, the core driving intelligence can be generalized. The goal is similar to a human driver: you carry the driving skill with you and can adapt it quickly to a different car. Technically, that means separating the generic driving stack from the vehicle-specific interface layer.

Strategically, we think in three verticals.

First is mobility, specifically robotaxi. Our collaboration with Uber and Lucid illustrates this: Lucid provides the vehicle, Uber provides the demand platform and operations, and we provide the autonomous system (sensor set, compute, software) integrated into the base vehicle.

Second is delivery and logistics. The same core system can be adapted to drive delivery trucks, postal vans, or other goods vehicles. The operating context is very similar: dense cities, interactions with other road users, and the need to stop, park, and maneuver close to curbs and addresses. The key lesson was that deep vertical integration can lock your software to a specific platform and make cross-application reuse hard, so we focus on portability.

Third is what I would call “advanced driver assistance / autonomy for owned vehicles”, spanning sophisticated hands-free Level 2 use cases up to Level 3 and Level 4, depending on the operational design domain (ODD). Again, the backbone should stay the same; the adaptation layer should be small.

In practice, this approach is working. In the last year we brought up two new vehicle platforms from scratch: take an OEM vehicle, integrate our system, and reach autonomous driving in a relatively short time. The interface to the vehicle can be kept compact: steering, braking, acceleration, plus a set of status and safety signals. Driving has limited degrees of freedom, so you can generalize the intelligence and adapt it through a well-defined interface.
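Illustration (not part of the interview): the split between a generic driving stack and a small vehicle-specific layer can be sketched as follows. This is a minimal Python sketch under stated assumptions. All class names, fields, and limits (`DriveCommand`, `VehicleAdapter`, `ExampleSedanAdapter`, the steering and acceleration limits) are hypothetical and do not describe Nuro's actual interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveCommand:
    """Platform-independent command produced by the generic driving stack."""
    steering: float  # normalized: -1.0 (full left) to 1.0 (full right)
    accel: float     # normalized: 0.0 to 1.0
    brake: float     # normalized: 0.0 to 1.0


@dataclass(frozen=True)
class VehicleStatus:
    """Status and safety signals reported back by the vehicle (hypothetical set)."""
    speed_mps: float
    brake_system_ok: bool
    steering_system_ok: bool


class VehicleAdapter(ABC):
    """The small vehicle-specific layer: one subclass per vehicle platform."""

    @abstractmethod
    def apply(self, cmd: DriveCommand) -> dict:
        """Translate a normalized command into platform-specific units."""

    @abstractmethod
    def status(self) -> VehicleStatus:
        """Return the current status and safety signals."""


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


class ExampleSedanAdapter(VehicleAdapter):
    """Hypothetical platform with made-up physical limits."""
    MAX_STEER_DEG = 35.0
    MAX_ACCEL_MPS2 = 3.0
    MAX_DECEL_MPS2 = 8.0

    def apply(self, cmd: DriveCommand) -> dict:
        # Out-of-range inputs are clamped before scaling to physical units.
        return {
            "steer_deg": clamp(cmd.steering, -1.0, 1.0) * self.MAX_STEER_DEG,
            "accel_mps2": clamp(cmd.accel, 0.0, 1.0) * self.MAX_ACCEL_MPS2,
            "decel_mps2": clamp(cmd.brake, 0.0, 1.0) * self.MAX_DECEL_MPS2,
        }

    def status(self) -> VehicleStatus:
        # A real adapter would read these from the vehicle bus; fixed values here.
        return VehicleStatus(speed_mps=0.0, brake_system_ok=True, steering_system_ok=True)
```

In this sketch, bringing up a new platform would mean writing another `VehicleAdapter` subclass, while the code that produces `DriveCommand` objects stays unchanged.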

DVN-Dickmann: So do you sell “just” the software stack, or an ecosystem like Nvidia with tooling and the full development chain?

T. Schwarz (Nuro): It depends on the customer. Some want an end-to-end, turnkey solution that they simply buy; others want deeper co-development. In general, we provide (1) the autonomous driving stack itself and (2) the tools and AI infrastructure that enable development, training, and validation. We can flex our engagement model depending on how much the customer wants to shape the product and integrate it into their organization.

DVN-Dickmann: Who are the typical customers for this, and how do you think about commercialization?

T. Schwarz (Nuro): We see interest from mobility operators, OEMs, and logistics players. Commercialization is not only an algorithm story: you need robust integration, maintainability, and an operating concept for issues in the field. That is why partnerships matter—vehicle makers and platforms often bring production and fleet operations strengths, while we focus on the autonomous system plus the data, tooling, and safety processes behind it.

DVN-Dickmann: Is your system hardware- and SoC-agnostic, and does that also apply to the sensor setup?

T. Schwarz (Nuro): Our intent is to be flexible on both compute and sensors, because different customers have different preferences and existing ecosystems. For example, for the Uber deployment we rely heavily on Nvidia’s platform. In that setup, the lower layers—device drivers and platform components below the OS—are on the Nvidia side, while everything above is in-house at Nuro.

Longer term, we try not to couple ourselves too deeply to any single SoC. Being able to run on more than one compute platform reduces risk and expands the customer base.

On sensors, the same principle applies, within the constraints of the ODD. Historically, Nuro built sensors in-house, including radar and lidar, because early on the market didn’t offer what we needed at automotive quality. With the new strategy and the desire to integrate across ecosystems, we switched to off-the-shelf, OEM-grade sensors from major Tier 1 suppliers. That gives customers flexibility in sourcing and placement, as long as the sensors meet the requirements of the target ODD.

It is not arbitrary: if you want higher speeds on highways, you need sufficient long-range performance; if your use case involves close-range interaction with pedestrians and cyclists, you need strong near-field coverage and often multiple modalities. But given a set of specifications, we can be flexible.

A related design criterion is to avoid overfitting perception to a single sensor’s quirks. Over our company’s life we have used different sensor suites, and we train our perception models across those variations. The goal is that the AI learns the scene, not the sensor artifacts (color mapping, shutter effects, and similar).

DVN-Dickmann: But the sensor set has a strong influence on the ODD, and therefore on how well your stack works. Is that correct?

T. Schwarz (Nuro): Yes. Sensor capability and ODD are coupled. If you choose an ODD with high speed, long-range sensing becomes a must. If you choose an ODD with complex, close interactions, near-field and occlusion handling become critical. Our view is: define the ODD and its safety goals first, then derive the sensor requirements. Within those requirements, we can remain sensor-flexible.
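Illustration (not part of the interview): the coupling between ODD speed and sensing range can be made concrete with a textbook stopping-distance estimate, d = v·t + v²/(2a) + margin. The Python sketch below is a simplification under stated assumptions. The latency, deceleration, and margin values are illustrative placeholders, not figures from Nuro, and a real requirement would come out of a full safety analysis.

```python
def required_detection_range_m(speed_kph, latency_s=0.5, decel_mps2=4.0, margin_m=10.0):
    """Rough forward sensing range needed to stop for a stationary obstacle.

    speed_kph   maximum ODD speed
    latency_s   assumed detection-to-brake latency (illustrative)
    decel_mps2  assumed comfortable braking deceleration (illustrative)
    margin_m    assumed safety margin (illustrative)
    """
    if decel_mps2 <= 0:
        raise ValueError("decel_mps2 must be positive")
    v = speed_kph / 3.6                       # convert km/h to m/s
    reaction_distance = v * latency_s         # distance covered before braking starts
    braking_distance = v ** 2 / (2.0 * decel_mps2)
    return reaction_distance + braking_distance + margin_m


if __name__ == "__main__":
    for kph in (30, 50, 100, 130):
        print(f"{kph:>3} km/h -> {required_detection_range_m(kph):.1f} m")
```

Because the braking term grows with the square of speed, moving from an urban ODD to a highway ODD raises the range requirement far more than proportionally, which is the point made above about long-range sensing.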

DVN-Dickmann: Since Nvidia’s announcements at CES 2026, more players and big OEM partnerships are entering the space. Forecasts still show small unit volumes for a long time. Is this becoming a race of elimination in which only one to three global groups survive?

T. Schwarz (Nuro): Beyond 2030 the picture gets blurry. Up to 2030, I expect coexistence rather than immediate consolidation, largely because companies target different needs and operating models. The “point A to point B in a city” robotaxi use case is clear and will have multiple viable providers. Other use cases—longer distance travel, personal ownership preferences, and broader geographic coverage—can favor OEM-led solutions.

Also, even in markets where products are broadly similar, consolidation is not guaranteed. In the US alone there are many vehicle manufacturers selling functionally comparable cars, and they still make money. So I don’t think we should assume a near-term collapse to a tiny number of winners. What I do expect is technological convergence in some foundational capabilities: better perception, planning, validation tooling, and more standardized architectures.

DVN-Dickmann: Let’s talk about validation of AI driving stacks. Many say the bottleneck is not the technology but the safety case, regulation, and type approval. Some propose validating one “black box” with another black box, using generative AI to create long-tail scenarios and virtual kilometers. Does that generate enough confidence?

T. Schwarz (Nuro): Safety is the top priority; you can’t compromise there. Simulation and generative techniques are useful, but they cannot be the whole argument by themselves. We think about validation as a portfolio of methods, not a single magic lever.

First, on sensors and safety: even if someone can build a vision-only system that is better than an average human in many ways, that doesn’t automatically mean it is the best final architecture for a Level 4 system with no driver. Additional modalities can add safety potential, especially under adverse weather or edge cases. Our default approach is multi-modality—camera, radar, lidar—unless we can prove a modality is not contributing to safety for a specific ODD.

Second, on validation structure: we define the ODD precisely and decompose it into maneuver/context classes. Even “urban driving” contains many distinct situations (turning, merging, lane changes, yielding, emergency interactions). In on-road testing, the system should not encounter “new categories” of situations—only new instances—because the taxonomy should be complete.

Third, we derive requirements from both behavior expectations and compliance requirements (for example, state DMV guidance in the US, local traffic rules in different regions, and analogous requirements in other countries). Those requirements need evidence, and the evidence should come from multiple test modes.

Practically, we use a matrix of test modes. The environment can be real road, closed test track, or simulation. The system under test can be the real production-intent stack on a real vehicle, a virtual vehicle in simulation, or simplified setups for specific subsystems. Different combinations are better for different questions. Rare and high-severity events (such as near-crashes or accident-like sequences) are difficult and unsafe to collect at scale through normal driving. For those, you need curated datasets, scenario reconstruction, and simulation. Other behaviors (like routine lane changes) are abundant and can be validated with large-scale recorded miles and regression testing.
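Illustration (not part of the interview): the matrix of test modes can be written down as the cross product of environments and systems under test. The Python sketch below is a toy under stated assumptions. The labels and the selection rule in `preferred_environments` are hypothetical simplifications of the reasoning above (rare, high-severity events go to simulation; abundant behaviors go to recorded road miles) and are not Nuro's actual process.

```python
from itertools import product

ENVIRONMENTS = ("real_road", "closed_track", "simulation")
SYSTEMS_UNDER_TEST = ("production_stack_on_vehicle", "virtual_vehicle", "subsystem_setup")

# Every (environment, system under test) pair is one cell of the matrix.
# Not every cell is useful in practice; pruning is left out of this sketch.
TEST_MODES = list(product(ENVIRONMENTS, SYSTEMS_UNDER_TEST))


def preferred_environments(frequency, severity):
    """Toy rule: choose environments from how often and how severely an event occurs."""
    if frequency == "rare" and severity == "high":
        return ("simulation",)   # unsafe and too sparse to collect on the road
    if frequency == "common":
        return ("real_road",)    # abundant in recorded miles
    return ENVIRONMENTS          # otherwise keep all options open


def candidate_modes(frequency, severity):
    """Return the matrix cells whose environment suits the given event class."""
    envs = preferred_environments(frequency, severity)
    return [(env, sut) for env, sut in TEST_MODES if env in envs]


if __name__ == "__main__":
    print(candidate_modes("rare", "high"))
    print(candidate_modes("common", "low"))
```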

We also use fault injection. Some faults are easiest to test in simulation (sensor dropouts, timing jitter, extreme parameter sweeps). Others are valuable to test on track where you can inject a real sensor or vehicle fault in a controlled environment and observe the vehicle’s response.
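Illustration (not part of the interview): two of the simulation-side faults mentioned above, sensor dropouts and timing jitter, can be injected into a recorded stream of timestamped frames. This is a minimal Python sketch under stated assumptions. The function name `inject_faults`, the frame format, and the default probabilities are hypothetical.

```python
import random


def inject_faults(frames, dropout_prob=0.1, jitter_s=0.02, seed=0):
    """Return a copy of `frames` with random dropouts and timestamp jitter.

    frames        list of (timestamp_s, payload) tuples
    dropout_prob  probability that a frame is dropped entirely
    jitter_s      maximum absolute timestamp perturbation in seconds
    seed          fixed seed so a failing run can be reproduced exactly
    """
    rng = random.Random(seed)
    faulty = []
    for timestamp, payload in frames:
        if rng.random() < dropout_prob:
            continue  # simulate a sensor dropout: the frame never arrives
        faulty.append((timestamp + rng.uniform(-jitter_s, jitter_s), payload))
    return faulty


if __name__ == "__main__":
    clean = [(i * 0.1, f"frame-{i}") for i in range(10)]
    for ts, payload in inject_faults(clean, dropout_prob=0.3, jitter_s=0.01, seed=42):
        print(f"{ts:.3f}s {payload}")
```

Seeding the generator is a deliberate choice in this sketch: a fault sequence that exposes a problem can be replayed unchanged in regression tests.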

Finally, we do not rely only on end-to-end testing. We validate the full system behavior, but we also validate the modules: perception, localization, control, and safety monitors. For example, we test perception performance on specific objects under rain at certain distances, or measure control stability under defined conditions. This is one reason we prefer a hybrid architecture rather than a purely end-to-end black box. Certain interactions are easier to learn end-to-end—like interpreting a police officer waving—because hand-coding every possible gesture is not feasible. But we still output interpretable objects, classes, and tracks so we can measure performance and enforce safety constraints.

DVN-Dickmann: Combining sensor configuration and redundancy: With central processing and raw data fusion, could multiple radar and camera modalities provide the statistically independent evidence that used to require “two out of three” sensor types? Could that enable dropping lidar, especially considering the tremendous integration and line-of-sight costs?

T. Schwarz (Nuro): It is a valid question, and we look at both directions. Radar is improving toward imaging capability: higher channel counts and better angular and elevation resolution. Imaging radars can be particularly attractive for forward sensing. At the same time, lidar cost is coming down and solid-state lidar is becoming more practical, with different scan patterns. If your processing is robust, different scan patterns are manageable.

From our standpoint as a software-stack provider, the “race is open.” If lidar prices drop enough, adding more lidar units might be an easy way to increase redundancy and coverage. Alternatively, radar could evolve further—possibly even into higher frequencies—to approach lidar-like resolution, while keeping radar’s strengths such as Doppler. We are not at a final answer today, but our perception stack is designed to remain flexible so we can choose the best sensor combination for a given use case and cost-performance target.

DVN-Dickmann: What is Nuro’s stance on “vision-only” versus multi-sensor autonomy?

T. Schwarz (Nuro): I don’t want to give a definitive, universal statement because the right answer depends on the ODD and a detailed safety analysis. The practical way to approach it is: before removing an entire modality, evaluate which specific sensors you can remove, reposition, or replace while maintaining coverage and safety goals. We do ODD coverage simulation that shows which sensors see what, where, given vehicle geometry. We then aim for the minimum sensor set that can reasonably cover the scenarios.
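Illustration (not part of the interview): a coverage check of the kind described above can be approximated in 2D by asking which sensors see which points around the vehicle, then picking a small sensor set that still covers every required point. The Python sketch below makes strong assumptions. Sensors are idealized as flat field-of-view wedges with no occlusion, all names and mounting values are hypothetical, and the greedy selection is a standard set-cover heuristic that is not guaranteed to find the true minimum.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str
    x: float        # mounting position in the vehicle frame, metres
    y: float
    yaw_deg: float  # boresight direction, 0 = straight ahead
    fov_deg: float  # full horizontal field of view
    range_m: float

    def sees(self, px, py):
        """True if the point lies inside this sensor's range and field of view."""
        dx, dy = px - self.x, py - self.y
        if math.hypot(dx, dy) > self.range_m:
            return False
        bearing = math.degrees(math.atan2(dy, dx))
        diff = (bearing - self.yaw_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        return abs(diff) <= self.fov_deg / 2.0


def ring_points(radius_m, step_deg=30, offset_deg=15):
    """Sample required coverage points on a circle around the vehicle origin."""
    return [
        (radius_m * math.cos(math.radians(a)), radius_m * math.sin(math.radians(a)))
        for a in range(offset_deg, 360, step_deg)
    ]


def greedy_sensor_set(sensors, points):
    """Greedy set cover: repeatedly add the sensor that covers the most uncovered points."""
    uncovered = set(points)
    chosen = []
    while uncovered:
        best = max(sensors, key=lambda s: sum(s.sees(*p) for p in uncovered))
        gained = {p for p in uncovered if best.sees(*p)}
        if not gained:
            break  # remaining points are not visible to any sensor
        chosen.append(best.name)
        uncovered -= gained
    return chosen, uncovered


if __name__ == "__main__":
    suite = [
        Sensor("front", 0.0, 0.0, 0.0, 120.0, 50.0),
        Sensor("front_narrow", 0.0, 0.0, 0.0, 20.0, 50.0),
        Sensor("left", 0.0, 0.0, 90.0, 120.0, 50.0),
        Sensor("rear", 0.0, 0.0, 180.0, 120.0, 50.0),
        Sensor("right", 0.0, 0.0, -90.0, 120.0, 50.0),
    ]
    chosen, gaps = greedy_sensor_set(suite, ring_points(20.0))
    print("chosen:", chosen, "uncovered points:", len(gaps))
```

The second return value lists any required points left unseen, so the same function can also show what coverage is lost when a sensor is removed from the candidate list.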

Once you have a larger fleet and substantial real-world data, you learn which scenarios dominate, which are truly hard, and which sensors contribute most to safety-critical cases. With that evidence, you can make more grounded decisions about whether a modality is necessary or whether there is an alternative path to the same safety outcome.

DVN-Dickmann: Thank you, Tilo. That was a fast dive into complex topics. We appreciate Nuro’s perspective and hope to see you at the conference.