Sensor fusion is evolving fast, with early and AI-driven low-level approaches opening new possibilities for robust perception in ADAS and autonomous driving. Ahead of AutoSens Europe 2025, we spoke with Aniss Ouyeder (Continental/AUMOVIO), Pier Paolo Porta (Ambarella), and Sergio Fernandez (Valeo) to hear their perspectives on the advantages, challenges, and future directions of fusion technologies. Read the full blog below.
What is early fusion and why is it advantageous?
Early fusion combines the data from two or more sensors before any other processing takes place; in other words, before the data reaches a higher abstraction level. Low-level fusion benefits from access to raw data, which can then be combined more effectively for feature extraction and interpretation. As Ouyeder noted, one of the biggest advantages of this approach is that it provides richer environmental data.
However, as Fernandez commented, it can be difficult to extract specific hardware-related properties from each sensor, which may affect the quality of the features and of the learning.
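To make the idea of fusing before further processing concrete, here is a minimal Python sketch. It is a toy illustration, not any of the speakers' implementations: the grids, the thresholds, and the function names (`early_fuse`, `detect_early`, `detect_late`) are all assumptions, and the "detector" is a simple sum rather than a neural network. It shows how two weak but agreeing raw signals can be picked up when they are combined first, while per-sensor detection followed by merging misses them.

```python
from typing import List, Tuple

# Hypothetical per-cell evidence in [0, 1] on a shared, already aligned grid.
Grid = List[List[float]]


def early_fuse(camera: Grid, radar: Grid) -> List[List[Tuple[float, float]]]:
    """Stack two aligned raw grids into one two-channel grid."""
    same_shape = len(camera) == len(radar) and all(
        len(c) == len(r) for c, r in zip(camera, radar)
    )
    if not same_shape:
        raise ValueError("grids must be aligned to the same shape before fusion")
    return [list(zip(cam_row, rad_row)) for cam_row, rad_row in zip(camera, radar)]


def detect_early(camera: Grid, radar: Grid, joint_threshold: float = 0.7):
    """Toy joint detector: decide on the combined evidence of both channels."""
    fused = early_fuse(camera, radar)
    return [
        (i, j)
        for i, row in enumerate(fused)
        for j, (c, r) in enumerate(row)
        if c + r >= joint_threshold
    ]


def detect_late(camera: Grid, radar: Grid, per_sensor_threshold: float = 0.5):
    """Toy late fusion: each sensor decides alone, then the hits are merged."""
    hits = set()
    for grid in (camera, radar):
        for i, row in enumerate(grid):
            for j, value in enumerate(row):
                if value >= per_sensor_threshold:
                    hits.add((i, j))
    return sorted(hits)


camera = [[0.1, 0.4], [0.9, 0.0]]
radar = [[0.0, 0.4], [0.1, 0.0]]
early_hits = detect_early(camera, radar)  # includes the weak-but-agreeing cell (0, 1)
late_hits = detect_late(camera, radar)    # only the strong camera cell (1, 0)
```

In this sketch the cell at row 0, column 1 carries only weak evidence in each sensor, so it is found only when the raw values are combined before the decision. Real systems would also have to handle calibration and time alignment, which the sketch assumes away.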
What are the advantages for next-generation ADAS and autonomous systems in deploying AI-driven stacks for low-level sensor fusion?
Continental/AUMOVIO and Ambarella’s view is that AI-based low-level sensor fusion delivers major advantages by combining information from multimodal sensing suites and from multiple fields of view (FoVs). In particular, AI fusion increases the robustness of detection and tracking across the FoVs around the vehicle, resulting in better environmental perception.
Fernandez also pointed out that AI-driven stacks, whether for low-level or feature-level fusion, offer easier adaptability and hardware abstraction than rule-based methods. Whereas rule-based methods require a thorough understanding of the physical properties of each sensor, AI-driven stacks rely on the quality of the training dataset and the ability to extract proper learned features from it. In many cases, a hybrid approach can best combine well-modelled features (i.e. those based on physical sensor properties) with perception-related features.
How can robustness against different environmental, roadway and traffic conditions be achieved for Level 3 and beyond?
Moving to an L3 system requires a proper sensing suite that covers perception all around the vehicle. The most common sensing suites include cameras and radars, and in some cases also lidar. From an architectural point of view, redundancy is needed to cover the period when the driver is required to take over, to ensure safe vehicle operation during the transition from AV operation to driver operation. Porta emphasised that, to increase robustness, especially in AI-based systems, having enough data is also crucial, in terms of both quantity and variety.
Fernandez agreed that there is no way around combining different sensor technologies. LiDAR, radar and camera each perform differently across operational design domains (ODDs), and their outputs must be combined, at either feature level or low level, to ensure robustness against different environments and scenarios. On top of that, it is crucial to ensure real-time processing, that is, that the perception stack is suited to the SoC.
How can raw sensor data be centrally processed across diverse sensing modalities?
The key is to provide a modular architecture that can easily be fitted to different sensing modalities while remaining well suited, in every configuration, to the SoC it runs on. Valeo’s proposed VaPP architecture focuses on real-time processing and on early extraction of point-cloud features, which are abstracted to a higher processing level and are then independent of sensor specifics.
Continental/AUMOVIO and Ambarella explained that low-level fusion definitely needs an efficient processing unit in terms of interfaces, acceleration engines, and the software toolchain. Each sensor has specific operations that need to be accelerated by the central processor to efficiently generate the data that can then be fused.
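One way to picture such a modular design is a set of per-sensor adapters that convert raw data into a common, sensor-independent feature type, so that later stages never see sensor specifics. The Python sketch below is purely illustrative and is not Valeo's VaPP implementation: the `PointFeature` type, the adapter functions, and the assumed raw formats (Cartesian points for lidar, range and azimuth for radar) are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PointFeature:
    """Hypothetical sensor-independent feature: a 2D point in the vehicle frame."""
    x: float
    y: float
    source: str


def lidar_adapter(raw: List[Tuple[float, float]]) -> List[PointFeature]:
    # Assumption: lidar frames arrive as Cartesian (x, y) points in metres.
    return [PointFeature(x, y, "lidar") for x, y in raw]


def radar_adapter(raw: List[Tuple[float, float]]) -> List[PointFeature]:
    # Assumption: radar frames arrive as polar (range in metres, azimuth in radians).
    return [PointFeature(r * math.cos(a), r * math.sin(a), "radar") for r, a in raw]


# Supporting a new modality means registering one more adapter here.
ADAPTERS: Dict[str, Callable[[list], List[PointFeature]]] = {
    "lidar": lidar_adapter,
    "radar": radar_adapter,
}


def to_common_features(frames: Dict[str, list]) -> List[PointFeature]:
    """Run each sensor's adapter and pool the sensor-independent features."""
    features: List[PointFeature] = []
    for sensor_name, raw in frames.items():
        features.extend(ADAPTERS[sensor_name](raw))
    return features


def nearest_range(features: List[PointFeature]) -> float:
    """Downstream stage: works on PointFeature only, with no sensor knowledge."""
    return min(math.hypot(f.x, f.y) for f in features)


frames = {"lidar": [(10.0, 0.0), (3.0, 4.0)], "radar": [(8.0, 0.0)]}
features = to_common_features(frames)
closest = nearest_range(features)
```

The design choice illustrated here is that only the adapters depend on sensor-specific formats; everything after `to_common_features` can stay unchanged when the sensor set changes.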
Moreover, deep learning technology is also evolving over time. To address those advancements, the central processor needs to be equipped to support the latest algorithm technologies, such as Transformer networks.
Is a low-, mid- or high-level fusion approach preferred for ADAS applications, or does this vary depending upon the specific application and intended use of the ADAS function?
Naturally, it depends on many factors, such as ODDs and sensor specifics. VaPP focuses on point-cloud sensors (LiDAR and radar) and proposes modular mid-level (feature-level) fusion as the best compromise between respecting the physical properties of the hardware and gaining the enrichment that fused raw data provides. The results of radar-LiDAR VaPP perception could be combined with camera data in low-level fusion, or run on a path parallel to raw-data fusion. As highlighted earlier, the main goal besides performance is robustness, which parallel perception paths can secure.
Porta commented that low-level fusion always provides advantages to the system, as it uses a neural network to determine the useful information to be extracted from each sensor. That said, this fusion method becomes more relevant as the level of autonomy increases. Specifically, for L2+ and above, low-level fusion makes a greater difference in achieving the perception level required by the system. High-level fusion (combining processed outputs) is sufficient for basic ADAS functions (e.g., lane-keeping, adaptive cruise control) and allows the reuse of previously developed algorithms or datasets. Mid-level fusion (combining features) offers a balance between complexity and performance for some L2+ systems. In summary, higher autonomy levels benefit most from low-level fusion, while simpler ADAS can rely on high-level fusion for cost and complexity reasons.
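As an illustration of high-level fusion, where each sensor's processing chain has already produced its own output, the Python sketch below merges two independent distance estimates to a lead vehicle using inverse-variance weighting. This is a standard textbook technique chosen here for illustration, not a method described by the speakers; the `RangeEstimate` type, the function name, and the numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeEstimate:
    """Hypothetical object-level output of one sensor's own processing chain."""
    distance_m: float
    variance: float  # uncertainty of the estimate, in square metres


def fuse_high_level(a: RangeEstimate, b: RangeEstimate) -> RangeEstimate:
    """Inverse-variance weighted average of two independent estimates."""
    if a.variance <= 0 or b.variance <= 0:
        raise ValueError("variances must be positive")
    weight_a = 1.0 / a.variance
    weight_b = 1.0 / b.variance
    fused_variance = 1.0 / (weight_a + weight_b)
    fused_distance = fused_variance * (weight_a * a.distance_m + weight_b * b.distance_m)
    return RangeEstimate(fused_distance, fused_variance)


# Assumed example values: the radar range is taken to be more certain than the camera's.
camera_estimate = RangeEstimate(distance_m=52.0, variance=4.0)
radar_estimate = RangeEstimate(distance_m=50.0, variance=1.0)
fused = fuse_high_level(camera_estimate, radar_estimate)
```

With these assumed inputs the fused distance lands near 50.4 m, closer to the lower-variance radar value, and the fused variance is smaller than either input. Because this step only consumes finished per-sensor outputs, existing single-sensor algorithms can be reused unchanged, which matches the reuse argument made above for high-level fusion.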
You can hear more from Continental/AUMOVIO, Ambarella, and Valeo, along with other technical experts, about low-level sensor fusion and AI perception at AutoSens Europe on 7-9 October.