Targeting the compute demands of in-vehicle AIGC applications, how did this company claim the "world's first automotive-grade" title?

In-vehicle AIGC applications, typified by ChatGPT, are pushing automotive intelligence toward a new trend.

Not long ago, Microsoft and Mercedes-Benz announced a partnership to integrate the ChatGPT AI service into vehicles, giving owners of models equipped with the MBUX infotainment system a smarter, more personalized voice-assistant experience.

Geely Automobile has also revealed that it already has full-stack, self-developed large AI model technology, covering capabilities such as image generation, music, and language as well as large models for autonomous driving, with full coverage of data-driven large models from the billion scale up to the hundred-billion scale.

However, large AI models need substantial computing power for both training and inference, so in-vehicle AIGC applications place higher compute demands on edge-side inference chips.

With Moore's Law slowing down, and by some accounts approaching its end, in-vehicle AIGC applications face especially severe compute challenges. According to industry statistics, from 2012 to 2018 the compute used to train AI models grew roughly 300,000-fold, while general-purpose compute following Moore's Law grew only about 7-fold. Clearly, Moore's Law alone can no longer keep up with AIGC's demand for computing power.
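As a rough back-of-envelope check (purely illustrative, using only the 300,000x and 7x figures quoted above), the implied compute doubling times over the 2012-2018 window can be computed directly:

```python
import math

# Rough sanity check of the growth figures quoted above (2012-2018, ~6 years).
# The 300,000x figure is commonly attributed to OpenAI's "AI and Compute"
# analysis; the ~7x figure approximates Moore's-Law-style growth over the
# same window. Both are treated here as given, not verified.
years = 6
ai_growth = 300_000
moore_growth = 7

ai_doubling_months = years * 12 / math.log2(ai_growth)        # ~4 months
moore_doubling_months = years * 12 / math.log2(moore_growth)  # ~26 months

print(f"Implied AI-training compute doubling time: {ai_doubling_months:.1f} months")
print(f"Implied Moore's-Law compute doubling time: {moore_doubling_months:.1f} months")
```

The two doubling times differ by roughly a factor of six, which is the gap the article refers to.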

"In the post-Moore era, the biggest challenge in the design of automotive high-computing chips is how to find a flexible and scalable chip architecture with comprehensive heterogeneous integrated computing power, as well as high-performance, low-latency, and high-reliability interconnection methods. Chiplet and GPNPU are The best technical path.” said Zhang Hongyu, founder and CEO of Xinli Intelligent .

In other words, starting from the NPU at the core of an AI chip, a GPU-like NPU architecture can sustain high-speed computation while hiding data-access latency, and a high-speed Chiplet bus interconnect can reduce multi-die inference latency. This is Xinli Intelligent's core technical advantage: multiple dies linked by a high-speed bus interconnect behave like one large chip during inference.

Officially founded in 2021, Xinli Intelligent claims to be the world's first company to develop automotive high-compute chips using Chiplet technology. Targeting the special requirements of in-vehicle chips, the company not only created its own automotive-grade Chiplet die-to-die (D2D) interconnect technology, but has also filed patents for this original D2D interconnect IP in multiple countries.

At present, Xinli Intelligent's automotive-grade Chiplet die-to-die interconnect IP has already been taped out. According to Zhang Hongyu, the company expects to tape out its first automotive-grade Chiplet SoC by the end of this year and to begin mass production in 2025.

In an in-vehicle chip market crowded with giants, and within a still-maturing Chiplet technology ecosystem, why has Xinli Intelligent been first to claim the ticket to automotive-grade, high-compute Chiplet chips? The right timing, environment, and people may be the best answer.

The compute crunch of in-vehicle AIGC applications

As the industry moves from "electrification" toward "intelligence", the intelligent capabilities of new energy vehicles have yet to be fully unleashed, and high hopes are pinned on AIGC to make vehicles smarter.

On the one hand, AIGC technology typified by ChatGPT can make the smart cockpit more human-like and emotionally responsive, potentially compensating for the current shortcomings of automotive intelligence and enabling smarter, richer cockpit interactions.

For example, proactive, multimodal in-vehicle interaction, such as a voice assistant evolving into a lifelike, proactive AI virtual co-pilot, is breaking away from traditional wake-word-based passive interaction and becoming the next arena for the smart cockpit. Realizing these functions depends on powerful chip-level computing power.

On the other hand, to improve the perception capability of intelligent driving, large models typified by BEV+Transformer extract feature vectors, fuse them in a unified 3D coordinate space, incorporate temporal information for dynamic recognition, and finally produce multi-task outputs such as static semantic maps and dynamic object detection.
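As a purely illustrative sketch of the multi-task output stage described above (assuming PyTorch, with made-up module names, feature shapes, and class counts, and not representing any production perception stack), fused BEV features might feed several task heads like this:

```python
import torch
import torch.nn as nn

class BEVMultiTaskHead(nn.Module):
    """Illustrative sketch: fused BEV features -> several task-specific outputs.

    Assumption (not from the article): BEV features have already been produced
    by a camera/LiDAR encoder plus a Transformer that fused multi-view and
    temporal information into a (C, H, W) bird's-eye-view grid.
    """

    def __init__(self, bev_channels: int = 256, num_map_classes: int = 4,
                 num_det_classes: int = 10):
        super().__init__()
        # Static semantic map head (e.g. lanes, crosswalks, drivable area).
        self.map_head = nn.Conv2d(bev_channels, num_map_classes, kernel_size=1)
        # Dynamic detection heads: per-cell class scores plus box regression.
        self.det_cls_head = nn.Conv2d(bev_channels, num_det_classes, kernel_size=1)
        self.det_box_head = nn.Conv2d(bev_channels, 7, kernel_size=1)  # x, y, z, w, l, h, yaw

    def forward(self, bev_features: torch.Tensor) -> dict:
        return {
            "semantic_map": self.map_head(bev_features),
            "detection_scores": self.det_cls_head(bev_features),
            "detection_boxes": self.det_box_head(bev_features),
        }

# Example: one batch of fused BEV features on a 200x200 grid.
bev = torch.randn(1, 256, 200, 200)
outputs = BEVMultiTaskHead()(bev)
print({k: tuple(v.shape) for k, v in outputs.items()})
```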

However, since both large AIGC models and the perception stack of intelligent driving are built on the Transformer architecture, on-device deployment must be optimized for the Transformer. Beyond tackling the compute bottleneck of its GEMM operations and the memory-access bottleneck of self-attention, sufficient AI computing power is still needed to support model inference and deployment.
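A rough, assumption-based estimate of why self-attention stresses both compute and memory as sequence length grows (the model width and sequence lengths below are arbitrary choices, not figures from the article):

```python
# Back-of-envelope scaling of self-attention cost with sequence length L and
# model width d. Numbers are illustrative assumptions, not from the article.
def attention_cost(seq_len: int, d_model: int, bytes_per_elem: int = 2):
    # QK^T and attention*V are the two large GEMMs: ~2 * L^2 * d MACs each.
    macs = 2 * 2 * seq_len**2 * d_model
    # The L x L attention matrix must be written and re-read at least once.
    attn_matrix_bytes = seq_len**2 * bytes_per_elem
    return macs, attn_matrix_bytes

for L in (256, 1024, 4096):
    macs, mem = attention_cost(L, d_model=768)
    print(f"L={L:5d}: ~{macs / 1e9:6.1f} GMAC per layer, "
          f"attention matrix ~{mem / 1e6:6.1f} MB (fp16)")
```

Both costs grow quadratically with sequence length, which is why on-device Transformer deployment needs both GEMM throughput and memory-access optimization.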

Clearly, the rapid advance of automotive intelligence has driven explosive growth in the compute demands of in-vehicle chips (for smart driving and the smart cockpit). As AIGC pushes automotive intelligence deeper, the computing-power gap is becoming increasingly apparent. If the contradiction between the growing compute demands of AI models and the slowing of Moore's Law is not resolved, the adoption and popularization of large models will inevitably be held back.

Chiplet technology is considered one of the semiconductor industry's best solutions for high-compute chips in the post-Moore era.

Chiplet technology is reported to solve the performance-scaling problem by assembling small dies like building blocks. Automotive chips, for their part, demand high performance at low cost while also meeting strict safety and reliability requirements.

Arguably, automotive chips face higher and more complex requirements than server or consumer chips. Meeting their diverse needs in the post-Moore era requires innovative design that breaks through the bottleneck of advanced process nodes.

Riding the "timing" of automotive intelligence and targeting the demand for high-compute in-vehicle chips, Xinli Intelligent has adopted a series of original designs in its implementation of Chiplet technology, mainly in chip architecture and die-to-die (D2D) interconnection.

For example, in chip architecture, Xinli Intelligent's Chiplet SoC adopts neither a performance-limited mobile computing architecture nor a conventional high-performance computing (HPC) architecture; instead it builds an embedded high-performance computing (eHPC) platform based on heterogeneous integration, meeting the automotive market's demand for high-compute, low-cost, automotive-grade compute-platform chips.

To address the pain point that neither serial nor parallel interconnects can simultaneously meet the automotive market's low-latency and low-cost requirements, Xinli Intelligent has created a fully self-developed Chiplet D2D interconnect IP with an original pipeline-based design, combining the high bandwidth and low latency of parallel interconnects with the high reliability and low cost of serial interconnects.

Specifically, Xinli Intelligent's D2D interconnect is a bus-extension interface that serializes the chip's internal bus. It transmits the serialized bus traffic in a pipelined fashion, omitting a series of complex packing/unpacking protocol steps and synchronization operations while preserving the pipeline structure of the parallel bus to the greatest extent.
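The following is a purely conceptual model of "serialize the internal bus and stream it without protocol framing"; it is a sketch of the general idea only, with invented field widths, and is not Xinli Intelligent's actual D2D IP:

```python
# Conceptual illustration only: slice a wide on-chip bus beat into narrow
# flits and reassemble it positionally at the far side, with no packet
# headers or framing. All field widths and names are made up.
from dataclasses import dataclass
from typing import Iterator

@dataclass
class BusBeat:
    """One beat of a wide on-chip parallel bus (address + data + control)."""
    addr: int       # assumed 32-bit address
    data: int       # assumed 128-bit data word
    is_write: bool

def serialize(beat: BusBeat, flit_bits: int = 32) -> Iterator[int]:
    """Slice one wide bus beat into narrow flits, preserving beat order."""
    word = (beat.addr << 129) | (beat.data << 1) | int(beat.is_write)
    total_bits = 32 + 128 + 1
    for shift in range(0, total_bits, flit_bits):
        yield (word >> shift) & ((1 << flit_bits) - 1)

def deserialize(flits: list[int], flit_bits: int = 32) -> BusBeat:
    """Reassemble flits back into the original wide bus beat."""
    word = 0
    for i, flit in enumerate(flits):
        word |= flit << (i * flit_bits)
    return BusBeat(addr=word >> 129,
                   data=(word >> 1) & ((1 << 128) - 1),
                   is_write=bool(word & 1))

beat = BusBeat(addr=0x8000_0000, data=0x1234_5678_9ABC_DEF0, is_write=True)
assert deserialize(list(serialize(beat))) == beat
```

Because the receiver reassembles flits by position rather than by parsing headers, the latency overhead of framing and unframing is avoided, which is the property the article attributes to a pipelined, bus-level D2D link.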

In the automotive field especially, today's advanced packaging has not yet fully met automotive-qualification requirements. Xinli Intelligent's Chiplet D2D interconnect IP not only supports traditional automotive-grade packaging, but also offers low-latency, cost-effective interconnection, speeding up signal processing and improving system response time.

A self-developed NPU to win the second half of intelligentization

That said, for Xinli Intelligent, a company only two years old, taking the lead in Chiplet-based in-vehicle applications did not happen overnight.

As early as 2012, before Chiplet technology had taken off, members of Xinli Intelligent's founding team realized that Chiplet approaches relying on advanced packaging, with their high cost and limited reliability, would be hard to extend beyond servers into other application fields, so they set out to research a low-cost, high-performance interconnect technology.

With the acceleration of AIGC and the shift toward centralized electronic/electrical architectures, automotive chip compute has reached an inflection point for explosive growth. China in particular has become a global hotspot for electrification and intelligence, giving Xinli Intelligent room to show its strengths.

Going forward, building on its D2D interconnect, Xinli Intelligent will continue to innovate independently across the entire data path between processor and memory and launch a series of differentiated IPs to maintain its technological edge.

It is fair to say that by seizing the right timing, environment, and people, and by relying on the founding team's long-term accumulation and hard work as well as its keen judgment of in-vehicle Chiplet requirements, Xinli Intelligent has secured its ticket, and a first-mover advantage, in the second half of intelligentization.

However, in the second-half opportunities opened up by in-vehicle AIGC, raw compute is only part of the story. The key is comprehensive, heterogeneously integrated computing power, such as GPU compute for graphics acceleration and NPU compute for AI acceleration.

Especially as vehicle replacement cycles keep shortening and brands compete on differentiation and diversity, the heterogeneously integrated compute of in-vehicle chips is growing exponentially and chip demand is soaring.

As AI technology integrates deeply with the automotive industry, with BEV+Transformer in intelligent driving and DMS/OMS, AIGC, and other applications in the intelligent cockpit, the requirements on AI acceleration compute keep rising, and traditional architectures are no longer a good fit.

For example, the DSP-plus-accelerator architecture puts the whole system on one chip, with most hardware modules customized for specific functions, giving it high efficiency on particular algorithms. However, DSP-plus-accelerator designs are generally closed, black-box solutions, which makes it hard for them to adapt to the differentiated needs of smart vehicles.

The GPU architecture, originally dedicated to graphics acceleration, has gradually taken on general-purpose attributes: it offers the versatility of parallel computing, supports fusion of arbitrary operators, and processes massive data efficiently. Yet its hardware utilization is low, making it hard to meet the high-efficiency compute needs of in-vehicle AIGC applications.

By contrast, looking at the NPU at the core of the AI chip, Xinli Intelligent's GPGPU-like NPU architecture combines the advantages of the GPU architecture and the DSP-plus-accelerator architecture. It simultaneously satisfies the network-coverage and efficiency-optimization requirements of intelligent-driving and smart-cockpit algorithms, balancing flexibility, efficiency, and versatility, and supports the ongoing development and iteration of AI technology.

For reference, GPGPU is not a specific chip but a concept: using a graphics processor for high-performance computation unrelated to graphics rendering. The first "GP" stands for "General Purpose", and the remaining "GPU" stands for "Graphics Processing Unit".

Specifically, relying on Chiplet technology and using the embedded high-performance computing platform as its design basis, Xinli Intelligent can rapidly iterate the NPU and flexibly upgrade and scale CPU/GPU compute, allowing different combinations and customizations of CPU, GPU, and NPU compute units to meet the smart-vehicle market's differentiated demands for on-board chips.
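To illustrate the "flexible combination of compute units" idea in the abstract (all die types, core counts, and TOPS figures below are invented assumptions, not Xinli Intelligent's product specifications), one could model chiplet composition like this:

```python
# Purely illustrative model of composing a chiplet-based SoC from CPU/GPU/NPU
# dies linked by a hypothetical D2D interconnect. Not a description of any
# real product; every name and number is made up.
from dataclasses import dataclass

@dataclass(frozen=True)
class ComputeDie:
    kind: str            # "CPU", "GPU", or "NPU"
    cpu_cores: int = 0
    gpu_tflops: float = 0.0
    npu_tops: int = 0

def compose_soc(dies: list[ComputeDie]) -> dict:
    """Sum the compute contributed by each die in the package."""
    return {
        "cpu_cores": sum(d.cpu_cores for d in dies),
        "gpu_tflops": sum(d.gpu_tflops for d in dies),
        "npu_tops": sum(d.npu_tops for d in dies),
    }

# Cockpit-oriented variant: one CPU die, one GPU die, one small NPU die.
cockpit = compose_soc([ComputeDie("CPU", cpu_cores=8),
                       ComputeDie("GPU", gpu_tflops=2.0),
                       ComputeDie("NPU", npu_tops=16)])

# Driving-oriented variant: reuse the same CPU die, scale up the NPU dies.
driving = compose_soc([ComputeDie("CPU", cpu_cores=8),
                       ComputeDie("NPU", npu_tops=64),
                       ComputeDie("NPU", npu_tops=64)])

print("cockpit:", cockpit)
print("driving:", driving)
```

The point of the sketch is simply that the same base dies can be recombined into different product variants, which is the flexibility the article attributes to the Chiplet-based eHPC platform.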

Whether it is using Chiplet technology to tackle the compute pain points of in-vehicle AIGC applications, or a self-developed, flexible, and efficient NPU to support deep AI optimization, it is clear that Xinli Intelligent is positioned to win the second half of automotive intelligence.

 
