Gross margins as high as 60%! Intelligent driving's cost cutting "squares off" against data labeling/training services

Since Mobileye entered the factory-installed (OEM) automotive market, computer vision has taken the leading role in intelligent driving. Behind it lies the processing of enormous amounts of data.

In the view of the founder of Mobileye, computer vision technology and powerful natural language understanding (NLU) models can output thousands of results in seconds, even for "long tail" events in rare conditions and scenarios.

Prior to this, Mobileye had a globally recognized large automotive data set, including more than 200 PB of real-world driving video footage in the past 25 years, with a total of 16 million 1-minute video clips.

According to official figures the company disclosed earlier, this huge data set rests on manual annotation by more than 2,500 professional annotators.

At almost the same time, a large number of start-ups targeting L4 autonomous driving set off the first wave of demand for manual data labeling. Data labelers became a hiring priority for most of these companies, and a number of third-party data labeling companies emerged as well.

Such demand is not limited to the intelligent driving track, but also includes almost all fields of AI applications, from voice to image to video, and data labeling has gradually developed into an independent industry.

Four years ago, a company named Haitian Ruisheng began preparing to list on the Science and Technology Innovation Board. Founded in 2005, it provides data services to artificial intelligence companies and scientific research institutions in three categories: customized data resource services, database products, and data-resource-related application services.

The company's data training (labeling) services cover intelligent speech (speech recognition, speech synthesis, etc.), computer vision, natural language, and other core areas of artificial intelligence, and ultimately serve human-computer interaction, intelligent driving, smart cities, and other innovative application scenarios.

According to the prospectus, Haitian Ruisheng’s revenue in 2018, 2019, and 2020 was 193 million yuan, 238 million yuan, and 233 million yuan respectively; its net profit was 67.13 million yuan, 81.5868 million yuan, and 82.081 million yuan.

Starting in 2021, however, the industry appears to have entered a period of volatility. In 2021, Haitian Ruisheng's operating income was about 206 million yuan, a year-on-year decrease of 11.53%; net profit attributable to shareholders of the listed company was about 31.61 million yuan, a year-on-year decrease of 61.49%.

The 2022 annual report shows that Haitian Ruisheng's revenue in 2022 was 263 million yuan, a year-on-year increase of 27.32% (within which revenue from the intelligent driving business increased year-on-year); net profit attributable to shareholders of the listed company was 29.4541 million yuan, a year-on-year decrease of 6.81%.
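The year-on-year percentages above follow the usual formula, (current - previous) / previous. The short Python sketch below recomputes them from the rounded figures quoted in this article. The function name `yoy_change` is introduced here for illustration only. Because revenue is quoted only to the nearest million yuan, the recomputed revenue changes land close to, but not exactly on, the reported 11.53% and 27.32%.

```python
def yoy_change(current: float, previous: float) -> float:
    """Return the year-on-year change as a percentage of the previous value."""
    if previous == 0:
        raise ValueError("previous-year value must be non-zero")
    return (current - previous) / previous * 100.0


# Rounded figures quoted in the article, in millions of yuan.
revenue = {2020: 233.0, 2021: 206.0, 2022: 263.0}
net_profit = {2020: 82.081, 2021: 31.61, 2022: 29.4541}

for year in (2021, 2022):
    rev = yoy_change(revenue[year], revenue[year - 1])
    profit = yoy_change(net_profit[year], net_profit[year - 1])
    print(f"{year}: revenue {rev:+.2f}%, net profit {profit:+.2f}%")
```

The net profit figures are quoted with more digits, so the recomputed changes for them match the reported values much more closely than the revenue ones do.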

In 2022, the company disclosed revenue from its smart driving business separately for the first time. The business generated revenue of 41.7451 million yuan that year, a year-on-year increase of 115.12%. It has more than 50 customers, including traditional car companies, new-force car makers, and smart driving technology companies.

This growth was driven by new demand that has emerged in the factory-installed intelligent driving market over the past few years for 2D+3D fusion annotation, point cloud BEV annotation, and 4D annotation. At the same time, regulators' data compliance requirements are driving a new round of reshuffling in the labeling industry.

In June 2022, Haitian Ruisheng received an administrative license from the Beijing Municipal Commission of Planning and Natural Resources granting it a Class B surveying and mapping qualification, which means the company can collect and label vehicle data in compliance with regulations.

However, the intelligent driving industry chain is entering a new cycle of moving away from manual labeling. Until now, the industry's demand for third-party suppliers has come from data collection and manual labeling (some companies are also developing semi-automated tools to improve efficiency).

Tesla, for example, long maintained a large data labeling team of about 1,000 people. Day to day, the team annotated objects in video data in a "vector space" for use in neural network training.

In June 2022, however, it was reported that Tesla had abruptly closed an office in San Mateo, California, and dismissed about 200 employees of its assisted driving system team who worked on data labeling.

One reason is Dojo, the supercomputer Tesla has launched, which uses massive amounts of video data for unsupervised labeling and training.

According to Haitian Ruisheng's announcement, the data service fees the company paid for raw data collection and labeling made up a large share of its total annual purchases related to its main business, reaching 83.21%.

Tesla has thus brought automatic labeling into its autonomous driving R&D, sharply reducing the cost of manual labeling and improving overall labeling efficiency. Other companies have quickly followed suit.

Horizon, for example, launched AIDI, a one-stop tool platform aimed at developers building and iterating AI software products, which reduces data labeling costs by 15% through pre-trained models and automatic labeling.

Haomo Zhixing (Haomo.AI) also launched its 4D Clips automatic labeling technology this year, which can reduce the labeling cost of a single picture to 0.5 yuan, 1/10 of the current industry average. At the same time, its large model is opening up cloud capabilities to external users, including automatic labeling of large-scale data and scene simulation testing.
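To put the 0.5 yuan figure in context, the two numbers quoted above imply an industry average of about 5 yuan per picture. The Python sketch below works through that arithmetic and compares the two rates for a data set size chosen purely for illustration. The one-million-image volume and the names `labeling_budget` and `implied_average` are assumptions, not figures or terms from the companies involved.

```python
from fractions import Fraction

# Figures quoted in the article.
AUTO_COST_PER_IMAGE = Fraction(1, 2)     # yuan per picture with automatic labeling
AUTO_SHARE_OF_AVERAGE = Fraction(1, 10)  # stated as 1/10 of the industry average

# Industry average implied by the two figures above (not stated directly).
implied_average = AUTO_COST_PER_IMAGE / AUTO_SHARE_OF_AVERAGE


def labeling_budget(num_images: int, cost_per_image: Fraction) -> Fraction:
    """Total labeling cost in yuan for a given number of pictures."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images * cost_per_image


num_images = 1_000_000  # hypothetical data set size, for illustration only
manual_total = labeling_budget(num_images, implied_average)
auto_total = labeling_budget(num_images, AUTO_COST_PER_IMAGE)

print(f"implied industry average: {float(implied_average):.2f} yuan per picture")
print(f"at the industry average:  {float(manual_total):,.0f} yuan")
print(f"with automatic labeling:  {float(auto_total):,.0f} yuan")
print(f"difference:               {float(manual_total - auto_total):,.0f} yuan")
```

`Fraction` is used so that the 1/10 ratio is handled exactly rather than with floating-point rounding.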

In Haitian Ruisheng's view, the training data industry will shift in the long run from being resource-led to being technology-led. Relying on headcount growth alone to cope with exponentially growing data demand is unrealistic, so the continued automation of data processing will become the core element in solving the capacity problem.

Previously, the company also clearly pointed out that if the training data set products developed by the company cannot meet the requirements of downstream customers for training data, there may be a risk that authorized sales cannot be realized, which will adversely affect the company's future operating performance.

As the main customers of early data labeling companies, autonomous driving companies are also entering this track, trying to convert past closed-loop research and development experience into external profitable businesses.

Earlier this year, Pony.ai launched the data closed-loop tool chain "Kangqiong". It consists of two core modules, a car-cloud collaborative big data platform and a cloud-based large-scale simulation platform, together with data labeling tools and model training tools, and is intended to fully cover the core needs of two types of customers.

At present, the Kangqiong data closed-loop tool chain has been provided to a number of car companies, helping them build a full-volume data closed loop in the R&D and testing phase and a shadow-mode-based data closed loop in the mass production phase.

According to Haitian Ruisheng's latest disclosure, the main players in the smart driving data labeling market include branded data service providers, customers' in-house teams, and a number of small and medium-sized service providers. In the current industry structure, branded service providers hold a large share of the market.

According to the company's 2022 financial report, the overall gross profit margin of its data service business is still as high as 64.73%. Now that smart driving has entered a new cycle of data-driven development, the demand for cost reduction, whether from car companies or Tier 1 suppliers, will have a huge impact on the data services track.

In addition, Haitian Ruisheng stated that because emerging businesses such as intelligent driving currently consist mainly of customized services for customers, the growth in operating income during the reporting period was accompanied by a steep rise in operating costs, which increased by 163.67%. This resulted in a slight decline in the segment's gross profit margin.
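The mechanism is simple: when costs grow faster than revenue, the cost-to-revenue ratio rises and the gross margin falls. The Python sketch below illustrates this under two loudly stated assumptions. First, it pairs the 163.67% cost increase with the 115.12% revenue increase reported for the smart driving business, although the article does not confirm that both figures share the same base. Second, the prior-year margins are hypothetical, since the segment's actual margin is not given. The function name `margin_after_growth` is likewise illustrative.

```python
def margin_after_growth(prior_margin: float,
                        revenue_growth: float,
                        cost_growth: float) -> float:
    """Gross margin after revenue and cost each grow by the given fraction.

    prior_margin is a fraction of revenue (0.70 means 70%).
    Growth rates are fractions too (1.1512 means +115.12%).
    """
    if revenue_growth <= -1.0:
        raise ValueError("revenue_growth must be greater than -100%")
    prior_cost_ratio = 1.0 - prior_margin
    new_cost_ratio = prior_cost_ratio * (1.0 + cost_growth) / (1.0 + revenue_growth)
    return 1.0 - new_cost_ratio


REVENUE_GROWTH = 1.1512  # +115.12%, smart driving revenue growth quoted in the article
COST_GROWTH = 1.6367     # +163.67%, operating cost growth quoted in the article

# Hypothetical prior-year margins; the article does not disclose the real one.
for prior in (0.60, 0.70, 0.80):
    new = margin_after_growth(prior, REVENUE_GROWTH, COST_GROWTH)
    print(f"prior margin {prior:.0%} -> new margin {new:.1%}")
```

Whatever the starting margin, the new margin comes out lower whenever cost growth exceeds revenue growth, which is the direction of the change the company describes.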

Especially with the gradual increase in the penetration rate of high-end intelligent driving, the demand for traditional externally collected data is also gradually decreasing. "The quality requirements for data collection are rapidly increasing. In the future, high-quality data must come from car companies. This is fundamentally different from the need for more pre-research in the past."

In addition, the operating profit of most data service companies is heavily affected by costs and management expenses, and the industry has gradually shifted from relying on manual labeling to relying heavily on high-end technical staff to develop automated tool platforms. As a result, costs are rising year by year, and the risk of declining gross and net profit margins is becoming more pronounced.

Origin blog.csdn.net/GGAI_AI/article/details/131932018