I suggest that Alibaba, Baidu, and Huawei not rush to grab the "rice bowl" of industry large models!





Big data industry innovation service media —— Focusing on data, changing business


In recent months, a very clear trend in China's large model field is that everyone is crowding into industry large models. Not only have companies in various vertical fields released multiple industry large models, but leading giants such as Baidu, Alibaba, Huawei, Tencent, and JD.com also regard industry large models as a key focus.

Specifically, these vendors can be divided into two categories based on the weight industry large models carry in their strategies:

The first category, represented by Baidu, Alibaba, Tencent, and iFlytek, pays equal attention to general large models and industry large models.

On the one hand, they attach great importance to general large models and open them to C-end users. Baidu's Wenxin Yiyan and iFlytek's Spark have even launched mobile apps to promote the adoption of general large models.

On the other hand, they also pay close attention to industry large models, which are generally delivered externally as industry solutions. On September 19, Baidu even launched a medical industry large model, Lingyi, which directly serves hospitals, patients, and medical device companies.

The other category, represented by Huawei and JD.com, has focused on industry large models from the beginning and is oriented directly toward industry applications.

Huawei's Pangu large model has carried the slogan "Don't write poetry, just get things done" from the start. It has shown little enthusiasm for C-end applications, and almost all of its strategic focus is on industry.

JD.com's Yanxi large model is similar. Adhering to JD Cloud's positioning as "a cloud that understands industry better", JD.com also focuses on industry applications in the large model field. In addition, JD Health has released the Jingyi Qianxun large model as its vanguard for entering the sector.

It is clear that whether it is Baidu, Alibaba, Tencent, and iFlytek, which fight on both fronts, or Huawei and JD.com, which fight almost entirely on one front, they all regard industry large models as a battleground that must be contested.

The author believes this is problematic. These technology giants should focus on general large models and leave industry applications to partners in each vertical field; they should build only the "infrastructure" of large models and stay away from upper-layer applications.

Why do we say this? Next, let’s analyze the pros and cons in detail.

Giants should focus on the research and development of general large models

General large models are like the foundation of the entire large model industry. Whether the foundation is solid or not will determine how high the building can be built. So, is the current foundation solid?

Unfortunately, although large models have achieved an initial "emergence" of intelligence and made great progress in natural language understanding, content generation, and logical reasoning, they are not yet good enough. In particular, if we want to commercialize large models across industries, current model capabilities fall short.

The insufficient capability mentioned here does not refer to any particular large model.


Even GPT-4 still has significant capability gaps when it comes to commercial deployment. Let's look at a few examples.

Search is an important application scenario for large models. By integrating ChatGPT, Microsoft's Bing changed the traditional keyword-search approach and staged a remarkable comeback. So how does Bing actually perform with ChatGPT's assistance?

We tried it out, and to be honest, it was quite disappointing.

Here is an example. We asked Bing to search for today's (September 26) news about large models, and it returned four news items. After clicking through, items 1 and 3 actually come from the same article, which was published on February 21; items 2 and 4 also come from a single article, published on July 27.


In other words, the news returned is simply wrong. We asked for today's news and got content from months ago. Moreover, we asked for important events in the large model field, yet among the four answers there are two reports, one news analysis article, and one forum event. Strictly speaking, reports and analysis articles are not important news events. From this perspective, Bing's results completely fail to meet the requirement.

The author then asked it to organize the returned news into a table. In the table it produced, the publication dates all changed to September 26, complete with specific times, which is obviously fabricated.


The author once had high expectations for the new generation of search engines such as Bing and tried them many times, but the overall impression is that they are basically unusable. This is ChatGPT's actual performance in search, and to some extent it represents the highest level large models can currently achieve.

Baidu has launched a similar feature: in addition to ordinary web search, you can also query through conversation. We could not wait to try it.

Compared with Bing, Baidu understands news events better. Bing returns several reports, while Baidu returns results centered on large model releases, events with obviously higher news value.


However, are Baidu's results reliable? We likewise asked it to compile the results into a table with publication dates and links. It turns out the dates are all May 11, which is obviously wrong: what we wanted was news from September 26, not May 11.


Moreover, the news links in the table are also broken: opening the corresponding pages simply returns a "404". Microsoft's Bing has the same problem; the links it provides either cannot be opened or do not exist.


Back to ChatGPT itself: one of its important limitations is that it cannot access the Internet, so its knowledge cannot be updated in real time. GPT-3's training data is cut off at September 2021, and GPT-4's at January 2022.


Moreover, ChatGPT often makes mistakes in complex data calculation and processing, and its advertised file-upload and document-understanding capabilities are also far from ideal.

Let's test GPT-4's document understanding. We uploaded the 2023 semi-annual report of Loongson Zhongke and asked it to do a simple SWOT analysis. After the document was uploaded, ChatGPT began writing code to parse it, which looked impressive.


What was the result?


In the end, ChatGPT failed to parse the PDF document. We tried several times, and it failed every time.
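For reference, the text-extraction step that ChatGPT attempted can be reproduced locally. The snippet below is a minimal sketch only, assuming the open-source pypdf library and a hypothetical local file named report.pdf; it is not ChatGPT's actual code, and a scanned, image-only PDF would still yield no text without OCR.

```python
# Minimal sketch: extract text from a PDF with the open-source pypdf library,
# assuming a hypothetical local file "report.pdf" (illustration only).
from pypdf import PdfReader

reader = PdfReader("report.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

print(f"Extracted {len(reader.pages)} pages, {len(text)} characters")
# The extracted text could then be chunked and sent to a model for a SWOT-style summary;
# an image-only PDF would yield little or no text without an OCR step.
```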


Just imagine: if models like these are pushed into complex industry scenarios, the results will hardly be ideal, and these are already the best general large models on the market.

It is true that large models have shown some "emergence of intelligence" and their capabilities have improved qualitatively, but they are still at the budding stage, like a young lotus just showing its tip above the water. Since large models have proven to be a promising direction, the most important thing now is to keep nurturing this promising "child" rather than sending it out to support the family prematurely.

Historical experience shows that every artificial intelligence craze is followed by a long period of silence, mainly because expectations were raised too high early on, and people become disappointed once those expectations are not met.

Similarly, if we rush to deploy large models across industries now, problems will surface quickly, and people will swing from enormous expectations to furious complaints. Such ups and downs are not conducive to the healthy development of the industry.

Therefore, the core task of technology giants such as Alibaba, Huawei, Baidu, and Tencent is to raise the "child" that is the general large model. Once its capabilities genuinely improve, large-scale deployment will follow very quickly, so there is no need to rush at this stage.

There is a well-known intelligence emergence curve in the large model field: model performance is not linearly related to parameter scale. A model with 20 billion parameters is not twice as good as one with 10 billion parameters.

There is a threshold on this emergence curve, currently around 100 billion parameters. Below the threshold, the intelligence a model displays does not change much as the parameter scale grows; a 20-billion-parameter model performs about the same as a 2-billion-parameter one. Once the parameter scale crosses the 100-billion threshold, however, model performance improves dramatically.
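To make the shape of this curve concrete, the toy sketch below plots a purely illustrative logistic-style curve of "capability" against parameter count on a log scale, with a notional threshold at 100 billion parameters. The numbers are assumptions for visualization only, not measured benchmark data.

```python
# Toy illustration of an emergence-style curve (illustrative only, not benchmark data):
# capability stays nearly flat below a notional ~100B-parameter threshold,
# then rises steeply once the threshold is crossed.
import numpy as np
import matplotlib.pyplot as plt

params = np.logspace(9, 13, 200)                  # 1 billion to 10 trillion parameters
threshold = 1e11                                  # assumed ~100B emergence threshold
capability = 1 / (1 + (threshold / params) ** 4)  # logistic-like toy curve

plt.semilogx(params, capability)
plt.axvline(threshold, linestyle="--", color="gray", label="~100B threshold (assumed)")
plt.xlabel("Parameter count (log scale)")
plt.ylabel("Relative capability (arbitrary units)")
plt.legend()
plt.show()
```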


Although model size is not everything, the past decade of AI development shows that brute-force scaling is often the key direction: larger models, deeper neural networks, and more data bring better performance.

Judging from the current emergence curve, after the scale reaches hundreds of billions of parameters, models enter an intelligence bottleneck: there may be no significant difference in "intelligence" between a 500-billion-parameter model and a 100-billion-parameter model. To pursue the next emergence threshold, however, the best option at present is to keep expanding the parameter scale. Perhaps once the scale reaches tens of trillions of parameters, the next emergence threshold will arrive and the capabilities of large models will reach a new level.

Figure: Prediction of large model intelligence emergence. Chart: Data Yuan

Of course, as model scale expands, costs rise sharply, so this can only be a game for the giants. Moreover, simply enlarging a model also causes overfitting problems, so scaling up must be paired with optimization and adjustment of the model architecture. This is where technical capability is truly tested.

Taking a step back, today's large models are all based on the Transformer architecture, which was proposed in a 2017 paper by several Google researchers. Is the Transformer really the best architecture? Is there something better? These are questions that technology giants such as Huawei, Baidu, Alibaba, and Tencent need to answer.

Beyond parameter scale and model architecture, large models also need to solve hallucination, interpretability, and multi-modality problems. None of these has been solved well yet; they are challenges faced by the entire industry. The key to solving them lies in underlying technological breakthroughs in general large models, not in industry large models.

Of course, whoever can truly solve these key problems will be rewarded accordingly by the market.

Don’t be a referee and a player at the same time

Beyond the unresolved problems of general large models themselves, another very important reason to recommend that technology giants stay out of industry large models is to avoid conflicts of interest with partners.

Technology giants should be playing an ecosystem game, profiting from the infrastructure layer.

In the large model field, the value transmission route should be general large model → industry large model → industry customer. At the industry large model stage, general large model vendors such as Huawei, Baidu, and Alibaba can either develop industry large models themselves or let third-party partners develop them on top of their general large models.

Figure: Value transmission mechanism for industry applications of large models. Chart: Data Yuan

General large models test technical capability, while the technical threshold of industry large models is not especially high; their core elements are data and industry experience, and these two things are precisely the weak points of the technology giants. Gathering high-quality datasets from industries such as finance, healthcare, manufacturing, and retail, and understanding each industry's business scenarios, is not something any single company can do. It has to rely on the power of the ecosystem and be done by the thousands of partners within it.

Of course, general large model vendors such as Baidu, Huawei, and Tencent can also try to occupy both links of the value chain. In the medical field, for example, Baidu can use its own Lingyi large model to serve hospitals, patients, and medical device companies directly, while also building a partner system for vertical medical large models.

However, this approach runs into the problem of competing with its own partners for profit, which is a taboo in business.

Imagine that medical large model company A builds on company B's general large model, opens its core medical data to B, and trains a medical large model. A few months later, A discovers that B has also launched a medical large model with similar functions. When bidding for an industry customer's order, A finds that B is bidding too; its partner has suddenly become a competitor. In that situation, would company A still be willing to cooperate with company B?

In an ecosystem, partners' trust in the ecosystem owner is as precious as gold. Only when upper-layer application partners firmly believe that the ecosystem owner will not compete with them or steal their business will they feel confident building their business on the owner's platform.

This is somewhat similar to the relationship between IaaS vendors and SaaS vendors in cloud computing. The most critical reason many Chinese SaaS companies are uneasy about cloud vendors such as Alibaba Cloud, Tencent Cloud, Baidu Cloud, and Huawei Cloud is fear of conflicts of interest. At present, the business boundaries of IaaS cloud vendors are not clear enough: they provide not only IaaS and PaaS products but also enter many SaaS fields, which is exactly what their SaaS partners dread most.

In the early days of China's Internet, investors had a famous soul-searching question for start-ups: what will you do if Tencent builds the same product?

In the same way, if general large model vendors want to build an application ecosystem, industry large model vendors in healthcare, finance, government affairs, manufacturing, and other fields will ask the same thing: if you build the same thing as me someday, what am I supposed to do?

So what kind of large model ecosystem is more reasonable? We can borrow from the cloud computing ecosystem: the general large model corresponds to IaaS, and the industry large model corresponds to SaaS.

Leading general model vendors such as Baidu, Huawei, Alibaba, Tencent, JD.com, ByteDance, and iFlytek should focus on general large models (the IaaS and PaaS layer), try not to touch industry large models (the SaaS layer), and clearly demarcate their business boundaries.

It should be pointed out that even without building industry large models, the underlying general large model vendors can still share in the application dividends. Just as SaaS applications consume IaaS resources and pay for them, upper-layer industry models will call the capabilities of the lower-layer general model, and a reasonable business model can be built on call volume and usage.

For example, suppose Baidu does not build medical large models itself but has 10 medical large model partners built on Wenxin Yiyan, each serving 1,000 hospitals. Assume each hospital pays 1 million yuan per year and Baidu takes a 20% share. Then each medical large model company earns 1 billion yuan per year, and Baidu's revenue is 1 billion * 20% * 10 = 2 billion yuan. In this way, Baidu only needs to serve 10 partners rather than 10,000 hospitals.
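As a sanity check on that arithmetic, the short sketch below reproduces the revenue-share calculation under the article's stated assumptions (10 partners, 1,000 hospitals per partner, 1 million yuan per hospital per year, a 20% platform share); all figures are hypothetical inputs, not real pricing.

```python
# Worked example of the hypothetical revenue-share arithmetic above
# (all numbers are the article's assumptions, not real pricing).
partners = 10                   # industry large model partners on the platform
hospitals_per_partner = 1_000   # hospitals served by each partner
fee_per_hospital = 1_000_000    # yuan paid by each hospital per year
platform_share = 0.20           # revenue share taken by the general model vendor

partner_revenue = hospitals_per_partner * fee_per_hospital       # revenue of one partner
platform_revenue = partners * partner_revenue * platform_share   # revenue of the platform

print(f"Each partner earns {partner_revenue / 1e9:.1f} billion yuan per year")   # 1.0
print(f"The platform earns {platform_revenue / 1e9:.1f} billion yuan per year")  # 2.0
```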

By extension, if a prosperous industry large model ecosystem can be built, the industrial application of large models can bring tens of billions of yuan in revenue to the underlying general large model vendors.

So general large model vendors such as Baidu, Huawei, Tencent, and Alibaba need not worry about missing out on the dividends of industry applications. Likewise in cloud computing: which SaaS vendor's revenue can match that of the IaaS providers Alibaba Cloud, Tencent Cloud, and Huawei Cloud?

As long as they concentrate on laying the foundation of general large models, they can "sell the land" without laboriously laying bricks to build houses. Think back to real estate: are developers like Vanke and Evergrande really the ones who make the most money? Selling land is obviously more profitable, and easier.

For large model vendors in vertical industries, the ideal state is to borrow the cross-cloud deployment strategy of SaaS and achieve cross-general-model deployment of industry large models, so that their business can migrate smoothly from one general model platform to another and avoid being locked into a single platform. Of course, industry large models are still at a very early stage, and it is too early to talk about cross-general-model deployment.

Figure: Cross-general-model deployment of industry large models. Chart: Data Yuan
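To illustrate what cross-general-model deployment could look like once the ecosystem matures, the sketch below defines a minimal provider-agnostic interface that an industry model vendor might code against. The class and method names are hypothetical and the vendor backends are stubs; real integrations would differ for each platform.

```python
# Hypothetical sketch of a provider-agnostic layer: the industry application codes
# against one interface, so the underlying general model platform can be swapped.
from abc import ABC, abstractmethod

class GeneralModelBackend(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send a prompt to the underlying general large model and return its reply."""

class VendorABackend(GeneralModelBackend):
    def complete(self, prompt: str) -> str:
        # A real implementation would call vendor A's API here (details omitted).
        return f"[vendor A reply to: {prompt}]"

class VendorBBackend(GeneralModelBackend):
    def complete(self, prompt: str) -> str:
        # A real implementation would call vendor B's API here (details omitted).
        return f"[vendor B reply to: {prompt}]"

class MedicalIndustryModel:
    """Industry-layer application that is not bound to any single platform."""
    def __init__(self, backend: GeneralModelBackend):
        self.backend = backend

    def answer(self, question: str) -> str:
        return self.backend.complete(f"[medical domain context] {question}")

# Migrating from one general model platform to another is a one-line change:
app = MedicalIndustryModel(VendorABackend())
print(app.answer("Summarize this patient's discharge note."))
app = MedicalIndustryModel(VendorBBackend())
```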

In summary, it is recommended that technology giants such as Baidu, Huawei, Alibaba, and Tencent focus on the research and development of general large models rather than the application of industry large models.

On the one hand, general large models are not yet good enough. Problems such as insufficient model intelligence, hallucination, poor interpretability, weak multi-modal fusion, and high training and inference costs remain prominent. Technology giants should be solving these lower-level, more challenging puzzles. Only when they are solved will the foundation for industry applications of large models be solid.

The industry application layer, on the other hand, can be left entirely to vertical-field companies. Foreseeably, hundreds or even thousands of industry large model companies will compete in each field, and after the fittest survive, a few dozen will remain. These survivors are the qualified partners, and the underlying general large model vendors should build an ecosystem with them to serve industry customers together.

Text: Yicai Yanyu  /  Data Yuan



Source: blog.csdn.net/YMPzUELX3AIAp7Q/article/details/133326280