China's first AI search engine is born, upending traditional search! Unlimited follow-up questions, and goodbye to ads

f1ef06a71291d3a3d32fa873d918b739.jpeg

Image source: Generated by Unbounded AI

In the era of large models, what kind of search do we need?

The stunning debut of ChatGPT made everyone realize that it is time to rethink how humans receive and process information.

Earlier, a UC Berkeley professor made the astonishing prediction that by 2030 a GPT model will be able to learn in one day what would take a human 2,500 years.

Although we cannot yet compete with silicon-based life on this track, there is no doubt that in an era of knowledge explosion and information overload, what we need is not unchewed "raw food", that is, information piled up at random in front of us after a simple search.

10024c4b65b8e65d3a15cd89dfa46f86.jpeg

What we need is a "finished product": a search tool that understands us better, information that is more truthful and practical, and sources that are more authoritative and reliable.

If this tool is also perceptive enough to guess our intentions accurately and to keep us inspired through pertinent follow-up questions, so much the better.

Now, none of the above is a fantasy!

Just the day before yesterday, Kunlun Wanwei officially launched Tiangong AI Search, the first search engine in China to incorporate a large language model, and at the same time opened applications for the closed beta (beta address: tiangong.cn).

e5894cf91dc8c6d2799b55e546cb68ee.png

As one of the first users in the closed beta, and after two days of hands-on use, the editor found it smarter than traditional search, more up to date than GPT-4, and more accurate than other AI search tools.

The singularity moment for traditional search has truly arrived!

8ab5feb036acf467f7369172d85eafec.png

01

How search is entering the age of AI

How did Tiangong AI Search manage to beat so many rivals?

The editor's strongest impression after trying it is that, for the first time, search feels human.

Intelligent search, comprehensive summary

With a traditional search engine, we type in keywords and are met with a flood of information.

Amid these endless possibilities we burn countless hours exploring as if on a "treasure hunt", and may still come away with nothing.

AI search built on large models is generative search. Users state their intent clearly in natural language, and the AI search returns organized, refined answers: not "information" but "knowledge".

864490ee14343a79405fe4d7e36947e2.png

The ability of large models to integrate, refine, and connect information lets AI search handle open-ended questions better. It also outperforms traditional search engines on knowledge-oriented and creative searches.

Like a traditional search engine, Tiangong AI Search first displays the information sources behind the search results.

It then gives a summary generated by the large AI model.

Finally, it adds AI-generated follow-up questions, forming a "link-answer-question" way of presenting results.
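
The "link-answer-question" layout can be pictured as a small data structure. The sketch below is only an illustration of that ordering, not Tiangong's actual implementation: the class names, field names, and example.com URLs are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    """One cited web page (the "link" part)."""
    title: str
    url: str


@dataclass
class SearchResult:
    """A single result in "link-answer-question" order."""
    sources: List[Source]            # shown first, as in traditional search
    summary: str                     # model-generated answer
    follow_ups: List[str] = field(default_factory=list)  # suggested next questions

    def render(self) -> str:
        lines = ["Sources:"]
        for i, src in enumerate(self.sources, start=1):
            lines.append(f"  [{i}] {src.title} ({src.url})")
        lines += ["", "Answer:", f"  {self.summary}", "", "Follow-up questions:"]
        lines += [f"  - {q}" for q in self.follow_ups]
        return "\n".join(lines)


# Placeholder data; the URLs are not real sources.
result = SearchResult(
    sources=[Source("Diet basics", "https://example.com/diet"),
             Source("Exercise basics", "https://example.com/exercise")],
    summary="Combine a calorie deficit [1] with regular exercise [2].",
    follow_ups=["How should I plan my diet?", "Which exercises suit beginners?"],
)
print(result.render())
```

Keeping the numbered sources next to the summary is what lets a reader check each claim against the page it came from.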

efd8ca070b1b4f902e69a3a0b79fcaeb.png

Moreover, because it understands context and semantics, Tiangong AI Search can keep helping users uncover their real search intent and solve their actual problems, enabling in-depth research on complex issues through AI summaries plus multiple rounds of dialogue.

For example, suppose we need to lose weight but know nothing about it. If we enter "how to lose ten pounds in 10 days" in a traditional search engine, we are confronted with a mass of links.

a78bfb090dcdf9293949b7312a4783e9.png

Tiangong AI Search, by contrast, returns not isolated links but a coherent, connected set of results.

The advantage is that it avoids the "jumble of links thrown in your face" typical of traditional search, and there is no need to worry about information overload. We take in information much faster and more completely.

This is because Tiangong AI Search puts the links in context and organizes and presents them coherently, so that users can grasp the main points at a glance.

30e98f695693c00e84d7c591b8897755.png

Based on the above answers, we can naturally ask further questions about the two core methods of exercise and diet.

a281dd8a6e74e9c0a42b6e206545c4d1.png

Based on the questions it suggests, we can then ask: how should we plan our diet?

d72220065f5c6e7fa5fed37c2babee02.png

The answer is quite reliable: there are no AI-invented "machine dishes", and the constraint of using only "beef and chicken" is fully respected.

Finally, we can ask it to compile a supermarket shopping list based on the recipes.

b0ad0efa945b8ab544b53f28ca0d5c52.png

In just 10 minutes, we got a complete and highly actionable weight loss program.

A fitness novice has thus extracted, from the vast amount of "information" on the Internet, the "knowledge" that can actually guide weight loss!

In summary, a traditional search engine ranks the links its algorithms retrieve by their relevance to the query and presents them from highest to lowest. The only logic connecting the links is this abstract relevance ranking.

Tiangong AI Search is an intelligent version of traditional search. It uses a large language model to summarize the content of each link and the likely logical connections between them, helping users reach relevant, helpful answers faster and more easily.

By the way, the editor also put the same question to Bing Chat, but its answer was very perfunctory.

It merely listed a few combinations of foods and completely ignored the requirement that the menu not repeat over the 5 days.

f886e198573b447dee29f8274b5de024.png

At the same time, Tiangong AI Search uses a large model to identify and filter out advertising pages, which addresses the ubiquitous ads in traditional search engines.

This ensures that users get clean, high-quality search results and need not worry about being misled by advertisements.

649076e9c0f938346ce8d4ace2276374.jpeg

Next, let us carefully dissect several unique "superpowers" of Tiangong AI search.

Unlimited follow-up questions, each building on the last

Of all these abilities, the one that impressed the editor most is undoubtedly "unlimited follow-up questioning".

With traditional search, digging deeper into a topic means starting a new search from scratch and also working out which old keywords to keep while adding new ones, so that the search engine does not drift off topic.

In addition, to get back to the answers from earlier queries easily, we have to keep multiple browser tabs open, which is extremely cumbersome.

210d1f678aa21de6385ddc0227c51bd0.png

Tiangong AI Search, however, supports in-depth exploration over more than 20 rounds of interaction, moving us step by step toward the final answer.

As an example, take the homemade algorithm-engineer interview question that the Tiangong model solved with ease when it was first released.

6305d5e04ae947a68134658f90759ad2.png

After some "learning on the Internet", Tiangong AI Search quickly provided a Python implementation based on the bisection method.

674f8d851259b68211e5a8bd16ef4bf6.png

Immediately afterwards, Tiangong AI Search suggested three follow-up questions based on the question itself and the method it had used.

Obviously, if we want to learn more about the solution to this question, just click on question one.

0181da9debe4974a7b5f52ae50fae838.png

In response, Tiangong AI Search analyzed the "bisection method" it had used earlier and also offered two new approaches, the "interpolation method" and the "Fibonacci method".

039d3a8ff6fdcc2140e191674a78e6eb.png

If you want to learn more about the implementation of the interpolation method, just tell Tiangong AI Search directly, without repeating the previous questions at all.

c60fa3839bb759eb7c0a9f744a303939.png

While using the follow-up questions in Tiangong AI Search, the editor suddenly found the process strangely familiar.

a429e98a06c86bfdc2dbadd1bbc8bd0d.png

Ancient Greek philosophy was a system of deep inquiry and rigorous logic. By probing everything down to its roots, humans explored the origin of the universe through persistent questioning.

Whether we are learning a new subject in the abstract or writing a specific academic paper, Tiangong AI Search looks set to be of great help in broadening our thinking and developing our reasoning.

Traceable sources, reliable answers

Through constant follow-up questions, Tiangong AI Search helps resolve our doubts. But how do we confirm that the answers are correct?

A major pain point of traditional search is that information from different sources is jumbled together. Large models, for their part, cannot avoid "talking nonsense with a straight face" because of the way they generate text.

a1fa0de702151b2c2891591b04895f4f.png

This is where another major feature of Tiangong AI Search comes in: every answer comes with an index of its sources, so we can verify the information.

As a result, anyone can examine the accuracy of the answer, thereby ensuring that the answer is traceable, verifiable, and reliable.

For example, ask Tiangong AI Search: what are the development prospects for large language models?

Tiangong AI Search gave 4 development trends and listed 6 information sources above the answer, drawn from a range of media such as Zhihu.

ad9839b6b29aee389b6da18cebbf76a5.png

If you are unsure about the second point, you can follow the cited source and read the full text to learn more.

661d7a602e068cb6e60b49a22dd1eff8.png

Or if you have questions about point 3, please refer to the sixth link.

71210b1e4443f21a14e135c2e3ab76be.png

In addition, the search results of every round are kept in the history, where they are easy to look up at any time and can even be shared with others in one click.

ed1090b2c88e61aabd537621b8d5c91f.png

Tailor-made, a different face for every user

Large models make intent recognition possible, and a Tiangong AI Search that "understands people better" gives us more accurate and more personalized answers.

For the first time, search offers a "tailor-made" experience with "a different face for every user".

The editor asked two separate questions with different starting body weights, each time asking Tiangong AI Search to draw up a weight-loss and fitness plan.

a15d428e068a07389ae7982e17744b6d.jpeg

In its answers and follow-up questions, Tiangong AI Search specifically stressed exercise safety for heavier people trying to lose weight and reminded users to avoid sports injuries.

In the questioning and answering, the method of "low-impact aerobic exercise" was even specifically proposed to prevent damage to the body during the weight loss process.

ae35a494aeb58f7f929330bb05f16f96.jpeg

In another question, when we set the weight at 80 kg, the answers and questions given by Tiangong AI Search did not involve avoiding sports injuries, but emphasized the effect of exercise and the development of exercise habits.

abba256b640e5ef225d8483f3c683e90.jpeg

The suggested follow-up questions, and the answers obtained by asking them, also differ markedly from those in the 150 kg scenario.

Both are questions about fitness and weight loss, yet the more details users give Tiangong AI Search, the more customized the search results and replies they receive.

dc53201c6fce6ca542aa3c8658ce252f.jpeg

This "tailor-made" search experience, with "a different face for every user", rests on intent recognition, user feedback handling, and context awareness, all working within a search environment that supports multiple rounds of dialogue.

This kind of experience does not belong to the same era as the traditional search that only relies on keyword matching!

Real-time information, avoiding hallucinations

Beyond search engines, Tiangong AI Search also compares well with conventional large language models. Even when those models are given a web-browsing plug-in, the information Tiangong AI Search retrieves is more up to date, and the answers built on it are more complete.

For example, room temperature superconductivity has been a hot topic recently. We can use several search tools to look up and follow the latest papers.

The links given by Tiangong AI Search include papers on arXiv, Zhihu discussions, and news reports, bringing together the latest developments from multiple channels.

Moreover, the generated answer not only introduces the content of each paper but also, at a higher level, characterizes the superconductivity affair as one of "disagreement and controversy".

What stands out even more is that the papers it cites include the key copper sulfide paper from the Chinese Academy of Sciences, one of the most important pieces of evidence for outside observers assessing the latest developments in the superconductivity affair.

99d53bd2fb84c90ef541ad6d4f1dea03.jpeg

Next, it is GPT-4's turn.

5ef063fada905c8b7a9f11460413f8bc.jpeg

With the support of a web-browsing plug-in, it also returns 3 papers, each with an abstract.

However, all three papers were published earlier and support the claim that "LK-99 is a room temperature superconductor", so they do not objectively reflect the latest state of the LK-99 affair as a whole.

Clearly, compared with GPT-4, the results from Tiangong AI Search are more comprehensive and more timely, and they better reconstruct the full picture of the affair.

In search today, whoever holds the advantage in timeliness can give users the most accurate information. On timeliness, GPT-4 plus a web-browsing plug-in still lags somewhat behind Tiangong AI Search.

29ac1c8fa8426483dee9c79f0b1fa51c.jpeg

In addition, Tiangong AI Search uses links to trace the source of its information, which greatly reduces LLM "hallucinations".

The editor casually asked GPT-4 about a story from Chinese history. Probably because GPT-4's training data does not include "Zi Zhi Tong Jian", it really did start making things up.

521b395b513dc6ec9205c5ce67a9ac8a.png

Tiangong AI Search, by contrast, traces sources through links and has web access built in, which largely eliminates the possibility of "hallucination".

c1cc0efef5aac6a55bbc4d77bec83fd8.png

Even GPT-4, which had hallucinated before, finds the correct answer immediately once it is equipped with a web-browsing plug-in.

4d045b4939b38365120346958180841a.png

Clearly, the AI + search framework is the finishing blow to large-model "hallucinations"!

02

Decoding the technology behind it

So what technology lies behind all this and powers Tiangong AI Search?

The core is still a large model.

On April 17, Kunlun Wanwei released "Tiangong", its first large language model at the 100-billion-parameter scale.

It has shown remarkable ability in copywriting, knowledge Q&A, code generation, logical inference, mathematical calculation, and other areas. After several technical iterations, "Tiangong" has reached or even surpassed industry standards in many dimensions.

Technically, "Tiangong" is deployed on a leading GPU cluster in China and integrates a 100-billion-parameter pre-trained base model and a 100-billion-parameter RLHF model, a textbook case of "scale works miracles".

The model also introduces the Monte Carlo tree search algorithm, which makes its output more human-like. This is the same algorithm that the famous AlphaGo relied on.

aef7cad97561bce8d8af57232c93ced7.png

It is worth mentioning that the Tiangong team cleaned and filtered 3 trillion words of training data out of tens of trillions, giving the large model excellent handling of Chinese context, vocabulary, and grammar.

It is these technical breakthroughs and unique advantages of the "Tiangong" large model that so greatly expand the capabilities of Tiangong AI Search.

- Query intent recognition and understanding with the large model

Before the search runs, the large model rewrites the user's query, which lets it dig into the user's real intent and quickly capture the context.

Compared with traditional search, it can provide more accurate search results and even greatly simplify operations.

For query rewriting, the large model reorganizes, adjusts or replaces the query to make it more accurate, concise and easy to understand.

As for intent recognition, its main task is to identify the intent or purpose behind user queries in order to better understand user needs and provide them with accurate answers or suggestions.

- Follow-up questioning technique

In Tiangong AI Search, the most distinctive and human-centered design feature is its "follow-up questioning" ability.

Its purpose is to accurately capture user intent and provide the most relevant search results.

The core of this technology is to understand the user's query and ask the user follow-up questions to obtain more information.

In principle, the implementation proceeds as follows: intent recognition; information completeness detection; question generation; user feedback reception; dynamic adjustment and learning; context awareness.

In addition, unlimited follow-up questioning requires a large amount of training data, along with continuous iteration and optimization to keep up with users' changing needs.
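
The steps listed above can be sketched as a tiny rule-based loop. This is a hedged illustration only: the article does not describe how Tiangong implements any step, so the keyword rule, the "slots" (`weight_kg`, `timeframe_days`), and the question templates below are all hypothetical stand-ins for what would be model-driven components, and the "dynamic adjustment and learning" step is left out entirely.

```python
import re
from typing import Dict, List, Optional

# Hypothetical details ("slots") each intent needs before a good answer is possible.
REQUIRED_SLOTS: Dict[str, List[str]] = {
    "weight_loss_plan": ["weight_kg", "timeframe_days"],
}
# Hypothetical question templates, one per slot.
QUESTIONS: Dict[str, str] = {
    "weight_kg": "What is your current weight in kg?",
    "timeframe_days": "Over how many days do you want to reach your goal?",
}


def recognize_intent(text: str) -> Optional[str]:
    """Intent recognition: a keyword rule standing in for a model."""
    lowered = text.lower()
    if "lose weight" in lowered or "weight loss" in lowered:
        return "weight_loss_plan"
    return None


def extract_slots(text: str) -> Dict[str, int]:
    """Pull any details the user has already supplied out of the message."""
    slots: Dict[str, int] = {}
    weight = re.search(r"(\d+)\s*kg", text.lower())
    days = re.search(r"(\d+)\s*days?", text.lower())
    if weight:
        slots["weight_kg"] = int(weight.group(1))
    if days:
        slots["timeframe_days"] = int(days.group(1))
    return slots


class Conversation:
    """Keeps context across rounds so earlier details need not be repeated."""

    def __init__(self) -> None:
        self.intent: Optional[str] = None
        self.context: Dict[str, int] = {}

    def receive(self, text: str) -> List[str]:
        """Take user input (or feedback) and return the questions still open."""
        if self.intent is None:
            self.intent = recognize_intent(text)
        self.context.update(extract_slots(text))                # context awareness
        needed = REQUIRED_SLOTS.get(self.intent or "", [])
        missing = [s for s in needed if s not in self.context]  # completeness check
        return [QUESTIONS[s] for s in missing]                  # question generation


conv = Conversation()
first = conv.receive("Help me lose weight in 30 days")
second = conv.receive("I weigh 80 kg")
print(first)   # one open question, about weight
print(second)  # nothing left to ask
```

The second message never repeats the goal or the timeframe; the conversation object carries them forward, which is the behavior the multi-round examples earlier in the article rely on.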

- Intelligent information summarization and application of retrieval-based large model technology

To deal with the challenge of answering open-ended questions, "Tiangong" adopts Dense Passage Retrieval (DPR) technology.

DPR has natural advantages in handling "long documents" and "complex questions", and gives excellent retrieval results.

5c35cf28ed90aa83af455af517100696.png

To suit different application scenarios, DPR can be implemented in two main ways, each with its own advantages:

1. single-vector: Both the question and the document are each encoded into a single vector.

2. multi-vector: The document is encoded into multiple vectors, while the question is still represented by a single vector.

The first method is valued for its compact storage and fast retrieval, but may be slightly less effective in some scenarios. The multi-vector method needs more storage space, but its retrieval accuracy is usually better.
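
The trade-off between the two variants can be seen in a toy example. The sketch below is not Tiangong's code: the 2-D vectors are made up, mean pooling is just one assumed way to get a single document vector, and taking the maximum over a document's vectors is one common way to score the multi-vector case, assumed here for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Collapse several passage vectors into one (assumed pooling choice)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def score_single(query: Vector, doc_vector: Vector) -> float:
    """Single-vector: one vector per document, one dot product per document."""
    return dot(query, doc_vector)


def score_multi(query: Vector, doc_vectors: List[Vector]) -> float:
    """Multi-vector: several vectors per document; keep the best match."""
    return max(dot(query, v) for v in doc_vectors)


# Toy 2-D "embeddings"; real encoders output hundreds of dimensions.
docs_multi: Dict[str, List[Vector]] = {
    "A": [[1.0, 0.0], [0.0, 1.0]],  # long document covering two distinct topics
    "B": [[0.6, 0.6]],              # short document, loosely about both
}
docs_single = {name: mean_vector(vs) for name, vs in docs_multi.items()}

query = [0.0, 1.0]  # asks about the second topic only

rank_single = sorted(docs_single, key=lambda d: score_single(query, docs_single[d]), reverse=True)
rank_multi = sorted(docs_multi, key=lambda d: score_multi(query, docs_multi[d]), reverse=True)
print(rank_single, rank_multi)
```

With one vector per document, the long document A is blurred into an average and loses to B. With multiple vectors, A's matching passage wins, at the cost of storing two vectors for A instead of one.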

- Vector Semantic Retrieval

Here, Kunlun Wanwei has also built a large-scale real-time vector retrieval system, which contributes at multiple stages of search, such as locating content precisely, increasing content diversity, and keeping context coherent.

a861163974cf31e4252c001f430dc5d9.png

- Cross-language retrieval and information integration

By adopting cutting-edge cross-language information retrieval (CLIR) technology, Tiangong AI Search can also search deep into English knowledge bases and academic literature, even when we ask our questions in Chinese.

For example, ask "What is the Transformer architecture?"

In the reference content of Tiangong AI search, links to 2 foreign articles are given.

4ee731753be8fc872d45d030d866d55b.png

Behind this is the excellent cross-language comprehension of the "Tiangong" model, which expands the boundaries of searchable knowledge and lets us learn about global information and research results as soon as they appear.

So how are cross-language retrieval and information integration achieved? The steps are as follows:

Query translation; retrieval and ranking; document translation (if needed); information integration; feedback and optimization; deep learning and representation learning.

This complete process requires the integration of multiple AI capabilities, including machine translation, information retrieval, data fusion and deep learning. In addition, a large amount of bilingual data, user interaction logs, and high-quality document data also improve the efficiency of CLIR.
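
The first two steps, query translation followed by retrieval and ranking, can be sketched in a few lines. Everything here is a stand-in chosen for illustration: a two-entry glossary replaces a real machine translation model, word overlap replaces a real ranking function, and the corpus ids and texts are invented. Document translation and information integration are omitted.

```python
from typing import Dict, List, Tuple

# Toy Chinese-to-English glossary standing in for a machine translation model.
GLOSSARY: Dict[str, str] = {"架构": "architecture", "是什么": "what is"}

# Toy English corpus; the ids and texts are made up for illustration.
CORPUS: Dict[str, str] = {
    "en-1": "The Transformer architecture relies on self-attention",
    "en-2": "Convolutional networks for image classification",
}


def translate_query(query: str) -> str:
    """Step 1, query translation: replace known source-language terms."""
    for source_term, target_term in GLOSSARY.items():
        query = query.replace(source_term, f" {target_term} ")
    return query


def tokenize(text: str) -> set:
    return set(text.lower().split())


def retrieve_and_rank(query_en: str, corpus: Dict[str, str]) -> List[Tuple[str, int]]:
    """Step 2, retrieval and ranking: score documents by shared words."""
    query_tokens = tokenize(query_en)
    scored = [(doc_id, len(query_tokens & tokenize(text))) for doc_id, text in corpus.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def cross_language_search(query: str, corpus: Dict[str, str]) -> List[Tuple[str, int]]:
    """Translate the query, then search the English corpus with it."""
    return retrieve_and_rank(translate_query(query), corpus)


hits = cross_language_search("Transformer 架构是什么", CORPUS)
print(hits)
```

The Chinese query reaches the English document about the Transformer architecture only because it is translated before retrieval, which is the point of the query translation step.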

From the above, we have seen the relationship and evolution between the "Tiangong" large model and AI search.

03

Reinventing search with big models

Today, the unprecedented boom in large language models such as GPT-4 has given a boost to all kinds of applications, and search is no exception.

AI search is an innovative form that combines large models with search technology.

After ChatGPT was born, some in the industry argued that search giants such as Google and Bing would be disrupted.

f557e27a78fe779afb9d7ed2cc77ee60.png

As a high-frequency entry point for obtaining information, search is bound to become a core application scenario for large models and to truly unlock the huge productivity they contain.

In fact, looking abroad, some technology companies have already used large models to enhance search and give users a better experience.

Microsoft was the first to integrate the GPT-4 model into New Bing, greatly upgrading Bing's search capabilities and giving everyone an intelligent AI boost.

At the Google I/O conference, Pichai announced the disruptive Search Generative Experience (SGE), which provides summarized answers to questions along with cards that show the source articles.

The new AI search engine, driven by PaLM 2, has directly changed the underlying logic of Google search.

In addition, there are DuckDuckGo, You.com, Perplexity.ai, all of which integrate large models into search.

In China, meanwhile, Baidu, 360, and others have made breakthroughs in applying large models and were also among the first to bring large model capabilities to search.

As a world-leading Internet company, Kunlun Wanwei is likewise putting this into practice, so that large models can better assist search.

Back in 2020, this forward-looking technology company began investing in AIGC and large models.

Over the past three years, Kunlun Wanwei has released Kunlun Tiangong, a full series of algorithms and models in the AIGC field, along with various generative AI tools, and has open sourced a number of projects.

With the help of large models, Tiangong AI Search can extend the boundaries of "search connects everything" and reshape the form and experience of search.

Tiangong AI Search, as the first AI search product put into application in China, is an important milestone in Kunlun Wanwei's continued deep investment in AI.

The future has come, and Tiangong AI Search will become everyone's productivity assistant.


Origin blog.csdn.net/weixin_44383880/article/details/132505120