Understand in five minutes: What is the principle behind the GPT model? Why can it generate meaningful text? Why can't it solve simple math problems? Why are some people worried that it may harm humans?

0. Introduction

There is a great deal to learn about the GPT model, so I plan to study it in more depth and apply it to my work, life, and study to improve my efficiency, quality of life, and learning results.

Following first-principles thinking, I believe that before starting hands-on practice it is necessary to understand the principle behind the GPT model, so that I neither worship it blindly nor underestimate it out of ignorance, and can apply it with a more rational attitude.

I previously read an article that explains the principle of ChatGPT: "What Is ChatGPT Doing ... and Why Does It Work?" The full text runs to more than 30,000 words with more than 100 figures, and it was published as a book on March 9, 2023.

I looked up the author of the original article and found that he is a remarkable figure: Stephen Wolfram, the creator of the mathematical software Mathematica. He is also a well-known complexity scientist who has studied neural networks for more than 40 years, and he created the Wolfram Language.

Drawing on Wolfram's article, the Google team's paper, ChatGPT's own answers, and Wan Weigang's AI frontier course, I will set aside some technical details and, based on my own understanding, try to explain the principle behind the GPT model in plain language.

1. Why can the GPT model generate meaningful text?

In essence, the GPT model produces a "reasonable continuation" of a text, based on a large amount of language data. Its core is a large language model (LLM).

Put simply, the GPT model works a bit like the game of word chain, in which each player adds a word to what came before.

For example, suppose we use CSDN's "wet writing" articles as the "learning material" to train the GPT model. When the word "I" is fed to it, it may generate the word "am". "I" and "am" are combined into "I am", and based on the probabilities of the words that tend to follow, the next word "writing" may be generated, giving "I am writing". Repeating this process produces a meaningful piece of text, such as "I am writing wet".

This process is called "autoregressive generation". The model behind it is an unsupervised natural language processing (NLP) model (more precisely, a self-supervised one: the training signal is simply the next word of the text itself). It is a bit like a smart input method, which predicts the words the user is likely to type next from what has already been typed, helping the user type faster.
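
To make the word-chain idea concrete, here is a minimal sketch in Python. It is not how GPT is actually implemented: it assumes a tiny made-up corpus and only looks at the previous word (a so-called bigram model), whereas a real model looks at the whole preceding text. The names `CORPUS`, `train_bigrams`, and `generate` are my own, chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

# Toy "learning material"; a real model trains on vastly more text.
CORPUS = "i am writing . i am reading . i am writing code ."

def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, max_words=5, seed=0):
    """Autoregressive loop: sample the next word, append it, and feed the
    extended text back in. This toy version conditions on the last word only."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing ever followed this word in the corpus
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        words.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(words)

counts = train_bigrams(CORPUS)
print(generate(counts, "i"))
```

In this toy corpus "i" is always followed by "am", so every generated sentence starts with "i am"; what comes after depends on the random seed.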

If the GPT model always picks the word with the highest probability, it usually produces a very "ordinary" answer (sometimes even a cookie-cutter one).

When it instead randomly picks words with somewhat lower probability, it can produce "more interesting" answers (which sometimes even feel creative).

So the GPT model does not give the same answer every time, which makes it feel smarter.
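
The difference between "always pick the most likely word" and "sometimes pick a less likely word" can be sketched as follows. The probabilities in `NEXT_WORD_PROBS` are invented for illustration, and the `temperature` parameter is one common way to control randomness; I am assuming it here as an example, not claiming it is exactly what ChatGPT does internally.

```python
import math
import random

# Hypothetical next-word probabilities after "I am" (made up for illustration).
NEXT_WORD_PROBS = {"writing": 0.6, "reading": 0.3, "dancing": 0.1}

def pick_greedy(probs):
    """Always return the single most probable word."""
    return max(probs, key=probs.get)

def pick_with_temperature(probs, temperature, rng):
    """Rescale the probabilities by the temperature, then sample one word.
    A higher temperature flattens the distribution, so rarer words show up more."""
    scaled = {w: math.exp(math.log(p) / temperature) for w, p in probs.items()}
    total = sum(scaled.values())
    words = list(scaled)
    weights = [scaled[w] / total for w in words]
    return rng.choices(words, weights=weights)[0]

rng = random.Random(42)
print(pick_greedy(NEXT_WORD_PROBS))  # always "writing"
print([pick_with_temperature(NEXT_WORD_PROBS, 1.5, rng) for _ in range(10)])
```

With a very low temperature the sampler behaves almost exactly like the greedy picker; raising it trades predictability for variety.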

In fact, it currently has no consciousness of its own. Early versions of the GPT model were even like a parrot repeating words, with no understanding of what they were saying.

Under the hood, the GPT model is a huge neural network built on the Transformer architecture proposed by the Google team. Its prominent features are big data, big models, and big compute.

To put it bluntly, it is a case of "brute force works miracles": raw computation at enormous scale.

After pre-training on a large amount of data with a large amount of computation, the GPT model demonstrates remarkable language understanding and generation capabilities, and it can selectively attend to the key points of the preceding text, forming a chain-of-thought reasoning ability.
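
The "selective remembering" above corresponds to the attention mechanism of the Transformer. Below is a very rough sketch of the core calculation (scaled dot-product scores followed by a softmax) for a single query. The two-dimensional vectors are made up by me; in a real model they are learned, are much larger, and are computed for many attention heads in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product score between one query and every earlier word."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Made-up 2-dimensional vectors for three earlier words (illustrative only).
words = ["the", "cat", "sat"]
keys = [[0.1, 0.0], [1.0, 0.9], [0.2, 0.8]]
query = [1.0, 1.0]  # what the current position is "looking for"

for word, weight in zip(words, attention_weights(query, keys)):
    print(f"{word}: {weight:.2f}")
```

The word whose key vector best matches the query receives the largest weight, which is the sense in which the model "pays more attention" to it.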

As a result, the GPT model can "understand" human intentions, hold effective multi-turn conversations, answer questions intelligently, imitate the writing style of well-known authors, and even compose poems, producing content that is complete, focused, well summarized, logical, and organized.

2. Why can't the GPT model do simple math problems?

Although the GPT model has strong language capabilities, it is not very good at mathematical problems.

For example, I typed in a few numbers and asked ChatGPT to solve a simple arithmetic problem:

123123 * 2080 + 321321 * 8020 = ?

ChatGPT confidently gave a wrong answer, 2832402360, in which several of the middle digits are wrong. The correct answer is 2833090260.
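
The correct value is easy to check with an ordinary program. The short Python snippet below recomputes it and shows how far off ChatGPT's answer was; the variable names are my own.

```python
# Python integers have arbitrary precision, so this result is exact.
correct = 123123 * 2080 + 321321 * 8020
chatgpt_answer = 2832402360  # the answer ChatGPT gave in my test

print(correct)                   # 2833090260
print(correct - chatgpt_answer)  # 687900
```

The first and last digits of ChatGPT's answer match, which is typical of a model that predicts plausible-looking digits rather than actually carrying out the multiplication.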

Why does GPT, with its strong reasoning ability, get even such a simple calculation wrong?

The fundamental reason is that GPT is a large language model. Its way of thinking is very similar to that of the human brain, and the human brain is not good at this kind of calculation either unless it has the help of tools such as calculators.

Therefore, GPT is actually more like the human brain than a general computer program.

It is estimated that the human brain has about 100 billion neurons, and the number of parameters in GPT-4, although not officially disclosed, is believed to far exceed 100 billion. When a model becomes large enough, abilities that it did not have before suddenly appear. It is like ants: when there are enough of them, they suddenly show some kind of organizational ability.

3. Why are some people worried that the GPT model may harm humans?

Although the GPT model is not yet good at solving some mathematical problems, it can be given appropriate plug-ins. When it runs into an area it is not good at, it can then draw on multiple thinking models and call other models or tools to solve the problem.

For example, when combined with Wolfram, it can easily solve some mathematical problems. This is much like giving a person a calculator: arithmetic ability improves significantly.
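
The plug-in idea can be sketched as a simple router: if the question is pure arithmetic, hand it to a calculator; otherwise let the language model answer. Everything below is hypothetical and only for illustration. `fake_language_model` is a stand-in for a real model call, `calculator` is a tiny evaluator I wrote myself, and none of this is the actual ChatGPT plug-in interface or the Wolfram API.

```python
import ast
import operator
import re

# Binary operators supported by the tiny calculator "plug-in".
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression):
    """Evaluate +, -, *, / on numeric literals without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

def fake_language_model(prompt):
    """Hypothetical stand-in for a real model call."""
    return f"(model-generated text for: {prompt})"

def answer(question):
    """Route pure arithmetic to the calculator, everything else to the model."""
    if re.fullmatch(r"[\d\s+\-*/().]+", question):
        return str(calculator(question))
    return fake_language_model(question)

print(answer("123123*2080+321321*8020"))  # 2833090260
print(answer("Write a short poem about spring"))
```

In a real system the model itself decides when to call a tool, but the division of labor is the same: the language model handles language, and the exact computation is delegated to a program that is good at it.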

Because the GPT model is an unsupervised algorithm, it behaves like a black box and often produces unpredictable results for reasons that nobody can pin down. So people inevitably worry: will it do things that harm humans?

Judging from historical experience, science and technology are a double-edged sword. Used well, they can benefit mankind; used badly, they may bring disaster.

In 1905, Einstein proposed the mass-energy equation, which revealed the relationship between mass and energy—even small changes in mass can produce huge amounts of energy.

The atomic bomb is based on the mass-energy equation. Einstein once warned the United States that Germany was conducting atomic research, and that if Germany succeeded in developing an atomic bomb, it would pose a serious threat to the world.

In 1945, after successfully developing the atomic bomb, the United States dropped two of them on Japan, causing more than 200,000 deaths. The explosions released a large amount of nuclear radiation, which had long-term negative effects on people, caused cancer and other health problems, and seriously damaged the ecological environment. The enormous losses posed a serious threat to human security.

Therefore, many people have been calling for the prohibition of the use and development of nuclear weapons to avoid major disasters.

At the end of March 2023, the Future of Life Institute in the United States issued an open letter calling for a pause of at least 6 months on the training of artificial intelligence systems more powerful than GPT-4, lest GPT become too powerful and bring unknown dangers to humans.

One of the founders of the Future of Life Institute is the well-known artificial intelligence researcher Max Tegmark, who is also the author of the book "Life 3.0".

As for what kind of impact GPT will have on human beings in the future, whether it will benefit more or cause more harm, I am afraid that no one knows exactly yet.

I personally feel that GPT is not yet strong enough to threaten human survival. Still, using the "six thinking hats" model to look at the problem from different angles, consider potential risks in advance, and take corresponding precautions is not a bad thing for us.

We should also learn to use critical thinking. Although the GPT model can help us refine knowledge, summarize experience and guide methods, we still need to make judgments and decisions by ourselves, avoid obvious logical errors, and be responsible for the final result.

4. Summary

Finally, let me tell a story related to the GPT model.

It is said that in 2021 there was a man in the United States named Joshua whose partner, Jessica, had died of illness, leaving him heartbroken. By chance, he uploaded all the chat records between himself and Jessica to the GPT-3 model.

After that, he chatted with GPT-3 whenever he was free, and something magical happened. He felt that Jessica was on the other side of the computer screen, because so many details of the conversation resembled her closely.

During these chats, Joshua often burst into tears, fell asleep when he was tired from crying, and continued chatting when he woke up. In the end, Joshua was healed. He was no longer trapped in his grief as before. Finally, he said: AI resurrected my wife, but I decided to say goodbye to her.

This story inspired me a lot. I think I should keep up the habit of recording, write more reviews and summaries, and properly preserve records, photos, voice recordings, and so on. Maybe in the future I can use the GPT model to chat with my past self.

The more data you record, the more accurate the GPT model will be, and the more real it feels to chat with it. In the future, it may become a kind of emotional sustenance, helping you soothe your emotions, heal your soul, and realize growth empowered by data.

It is said that someone abroad fed their diaries into the GPT model to train a "childhood self", then asked it questions and talked with it, which helped them sort out their inner thoughts and effectively solve the problems they encountered.

The principle of the GPT model is actually relatively simple, but only when the data reaches a certain scale does quantitative change turn into qualitative change. It is like the 10,000-hour rule proposed by psychologists: at least 10,000 hours of deliberate practice are required to reach a professional level in a field.

In the end, I believe that if the GPT model is used properly, it will help us better realize our potential and creativity.

Origin blog.csdn.net/qq_32727095/article/details/130078348