Keywords: pre-trained model, encoder-decoder, self-attention, AdamW, supervisory signal, deep learning, NLP

Author: Zen and the Art of Computer Programming

1 Introduction and Background

Natural Language Processing (NLP) is an important branch of artificial intelligence, alongside fields such as machine learning and computer vision. With the growth of the Internet, more and more application scenarios require the ability to understand and process human language. Deep learning has therefore come to play an increasingly prominent role in NLP, mainly across the following two groups of tasks:

  1. Text classification, sentiment analysis, text generation, dialogue systems, and search engines
  2. Named entity recognition, relation extraction, event extraction, text summarization, machine translation, and question answering

In this context, a number of technical breakthroughs have emerged to help deep learning models achieve better results on these tasks, such as pre-trained models, encoder-decoder architectures, self-attention mechanisms, and the AdamW optimizer. This article introduces the principles and implementation of each of these techniques in detail.

2 Explanation of Basic Concepts and Terms

First, we need to understand the basic concepts and terms used in NLP. The NLP tasks discussed in this article can all be cast as sequence labeling problems. In general, a sequence labeling problem consists of an input sequence X and an output sequence Y, where each element is a token or a label. For example, in a sentence-level task, X is the input sentence and Y is the part-of-speech tag of each word in the sentence; in a document-level task, X is a piece of text and Y is the label assigned to each sentence in the document. A sequence labeling problem requires learning the mapping from input sequences to output sequences.
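
To make the correspondence between X and Y concrete, below is a minimal Python sketch of a sentence-level sequence labeling pair; the sentence and its part-of-speech tags are chosen purely for illustration:

    # Sentence-level sequence labeling: the input sequence X is the
    # tokenized sentence, and the output sequence Y assigns one
    # part-of-speech tag to each token (tags here are illustrative).
    X = ["the", "cat", "jumps", "over", "the", "fence"]
    Y = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN"]

    # A sequence labeling model learns the mapping X -> Y; the two
    # sequences must stay aligned, one token to one label.
    assert len(X) == len(Y)
    for token, tag in zip(X, Y):
        print(f"{token}\t{tag}")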

Here, we also need to clarify the following basic terms:

  1. Tokenization: splitting a piece of text into a sequence of tokens, each consisting of one or more characters. For example, tokenizing an English sentence may yield a token list such as ["the", "cat", "jumps", "over"]. A minimal sketch follows below.
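
As a minimal sketch of the idea, the snippet below tokenizes English text with a simple regular expression from Python's standard library; production systems usually rely on more sophisticated (e.g. subword) tokenizers, but the interface is the same: text in, token list out.

    import re

    def tokenize(text):
        # Lowercase the text, then emit runs of word characters as tokens;
        # any remaining non-space character (punctuation) becomes its own
        # token. This is a deliberately simple, illustrative scheme.
        return re.findall(r"\w+|[^\w\s]", text.lower())

    print(tokenize("The cat jumps over."))
    # -> ['the', 'cat', 'jumps', 'over', '.']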
