
Simple BERT Models for Relation Extraction and Semantic Role Labeling

1 Paper motivation

  • Proposes BERT-based models for relation extraction (Relation Extraction) and semantic role labeling (Semantic Role Labeling)
  • The models require no lexical or syntactic features, yet reach (near-)SOTA performance, providing a baseline for follow-up research

2 Model Introduction

2.1 Relation extraction model

Schematic of the relation extraction model:

The input sequence is constructed as: [[CLS] sentence [SEP] subject [SEP] object [SEP]]

To prevent overfitting, the subject and object entity mentions in the sentence are replaced with special mask tokens; for example, [S-PER] denotes a subject entity of type person. The masked sentence is tokenized with WordPiece and fed into the BERT encoder.
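As a concrete illustration, here is a minimal sketch (not the authors' released code) of how such an input could be built with the Hugging Face tokenizer; the mask tokens [S-PER] and [O-LOC] and the example sentence are assumptions for illustration:

```python
from transformers import BertTokenizer

# Minimal sketch of the input construction described above.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
# Register the entity-mask tokens so WordPiece does not split them.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[S-PER]", "[O-LOC]"]}
)

# Original sentence: "Barack Obama was born in Hawaii ."
# The subject (PER) and object (LOC) mentions are replaced by mask tokens.
masked_sentence = "[S-PER] was born in [O-LOC] ."
subject, obj = "Barack Obama", "Hawaii"

# Build "[CLS] sentence [SEP] subject [SEP] object [SEP]" as WordPiece tokens.
tokens = (
    ["[CLS]"] + tokenizer.tokenize(masked_sentence) + ["[SEP]"]
    + tokenizer.tokenize(subject) + ["[SEP]"]
    + tokenizer.tokenize(obj) + ["[SEP]"]
)
input_ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
```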

Let $h = [h_1, \dots, h_n]$ denote the vectors produced by BERT for the tokens of [[CLS] sentence [SEP]]. Note that $n$ is not necessarily the length of the sentence, since a word may be split into several sub-word tokens.

Use $h_{\mathrm{subj}}$ to denote the vector representation of the subject entity.

Use $h_{\mathrm{obj}}$ to denote the vector representation of the object entity.

Define the position sequence relative to the subject entity as $[p^{s}_{1}, \dots, p^{s}_{n}]$, where:

$$
p^{s}_{i} =
\begin{cases}
i - s_{1} & i < s_{1} \\
0 & s_{1} \le i \le s_{2} \\
i - s_{2} & i > s_{2}
\end{cases}
$$

Here $s_{1}$ and $s_{2}$ are the start and end positions of the subject entity, and $p^{s}_{i}$ is the position of token $i$ relative to the subject entity.

Likewise, the position sequence relative to the object entity is $[p^{o}_{1}, \dots, p^{o}_{n}]$.
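The piecewise definition above can be written as a short helper; this is a sketch under the assumption that entity spans are given as inclusive token indices:

```python
def relative_positions(n_tokens, start, end):
    """Position of each token relative to an entity spanning tokens
    [start, end] (inclusive), following the piecewise definition above."""
    positions = []
    for i in range(n_tokens):
        if i < start:
            positions.append(i - start)   # tokens before the entity: negative offset
        elif i <= end:
            positions.append(0)           # tokens inside the entity span
        else:
            positions.append(i - end)     # tokens after the entity: positive offset
    return positions

# Entity occupying tokens 2..3 in a 7-token sequence
# -> [-2, -1, 0, 0, 1, 2, 3]
p_subj = relative_positions(7, 2, 3)
```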

The position sequences are converted into position embeddings and concatenated with the BERT representations $h_i$, as shown in part (a) of the figure.

The resulting vector sequence is then fed into a one-layer BiLSTM, and the final hidden state in each direction is taken.

These hidden states are fed into a neural network with a single hidden layer to predict the relation.
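Putting the pieces together, a minimal PyTorch sketch of this prediction head might look as follows; the layer sizes, the position-clipping distance, and the number of relation labels are assumptions, not the paper's exact hyperparameters:

```python
import torch
import torch.nn as nn

class RelationHead(nn.Module):
    """Sketch of the head described above: subject/object position embeddings are
    concatenated to the BERT token vectors, passed through a one-layer BiLSTM, and
    the final states of both directions feed a one-hidden-layer MLP."""

    def __init__(self, bert_dim=768, pos_dim=20, lstm_dim=256,
                 hidden_dim=300, num_relations=42, max_dist=200):
        super().__init__()
        # 2 * max_dist + 1 buckets for relative positions in [-max_dist, max_dist]
        self.subj_pos_emb = nn.Embedding(2 * max_dist + 1, pos_dim)
        self.obj_pos_emb = nn.Embedding(2 * max_dist + 1, pos_dim)
        self.lstm = nn.LSTM(bert_dim + 2 * pos_dim, lstm_dim,
                            batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(
            nn.Linear(2 * lstm_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_relations),
        )
        self.max_dist = max_dist

    def forward(self, bert_out, subj_pos, obj_pos):
        # bert_out: (batch, seq_len, bert_dim); subj_pos/obj_pos: (batch, seq_len)
        subj = self.subj_pos_emb(
            subj_pos.clamp(-self.max_dist, self.max_dist) + self.max_dist)
        obj = self.obj_pos_emb(
            obj_pos.clamp(-self.max_dist, self.max_dist) + self.max_dist)
        x = torch.cat([bert_out, subj, obj], dim=-1)
        _, (h_n, _) = self.lstm(x)                    # h_n: (2, batch, lstm_dim)
        final = torch.cat([h_n[0], h_n[1]], dim=-1)   # last state of each direction
        return self.mlp(final)                        # (batch, num_relations)
```

Note that only the two final BiLSTM states are used, so the relation is predicted from a fixed-size summary of the whole sequence rather than from per-token outputs.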

2.2 Semantic role labeling model

Schematic of the semantic role labeling model:

2.2.1 Predicate sense disambiguation

This task is treated as sequence labeling. After the sentence is tokenized with WordPiece, tokens that are not part of the predicate are labeled O, and continuation sub-word pieces are labeled X. The BERT representation of each token is concatenated with a predicate indicator embedding and fed into a neural network with a single hidden layer for classification.
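A hedged sketch of this per-token classifier follows; the embedding size, hidden size, and label inventory are illustrative assumptions, not the paper's exact settings:

```python
import torch
import torch.nn as nn

class PredicateSenseTagger(nn.Module):
    """Sketch of the sequence-labeling formulation above: each BERT token vector is
    concatenated with a predicate-indicator embedding (1 = token belongs to the
    predicate, 0 = otherwise) and classified by a one-hidden-layer network."""

    def __init__(self, bert_dim=768, ind_dim=10, hidden_dim=300, num_labels=100):
        super().__init__()
        self.indicator_emb = nn.Embedding(2, ind_dim)
        self.mlp = nn.Sequential(
            nn.Linear(bert_dim + ind_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_labels),  # sense labels plus O / X
        )

    def forward(self, bert_out, predicate_mask):
        # bert_out: (batch, seq_len, bert_dim); predicate_mask: (batch, seq_len) of 0/1
        x = torch.cat([bert_out, self.indicator_emb(predicate_mask)], dim=-1)
        return self.mlp(x)  # (batch, seq_len, num_labels)
```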

2.2.2 Argument identification and classification

As shown in the model schematic above, the input sequence is [[CLS] sentence [SEP] predicate [SEP]]. The BERT representations are concatenated with predicate indicator embeddings and passed through a one-layer BiLSTM, giving a hidden state $g_t$ for each token in the sequence. The hidden state of the predicate token, $g_p$, is concatenated with the representation $g_t$ of every token, and the result is fed into a neural network with a single hidden layer to predict the argument label.
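A minimal sketch of this argument-labeling head, assuming the predicate's token position is given as an index; $g_t$ and $g_p$ follow the description above, while the dimensions and label count are illustrative:

```python
import torch
import torch.nn as nn

class ArgumentClassifier(nn.Module):
    """Sketch of the argument labeling head described above: BERT vectors plus a
    predicate-indicator embedding go through a one-layer BiLSTM; the BiLSTM state
    of the predicate token g_p is concatenated with each token state g_t, and a
    one-hidden-layer network predicts the argument label for each token."""

    def __init__(self, bert_dim=768, ind_dim=10, lstm_dim=256,
                 hidden_dim=300, num_labels=54):
        super().__init__()
        self.indicator_emb = nn.Embedding(2, ind_dim)
        self.lstm = nn.LSTM(bert_dim + ind_dim, lstm_dim,
                            batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(
            nn.Linear(4 * lstm_dim, hidden_dim),   # [g_t ; g_p], each 2*lstm_dim
            nn.ReLU(),
            nn.Linear(hidden_dim, num_labels),
        )

    def forward(self, bert_out, predicate_mask, predicate_index):
        # bert_out: (batch, seq_len, bert_dim); predicate_mask: (batch, seq_len) of 0/1
        # predicate_index: (batch,) position of the predicate token
        x = torch.cat([bert_out, self.indicator_emb(predicate_mask)], dim=-1)
        g, _ = self.lstm(x)                                # (batch, seq_len, 2*lstm_dim)
        g_p = g[torch.arange(g.size(0)), predicate_index]  # (batch, 2*lstm_dim)
        g_p = g_p.unsqueeze(1).expand(-1, g.size(1), -1)   # broadcast over tokens
        return self.mlp(torch.cat([g, g_p], dim=-1))       # (batch, seq_len, num_labels)
```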

3 Experimental performance

For relation extraction, the comparison of different models and metrics on the TACRED dataset is shown below:

For semantic role labeling, the comparison of different models and metrics on the CoNLL 2009 in-domain and out-of-domain datasets is shown below:
