Reading Notes: Macro Discourse Relation Recognition via Discourse Argument Pair Graph

[ Title ]
《Macro Discourse Relation Recognition via Discourse Argument Pair Graph》

[ Code Address ]
None

[ Prerequisite Knowledge ]

1. Background and overview

1.1 Related research

None

1.2 Contributions

  • First application of a graph neural network (GNN) to Chinese macro discourse relation recognition
  • Achieves good performance

1.3 Related work

None

2. The Model

2.0 Overview

A summary of the graph construction and the model:

  • argument-word edges: keyword information via TF-IDF, serving as prior attention information
  • word-word edges: global information via PMI, capturing topical coherence between sentences

2.1 Building the Graph

2.1.0 Node representation

A single graph is built over the entire corpus, containing all argument nodes and all word nodes.
Word nodes are initialized with word2vec embeddings, which alleviates the cold-start problem and provides more precise word semantics.
Each argument node is initialized as the average of its words' vectors.
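
A minimal sketch of this initialization, assuming gensim-style word2vec lookups (the function name and dimensionality are illustrative, not from the paper):

```python
import numpy as np

def argument_vector(words, w2v, dim=300):
    """Initialize an argument node as the average of its word vectors.

    Words missing from the word2vec vocabulary are skipped; an argument
    with no in-vocabulary words falls back to a zero vector.
    """
    vecs = [w2v[w] for w in words if w in w2v]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)
```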

2.1.1 Connecting edges

Word-word: weighted by PMI; a positive PMI value indicates a stronger semantic connection between two words. In its standard form,

$$\mathrm{PMI}(i, j) = \log \frac{p(i, j)}{p(i)\,p(j)}$$

Word-argument: weighted by TF-IDF, where TF is the frequency of the word within the argument and IDF is the log-normalized inverse document frequency.

Self-loop: each node retains part of its own previous representation while aggregating new information from its neighbors (learn the new while keeping the old).
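
The notes point to TextGCN-style weighting. Below is a hedged sketch of window-based PMI for word-word edges and TF-IDF for word-argument edges; the window size, normalization, and function names are my assumptions, and the paper's exact scheme may differ:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edges(arguments, window=10):
    """Word-word edge weights via PMI over sliding windows; only
    positive PMI values are kept, since positive PMI indicates a
    stronger semantic connection between two words."""
    word_cnt, pair_cnt, n_win = Counter(), Counter(), 0
    for words in arguments:
        for i in range(max(1, len(words) - window + 1)):
            win = set(words[i:i + window])
            n_win += 1
            word_cnt.update(win)
            pair_cnt.update(frozenset(p) for p in combinations(sorted(win), 2))
    edges = {}
    for pair, c in pair_cnt.items():
        w1, w2 = tuple(pair)
        pmi = math.log(c * n_win / (word_cnt[w1] * word_cnt[w2]))
        if pmi > 0:
            edges[(w1, w2)] = pmi
    return edges

def tfidf_edges(arguments):
    """Word-argument edge weights via TF-IDF (log-scaled IDF)."""
    df = Counter()
    for words in arguments:
        df.update(set(words))
    edges = {}
    for doc_id, words in enumerate(arguments):
        for w, c in Counter(words).items():
            edges[(doc_id, w)] = (c / len(words)) * math.log(len(arguments) / df[w])
    return edges
```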

2.1.2 Graph construction

[Figure: graph construction]

2.2 Model

[Figure: model architecture]

2.2.0 Input layer

The adjacency matrix $A$ and the initial node representations $H^0$.

2.2.1 Encoding layer

After the first graph-convolution layer, each argument node aggregates the words connected to it, and each word node aggregates the words connected to it.
After the second layer, each argument node receives the global semantic information carried by the words connected to it.
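
A minimal sketch of the two-layer encoder using the standard GCN propagation rule $H^{(l+1)} = \sigma(\hat{A} H^{(l)} W^{(l)})$ (Kipf and Welling); dimensions and class names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerGCN(nn.Module):
    """First layer: each node aggregates its direct neighbors.
    Second layer: information propagates one hop further, so argument
    nodes pick up global context through their connected words."""

    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    def forward(self, a_hat, h0):
        # a_hat: normalized adjacency with self-loops, shape (N, N)
        # h0:    initial node features H^0, shape (N, in_dim)
        h1 = F.relu(a_hat @ self.w1(h0))  # first hop: direct neighbors
        return a_hat @ self.w2(h1)        # second hop: global information
```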

2.2.2 Classification layer

Since an argument is a paragraph containing multiple sentences, the sentence representations are first concatenated to obtain $H_{arg1}$ and $H_{arg2}$; these are then concatenated into $H$, which is fed to the classifier.
Training minimizes the cross-entropy loss, in its standard form:

$$\mathcal{L} = -\sum_{i} y_i \log \hat{y}_i$$
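
A sketch of the classification step as I read it (helper names are mine, and the classifier's input size assumes a fixed number of sentences per argument):

```python
import torch
import torch.nn as nn

def classify(arg1_sent_reps, arg2_sent_reps, classifier):
    """Concatenate sentence vectors within each argument to form H_arg1
    and H_arg2, concatenate those into H, and predict the relation."""
    h_arg1 = torch.cat(arg1_sent_reps, dim=-1)  # H_arg1
    h_arg2 = torch.cat(arg2_sent_reps, dim=-1)  # H_arg2
    h = torch.cat([h_arg1, h_arg2], dim=-1)     # H
    return classifier(h)                        # relation logits

# Training objective (standard cross-entropy):
# loss = nn.CrossEntropyLoss()(logits, gold_labels)
```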

3. Experiment and Evaluation

Baseline models:

  • LSTM
  • MSRM: uses global information, but ignores that the importance of words varies within a sentence.
  • STGSN: a sequence model that cannot capture intra-sentence dependencies in long texts well; its attention degrades on long texts; and it ignores global information.

4. Ablation Experiment

Removing word-word (w-w) edges: no PMI -> weight becomes 1 (?)
Removing word-argument (w-a) edges: no TF-IDF -> the weight of each argument for each of its words becomes 1/length (see the sketch below).
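
In code terms, my reading of these two ablations is that the learned edge weights are replaced with trivial constants (a sketch, not the paper's implementation):

```python
def ablated_ww_weight():
    # Without PMI, every word-word edge gets a constant weight of 1.
    return 1.0

def ablated_wa_weight(argument_words):
    # Without TF-IDF, each word in an argument gets uniform weight 1/length.
    return 1.0 / len(argument_words)
```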

5. Conclusion and personal summary

The learned sentence vector representations may transfer to other tasks; how to model the problem better is left as future work.

Origin: blog.csdn.net/jokerxsy/article/details/114022266