Reading "Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance"

Language 2023-08-19 04:34:36

2021

Summary

Dominant multimodal named entity recognition (MNER) models do not take full advantage of the fine-grained semantic correspondence between semantic units of different modalities, which has the potential to refine multimodal representation learning.

Introduction

How to make full use of visual information is one of the core issues of MNER, and it directly affects model performance.
Previous attempts:
(1) Encode the entire image into a global feature vector (Fig. 1(a)), which can be used to enhance each word representation (Moon, Neves, and Carvalho 2018), or to guide words to learn visual-aware representations (Lu 2018; Zhang et al. 2018). (This is like the node-level classification style of implementation: for example, a whole face image yields a single embedding.)
(2) Segment the entire image evenly into multiple regions (Fig. 1(b)), and let them interact with the text sequence in a Transformer-based framework (Yu et al. 2020). (This is like a graph-level style of implementation, similar to superpixel blocks, or the "patch" processing used in ZSL and ViT.)
Neither approach makes full use of the fine-grained semantic correspondence between semantic units in the input sentence-image pair.
For example, Fig. 1(a) provides only implicit global information,
and Fig. 1(b) provides local information from multiple evenly segmented regions, but that information is still implicit.
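
To make the contrast between approaches (1) and (2) concrete, here is a toy sketch in plain Python. It is only an illustration: real models use CNN or Transformer features, whereas this sketch uses the mean pixel value as a stand-in feature, and the function names and the 4x4 "image" are hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def global_feature(image):
    """Approach (1): collapse the whole image into a single feature.
    The mean pixel value is a stand-in for a learned global embedding."""
    return [mean(p for row in image for p in row)]

def grid_features(image, n_rows, n_cols):
    """Approach (2): split the image evenly into n_rows x n_cols regions
    and compute one feature per region, ignoring object boundaries."""
    height, width = len(image), len(image[0])
    rh, rw = height // n_rows, width // n_cols
    feats = []
    for r in range(n_rows):
        for c in range(n_cols):
            block = [image[y][x]
                     for y in range(r * rh, (r + 1) * rh)
                     for x in range(c * rw, (c + 1) * rw)]
            feats.append(mean(block))
    return feats

# Hypothetical 4x4 grayscale image with a bright object in the top-right corner.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

global_vec = global_feature(image)       # one value for the whole image
region_vecs = grid_features(image, 2, 2) # one value per evenly cut region
```

In both cases nothing tells the model which feature belongs to which word, which is the sense in which the notes call the information implicit.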

These two kinds of information propagate the "gate" cue to the textual representation in different ways. The failure to exploit this fine-grained correspondence may be due to two major challenges: 1) how to construct a unified representation that bridges the semantic gap between the two modalities; 2) how to achieve semantic interaction based on that unified representation.

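So the paper uses the approach in Fig. 1(c). (This kind of object detection is somewhat task-specific: it suits images in which the objects can be clearly marked with bounding boxes.)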

Method

Graph construction

Nodes

Words in the text serve as the textual nodes;
bounding boxes in the image serve as the visual nodes.

Edges

Nodes within the same modality (intra-modal) are fully connected; nodes from different modalities (inter-modal) are connected when they correspond to the same thing.
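
A minimal sketch of the graph construction described above, in plain Python. The function name `build_multimodal_graph`, the example sentence, the object labels, and the word-to-box alignments are all hypothetical; how the alignments are obtained is outside this sketch.

```python
from itertools import combinations

def build_multimodal_graph(words, boxes, alignments):
    """Build an undirected multimodal graph.

    words      -- list of tokens (textual nodes)
    boxes      -- list of detected objects (visual nodes)
    alignments -- list of (word_index, box_index) pairs that refer
                  to the same thing (assumed to be given)
    """
    text_nodes = [("t", i) for i in range(len(words))]
    visual_nodes = [("v", j) for j in range(len(boxes))]

    edges = set()
    # Intra-modal edges: every pair of nodes in the same modality is connected.
    for group in (text_nodes, visual_nodes):
        for a, b in combinations(group, 2):
            edges.add((a, b))
    # Inter-modal edges: a word is linked only to the box it corresponds to.
    for wi, bj in alignments:
        edges.add((("t", wi), ("v", bj)))

    return text_nodes + visual_nodes, edges

# Hypothetical example: two words are grounded to two detected boxes.
words = ["Rocky", "jumps", "over", "the", "gate"]
boxes = ["dog", "gate"]
alignments = [(0, 0), (4, 1)]

nodes, edges = build_multimodal_graph(words, boxes, alignments)
print(len(nodes), len(edges))  # 7 13
```

The 13 edges are 10 word-word edges, 1 box-box edge, and 2 word-box edges, so only the aligned words receive targeted visual guidance.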

Fusion

Intra-modal fusion uses self-attention; inter-modal fusion uses gating. (Exactly the same as in the "a novel" paper.)
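
The two fusion steps can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: the learned projection matrices are left out (treated as identity), the gate is an element-wise sigmoid of the sum of the two states, and all function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(states):
    """Intra-modal step: scaled dot-product attention among the nodes of
    one modality (fully connected, no learned weights in this sketch)."""
    d = len(states[0])
    out = []
    for q in states:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        weights = softmax(scores)
        out.append([sum(w * v[k] for w, v in zip(weights, states)) for k in range(d)])
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(text_states, visual_states, alignments):
    """Inter-modal step: each aligned word adds a gated copy of its visual
    node's state; words without an inter-modal edge are left unchanged."""
    fused = [list(h) for h in text_states]
    for wi, bj in alignments:
        h_t, h_v = text_states[wi], visual_states[bj]
        for k in range(len(h_t)):
            gate = sigmoid(h_t[k] + h_v[k])  # assumed gate form, identity projections
            fused[wi][k] += gate * h_v[k]
    return fused

# Toy states: two words, one visual object aligned with the second word.
text_states = [[1.0, 0.0], [0.0, 1.0]]
visual_states = [[0.5, 0.5]]
alignments = [(1, 0)]

text_ctx = self_attention(text_states)
visual_ctx = self_attention(visual_states)
fused = gated_fusion(text_ctx, visual_ctx, alignments)
```

Here the first word has no inter-modal edge, so its fused state equals its self-attention output, while the second word's state is shifted toward the visual object it is aligned with.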

Origin blog.csdn.net/weixin_40459958/article/details/123567686
