Text-to-SQL Learning Notes (12): The Global-GNN Model

Paper

  • Title: Global Reasoning over Database Structures for Text-to-SQL Parsing
  • Conference: EMNLP 2019
  • Link: arxiv.org/abs/1908.11…

Preface

The previous blog introduced a method for applying graph neural networks to the Text-to-SQL task. This blog continues along the GNN direction and introduces a further optimized and refined model built on the GNN approach: the Global-GNN model.

Introduction

In the cross-domain setting of the Spider dataset, one of the most critical difficulties is that the parser must map new lexical items to schema constants not observed during training. The previous GNN model handled this mainly through a local similarity function between words and schema constants, which considers each word and each schema constant individually and ignores global information.

[Figure: example question and schema in which the word "name" matches columns in both the singer and song tables]

As shown in the figure above, the word "name" appears in the question, but columns with this name exist in both the singer and song tables. Considering only local information, the model would assign the same probability to singer.name and song.name. However, "name" and "nation" co-occur in the question, and "nation" is highly similar only to singer.country, so we can infer that "name" refers to the name column of the singer table.

Based on this idea, the paper proposes the Global-GNN model. Using global information, it determines which schema constants are helpful for generating the final SQL statement, which improves performance.
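To make this intuition concrete, here is a toy Python sketch (not the paper's actual scoring function; the similarity values and the coherence bonus are invented purely for illustration) showing how scoring a joint assignment of schema constants, rather than each word in isolation, breaks the tie between singer.name and song.name:

```python
# Toy illustration (not the paper's model): why joint, global scoring can
# disambiguate schema constants that look identical under purely local similarity.
# All similarity numbers below are made up for the example.

from itertools import product

# Local word -> schema-constant similarities (hypothetical values).
local_sim = {
    ("name", "singer.name"): 0.9,
    ("name", "song.name"): 0.9,        # tie: local scoring alone cannot choose
    ("nation", "singer.country"): 0.8,
    ("nation", "song.title"): 0.1,
}

question_words = ["name", "nation"]
candidates = {
    "name": ["singer.name", "song.name"],
    "nation": ["singer.country", "song.title"],
}

def table(col):
    return col.split(".")[0]

def global_score(assignment):
    """Score a joint assignment: sum of local similarities plus a small
    (hypothetical) coherence bonus when the chosen columns share a table."""
    score = sum(local_sim[(w, c)] for w, c in zip(question_words, assignment))
    if len({table(c) for c in assignment}) == 1:
        score += 0.3
    return score

best = max(product(*(candidates[w] for w in question_words)), key=global_score)
print(best)  # ('singer.name', 'singer.country'): global coherence breaks the tie
```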

Method

Compared with the GNN model, this model adds two GCN networks: a gating GCN and a re-ranking GCN.

  • The gating GCN adds a new node $v_{global}$ used for the global representation; its output replaces the relevance parameter in the original input and serves as the input to the encoder GCN (see the sketch after this list).

  • The re-ranking GCN reranks the queries in the output beam. This ensures that complete schema-item information is available to both the encoder and the decoder (both are encoded with GCNs: the encoder sees the whole graph, while the decoder is restricted to a subgraph but still receives global information via $e^{align}$).
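A minimal PyTorch sketch of the gating idea, assuming schema-item embeddings are already available (question conditioning is omitted, and the class name GatingGCN, the layer sizes, and the mean-aggregation update are illustrative rather than the paper's implementation):

```python
import torch
import torch.nn as nn

class GatingGCN(nn.Module):
    def __init__(self, dim, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_layers)])
        self.gate = nn.Linear(dim, 1)                        # per-node relevance score
        self.v_global = nn.Parameter(torch.zeros(1, dim))    # learned global node

    def forward(self, node_feats, adj):
        # node_feats: (n, dim) schema-item embeddings; adj: (n, n) schema graph.
        n = node_feats.size(0)
        # Append the global node and connect it to every schema node.
        h = torch.cat([node_feats, self.v_global], dim=0)    # (n+1, dim)
        a = torch.zeros(n + 1, n + 1)
        a[:n, :n] = adj
        a[n, :] = 1.0
        a[:, n] = 1.0
        a = a + torch.eye(n + 1)                             # self loops
        deg_inv = a.sum(dim=1, keepdim=True).reciprocal()
        for layer in self.layers:
            h = torch.relu(layer(deg_inv * (a @ h)))         # mean aggregation
        rel = torch.sigmoid(self.gate(h[:n])).squeeze(-1)    # (n,) relevance gates
        return rel

# Example with random embeddings for 5 schema items.
model = GatingGCN(dim=16)
feats = torch.randn(5, 16)
adj = torch.randint(0, 2, (5, 5)).float()
print(model(feats, adj))
```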

[Figure: overall framework with the gating GCN, encoder GCN, decoder, and re-ranking GCN]

The figure above shows the overall framework. The gating GCN indicates which schema items the SQL statement for the current question is related to, computing a relevance score for each. Next, the encoder GCN computes a learned representation for every schema item, which the decoder uses to predict K candidate queries. Finally, the re-ranking GCN scores each candidate based only on the DB constants it selects. The dashed arrows indicate that, when the decoder outputs the SQL queries, gradients are not propagated from the re-ranking GCN back to the decoder.
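The following pseudo-interface sketch wires these four stages together; gating_gcn, encoder_gcn, decoder, and rerank_gcn are hypothetical placeholders for the actual modules, and the detach() call marks where gradients stop, mirroring the dashed arrows in the figure:

```python
def parse(question, schema_graph, gating_gcn, encoder_gcn, decoder, rerank_gcn, k=10):
    # 1. Gating GCN: relevance score per schema item for this question.
    relevance = gating_gcn(question, schema_graph)          # (num_items,)

    # 2. Encoder GCN: schema-item representations, conditioned on the relevance.
    item_repr = encoder_gcn(schema_graph, relevance)        # (num_items, dim)

    # 3. Decoder: beam search producing K candidate SQL queries.
    beam = decoder(question, item_repr, beam_size=k)        # list of (sql, score)

    # 4. Re-ranking GCN: score each candidate from the schema items it selects.
    #    detach() reflects that no gradients flow from the re-ranker to the decoder.
    rerank_scores = [rerank_gcn(schema_graph, sql, item_repr.detach())
                     for sql, _ in beam]
    best_sql = max(zip(beam, rerank_scores), key=lambda x: x[1])[0][0]
    return best_sql
```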

[Figure: the re-ranking GCN scoring candidate queries from the decoder's output beam]

The re-ranking module reranks all the queries in the decoder's output beam. Unlike the gating GCN, the re-ranking GCN takes as input only the subgraph formed by the nodes of schema items that appear in the predicted SQL statement $\hat{y}$, together with the $v_{global}$ node. Since this only captures the global properties of the selected nodes and ignores unselected but potentially relevant nodes, an additional vector $e^{align}$ is then computed to record information from all nodes.
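Below is a minimal PyTorch sketch of this idea, assuming the encoder representations of all schema items are given: the GCN step runs only over the items selected by the candidate (plus $v_{global}$), while $e^{align}$ is computed with attention over all items so that unselected but relevant nodes still contribute to the score. Names, shapes, and the fully connected subgraph are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

class RerankScorer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gcn = nn.Linear(dim, dim)                     # one GCN step over the subgraph
        self.v_global = nn.Parameter(torch.zeros(1, dim))
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, all_items, selected, question_vec):
        # all_items:    (n, dim) encoder representations of every schema item
        # selected:     LongTensor of indices of items used in the candidate SQL
        # question_vec: (dim,) pooled question representation

        # Subgraph of the selected items plus the global node (fully connected
        # here for simplicity; the real model follows the schema graph).
        sub = torch.cat([all_items[selected], self.v_global], dim=0)
        m = sub.size(0)
        a = torch.ones(m, m)
        h = torch.relu(self.gcn((a @ sub) / m))            # mean-aggregation step
        g = h[-1]                                          # read-out from the global node

        # e_align: question-conditioned attention over ALL schema items, so
        # relevant but unselected items still influence the candidate's score.
        attn = torch.softmax(all_items @ question_vec, dim=0)   # (n,)
        e_align = attn @ all_items                               # (dim,)

        return self.score(torch.cat([g, e_align])).squeeze(-1)

# Example: 5 schema items, the candidate query uses items 0 and 3.
scorer = RerankScorer(dim=16)
items = torch.randn(5, 16)
print(scorer(items, torch.tensor([0, 3]), torch.randn(16)))
```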

Experiment

After adding these two modules, the Global-GNN model achieves further improvements. The experimental results are shown in the figure below:

[Figure: experimental results on the Spider dataset]

The authors also compare the performance difference between single-table and multi-table questions:

[Figure: results on single-table vs. multi-table questions]

Conclusion

Global-GNN makes up for some of the shortcomings of the original GNN model by bringing global information into the similarity computation, and it achieves further improvements on the Spider dataset. The next blog will introduce the RATSQL model.

Origin juejin.im/post/7087001129409576974