[Translation] Deep Learning on Graphs: A Survey, Part V: Graph Autoencoders

Deep Learning on Graphs: A Survey

arXiv: 1812.04202

Graph autoencoders (GAEs) represent nodes as low-dimensional vectors. Since the autoencoder (AE) and its variants are widely used for unsupervised learning [74], they are suitable for learning node representations for graphs without supervised information. In this section, we first introduce graph autoencoders, then turn to graph variational autoencoders and other improvements. Table 4 summarizes the main characteristics of the GAEs surveyed.

Table 4: Comparison of different graph autoencoders (GAEs)

5.1 Autoencoders

The use of autoencoders for graphs originated with the Sparse Autoencoder (SAE) [75]. The basic idea is that, by treating the adjacency matrix or its variations as the raw features of nodes, an AE can be used as a dimensionality-reduction technique to learn low-dimensional node representations. Specifically, SAE adopts the following L2 reconstruction loss:

where P is the transition matrix, P̂ is the reconstructed matrix, h[i] ∈ R^d is the low-dimensional representation of node v[i], F(·) is the encoder, G(·) is the decoder, d << N is the dimensionality, and Θ are the parameters. Both the encoder and the decoder are multilayer perceptrons with several hidden layers. In other words, SAE compresses the information of P[i, :] into a low-dimensional vector h[i] and reconstructs the original vector from it. SAE also adds another sparsity regularization term. After the low-dimensional representations h[i] are obtained, k-means [85] is applied for the node clustering task, which is empirically shown to outperform non-deep-learning baselines. However, since the underlying theoretical analysis is incorrect, the mechanism behind its effectiveness remains unexplained.
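For concreteness, here is a minimal PyTorch-style sketch of this kind of L2 reconstruction autoencoder over rows of the transition matrix. The layer sizes and the simple L1 stand-in for the sparsity term are assumptions for illustration, not the exact SAE configuration.

```python
import torch
import torch.nn as nn

class SparseGraphAE(nn.Module):
    """MLP encoder/decoder that reconstructs rows of the transition matrix P."""
    def __init__(self, n_nodes, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_nodes, 512), nn.ReLU(),
                                     nn.Linear(512, dim))
        self.decoder = nn.Sequential(nn.Linear(dim, 512), nn.ReLU(),
                                     nn.Linear(512, n_nodes))

    def forward(self, P_rows):
        h = self.encoder(P_rows)      # low-dimensional representations h_i
        P_hat = self.decoder(h)       # reconstructed rows of P
        return h, P_hat

def sae_loss(P_rows, P_hat, h, sparsity_weight=1e-3):
    # L2 reconstruction loss plus a simple sparsity penalty on h
    # (an illustrative stand-in for the sparsity regularizer).
    recon = ((P_rows - P_hat) ** 2).sum(dim=1).mean()
    return recon + sparsity_weight * h.abs().mean()
```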

Structural Deep Network Embedding (SDNE) [76] fills this gap by showing that the L2 reconstruction loss in Equation 35 actually corresponds to second-order proximity, i.e., two nodes share similar embeddings if they have similar neighborhoods, a property that is well studied in network science, e.g., in collaborative filtering and triangle closure [5]. Motivated by network embedding methods showing that first-order proximity is also important [86], SDNE modifies the objective function by adding another term similar to Laplacian eigenmaps [54]:

That is, two nodes also share similar embeddings if they are directly connected. The L2 reconstruction loss is further modified by using the adjacency matrix and assigning different weights to zero and non-zero elements:

where b[ij] = 1 if A(i, j) = 0 and b[ij] = β > 1 otherwise, with β being another hyperparameter. The overall architecture of SDNE is shown in Figure 7.
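A hedged sketch of the two SDNE loss terms described above is given below: a weighted L2 reconstruction of the adjacency matrix (second-order proximity) plus a Laplacian-eigenmaps-style term on directly connected nodes (first-order proximity). The values of beta and alpha and the dense pairwise computation are illustrative assumptions.

```python
import torch

def sdne_losses(A, A_hat, H, beta=5.0, alpha=0.1):
    """A: (N, N) adjacency, A_hat: its reconstruction, H: (N, d) embeddings."""
    # Second-order proximity: weighted L2 reconstruction, penalizing
    # non-zero entries of A more heavily (b_ij = beta if A_ij != 0 else 1).
    B = torch.where(A > 0, torch.full_like(A, beta), torch.ones_like(A))
    second_order = (((A - A_hat) * B) ** 2).sum()

    # First-order proximity: Laplacian-eigenmaps-style term pulling
    # directly connected nodes toward similar embeddings.
    diff = H.unsqueeze(0) - H.unsqueeze(1)             # (N, N, d) pairwise differences
    first_order = (A * (diff ** 2).sum(dim=-1)).sum()

    return second_order + alpha * first_order
```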

Figure 7: The framework of SDNE, reprinted with permission from [76]. Both the first-order and second-order proximities of nodes are preserved using a deep autoencoder.

DNGR [77], a contemporary work inspired by another line of research, replaces the transition matrix P in Equation 35 with a positive pointwise mutual information (PPMI) [58] matrix based on random-walk probabilities. In this way, the raw features can be associated with random-walk probabilities over the graph [87]. However, constructing the input matrix can take O(N^2) time, which does not scale to large graphs.
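To make the PPMI idea concrete, here is a small sketch of computing a PPMI matrix from a node co-occurrence count matrix (which, in this setting, would be gathered from random-walk statistics). The handling of zero counts is an implementation detail assumed here.

```python
import numpy as np

def ppmi_matrix(cooccurrence):
    """Positive pointwise mutual information from a co-occurrence count matrix."""
    cooccurrence = cooccurrence.astype(float)
    total = cooccurrence.sum()
    row = cooccurrence.sum(axis=1, keepdims=True)
    col = cooccurrence.sum(axis=0, keepdims=True)
    # PMI(i, j) = log( p(i, j) / (p(i) * p(j)) ); clip non-finite and negative values to zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((cooccurrence * total) / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)
```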

GC-MC [78] takes yet another approach to graph autoencoders by using the GCN of [36] as the encoder:

The decoder is a simple bilinear function:

where Θ[de] are the decoder parameters. In this way, node features can be incorporated naturally. For graphs without node features, a one-hot encoding of the nodes can be used. The authors demonstrated the effectiveness of GC-MC on the bipartite graph recommendation problem.
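A minimal sketch of a GCN encoder followed by a bilinear decoder of this kind is shown below. The single-layer encoder and a single learnable matrix for Θ[de] are simplifying assumptions; GC-MC itself is more elaborate (e.g., it targets rating prediction), so this is an illustration rather than the exact model.

```python
import torch
import torch.nn as nn

class GCNBilinearAE(nn.Module):
    def __init__(self, in_feats, dim=64):
        super().__init__()
        self.W = nn.Linear(in_feats, dim, bias=False)   # one-layer GCN encoder
        self.theta_de = nn.Parameter(torch.eye(dim))    # bilinear decoder parameters

    def forward(self, A_norm, X):
        # Encoder: H = ReLU(A_norm @ X @ W), with A_norm a normalized adjacency matrix.
        H = torch.relu(A_norm @ self.W(X))
        # Decoder: A_hat(i, j) = h_i^T Theta_de h_j for every node pair.
        A_hat = H @ self.theta_de @ H.t()
        return H, A_hat
```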

Instead of reconstructing the adjacency matrix or its variations, DRNE [79] proposes another modification: it directly reconstructs low-dimensional node vectors by aggregating neighborhood information with an LSTM. Specifically, DRNE minimizes the following objective function:

Because an LSTM requires a sequence as input, the authors suggest ordering the neighboring nodes by their degrees. Neighbor sampling is also used for nodes with large degrees to keep memory usage manageable. The authors showed that this method can preserve regular equivalence as well as many conventional node centrality measures, such as PageRank [88].
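The sketch below illustrates this style of objective: each node embedding is reconstructed from an LSTM run over its degree-sorted neighbors. The hidden size, the simple truncation of large neighborhoods (in place of sampling), and the per-node Python loop are illustrative assumptions rather than the actual DRNE implementation.

```python
import torch
import torch.nn as nn

class DegreeLSTMReconstructor(nn.Module):
    def __init__(self, dim=64, max_neighbors=50):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.max_neighbors = max_neighbors

    def loss(self, H, neighbors, degrees):
        """H: (N, d) node embeddings; neighbors[i]: list of neighbor ids of node i."""
        total = torch.zeros(())
        for i, nbrs in enumerate(neighbors):
            # Sort neighbors by degree and truncate very large neighborhoods.
            nbrs = sorted(nbrs, key=lambda j: degrees[j])[: self.max_neighbors]
            if not nbrs:
                continue
            seq = H[nbrs].unsqueeze(0)            # (1, |N(i)|, d) neighbor sequence
            _, (h_last, _) = self.lstm(seq)       # aggregate the neighborhood with an LSTM
            # Reconstruct h_i directly from the aggregated neighborhood representation.
            total = total + ((H[i] - h_last.squeeze()) ** 2).sum()
        return total
```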

Unlike the previous works that map each node to a low-dimensional vector, Graph2Gauss (G2G) [80] proposes encoding each node as a Gaussian distribution h[i] = N(M[i, :], diag(Σ[i, :])) to capture the uncertainty of nodes. Specifically, the authors use a deep mapping from node attributes to the means and variances of the Gaussian distributions as the encoder:

where F[M](·) and F[Σ](·) are parametric functions to be learned. Then, instead of using an explicit decoder function, they learn the model with pairwise constraints:

where d(i, j) is the shortest distance from node v[i] to v[j], and KL[q(·) || p(·)] is the KL divergence between q(·) and p(·) [89]. In other words, the constraints ensure that the KL divergences between node pairs follow the same relative order as the graph distances. However, because Equation 42 is hard to optimize, an energy-based loss [90] is used as a relaxation:

where D = {(i, j, j') | d(i, j) < d(i, j')} and E[ij] = KL(h[j] || h[i]). An unbiased sampling strategy is further proposed to accelerate the training process.
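Below is a hedged sketch of this kind of ranking objective: the closed-form KL divergence between two diagonal Gaussians, used inside a square-exponential energy loss over triplets. The square-exponential form of the energy loss is an assumption based on common practice, not something stated in this summary.

```python
import torch

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) for diagonal Gaussians."""
    return 0.5 * (torch.log(var_p / var_q)
                  + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0).sum(dim=-1)

def g2g_energy_loss(mu, var, triplets):
    """triplets: list of (i, j, j_neg) with d(i, j) < d(i, j_neg).
    Square-exponential energy loss: E_ij^2 + exp(-E_ij_neg)."""
    i, j, jn = zip(*triplets)
    i, j, jn = torch.tensor(i), torch.tensor(j), torch.tensor(jn)
    e_pos = kl_diag_gaussians(mu[j], var[j], mu[i], var[i])    # E_ij = KL(h_j || h_i)
    e_neg = kl_diag_gaussians(mu[jn], var[jn], mu[i], var[i])
    return (e_pos ** 2 + torch.exp(-e_neg)).mean()
```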

5.2 Variational Autoencoders

In contrast to the autoencoders above, the variational autoencoder (VAE) is another type of deep learning method that combines dimensionality reduction with generative models [91]. VAEs were first introduced for modeling graph data in [81], where the decoder is a simple linear product:

where h[i] is assumed to follow a Gaussian posterior distribution q(h[i] | M, Σ) = N(h[i] | M[i, :], diag(Σ[i, :])). For the encoder of the mean and variance matrices, the authors adopt the GCN of [36]:

Then, the model parameters can be learned by optimizing the variational lower bound [91]:

However, since the whole graph needs to be reconstructed, the time complexity is O(N^2).
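A minimal sketch of such a model is shown below: a GCN encoder producing means and (log-)variances, the reparameterization trick, an inner-product decoder over all node pairs (which is where the O(N^2) cost comes from), and a reconstruction-plus-KL objective. Layer sizes and the binary cross-entropy reconstruction are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VGAE(nn.Module):
    def __init__(self, in_feats, hidden=32, dim=16):
        super().__init__()
        self.W0 = nn.Linear(in_feats, hidden, bias=False)    # shared first GCN layer
        self.W_mu = nn.Linear(hidden, dim, bias=False)        # mean head
        self.W_logvar = nn.Linear(hidden, dim, bias=False)    # log-variance head

    def encode(self, A_norm, X):
        H = torch.relu(A_norm @ self.W0(X))
        return A_norm @ self.W_mu(H), A_norm @ self.W_logvar(H)

    def forward(self, A_norm, X):
        mu, logvar = self.encode(A_norm, X)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        A_hat_logits = z @ z.t()                                   # inner-product decoder
        return A_hat_logits, mu, logvar

def vgae_loss(A, A_hat_logits, mu, logvar):
    # Reconstruct the whole adjacency matrix, plus a KL term toward a standard Gaussian prior.
    recon = F.binary_cross_entropy_with_logits(A_hat_logits, A)
    kl = -0.5 * torch.mean(1 + logvar - mu ** 2 - logvar.exp())
    return recon + kl
```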

Inspired by SDNE and G2G, DVNE [82] proposes another VAE for graph data by also representing each node as a Gaussian distribution. Unlike previous works that adopt the KL divergence as the measure, DVNE uses the Wasserstein distance [92] to preserve the transitivity of node similarity. Similar to SDNE and G2G, DVNE also preserves first-order and second-order proximity in its objective function:

where E[ij] = W[2](h[j] || h[i]) is the 2nd Wasserstein distance between the two Gaussian distributions h[j] and h[i], and D = {(i, j, j') | j ∈ N(i), j' ∉ N(i)} is the set of all triplets in the ranking loss corresponding to first-order proximity. The reconstruction loss is defined as:

where P is the transition matrix and Z are samples drawn from H. The framework is shown in Figure 8. The objective function can then be minimized as in a standard VAE using the reparameterization trick [91].
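For Gaussians with diagonal covariances, the 2-Wasserstein distance has a simple closed form, which presumably makes it convenient here; a small sketch follows (the closed-form expression is a standard result, not something stated in this summary). Unlike the KL divergence, W[2] is a proper metric and satisfies the triangle inequality, which is what allows similarity to be transitive.

```python
import torch

def w2_squared_diag_gaussians(mu1, var1, mu2, var2):
    """Squared 2-Wasserstein distance between N(mu1, diag(var1)) and N(mu2, diag(var2)).
    For diagonal covariances this reduces to:
    ||mu1 - mu2||^2 + ||sqrt(var1) - sqrt(var2)||^2."""
    mean_term = ((mu1 - mu2) ** 2).sum(dim=-1)
    cov_term = ((var1.sqrt() - var2.sqrt()) ** 2).sum(dim=-1)
    return mean_term + cov_term
```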

5.3 Improvements and Discussion

Beyond these two main categories, several improvements are worth discussing.

5.3.1 Adversarial Training

Adversarial training schemes, especially generative adversarial networks (GANs), have recently been a hot topic in machine learning [93]. The basic idea of a GAN is to build two linked models: a discriminator and a generator. The goal of the generator is to "fool" the discriminator by generating fake data, while the discriminator aims to distinguish whether a sample comes from the real data or was produced by the generator. The two models then benefit from each other through joint training in a minimax game.

[83] 中,对抗训练方案被纳入 GAE,作为一个额外的正则化项。 总体结构如图 9 所示。具体地说,编码器用作生成器,判别器旨在区分潜在表示是来自生成器还是来自先验分布。 以这种方式,强制自编码器将先验分布与正则化相匹配。 目标函数是:

where L2 is a reconstruction loss similar to those defined for VAEs or GAEs, and L[GAN] is:

where G(F^V, A) is the convolutional encoder from Equation 45, D(·) is a discriminator with a cross-entropy loss, and p[h] is the prior distribution. The paper adopts a simple Gaussian prior, and the experimental results demonstrate the effectiveness of the adversarial training scheme.
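A hedged sketch of how such an adversarial regularizer could be wired up is given below: samples from a Gaussian prior are treated as "real", encoder outputs as "fake", and a small MLP discriminator is trained with cross-entropy while the encoder is pushed to fool it. The discriminator architecture and the training details are assumptions, not the exact setup of [83].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Small MLP that scores whether a latent vector came from the prior."""
    def __init__(self, dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, h):
        return self.net(h)

def adversarial_regularizer(disc, H_fake):
    # Generator (encoder) side: push the discriminator to label encoder outputs
    # as if they were drawn from the prior.
    logits = disc(H_fake)
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

def discriminator_loss(disc, H_fake):
    # Discriminator side: prior samples are "real", encoder outputs are "fake".
    H_real = torch.randn_like(H_fake)                  # simple Gaussian prior
    real_logits, fake_logits = disc(H_real), disc(H_fake.detach())
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
```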

5.3.2 Inductive Learning and GCN Encoders

Similar to GCNs, GAEs can be applied to the inductive setting if node attributes are incorporated into the encoder. This can be achieved by using a GCN as the encoder, as in [78], [81], [83], or by directly learning a mapping function from features, as in [80]. Since edge information is only used for learning the parameters, the model can be applied to nodes unseen during training. These works also show that, although GCNs and GAEs are based on different architectures, they can be used in combination, which we believe is a promising future direction.

5.3.3 Similarity Measures

Many similarity measures are adopted in GAEs, for example, the L2 reconstruction loss, Laplacian eigenmaps, and ranking losses for AEs, as well as the KL divergence and the Wasserstein distance for VAEs. Although these similarity measures are based on different motivations, it remains unclear how to choose an appropriate similarity measure for a given task and architecture. More research is needed to understand the underlying differences between these metrics.
