Reading Notes on 【Recurrent Neural Network Regularization】

Introduction

Regularization is a large part of what makes neural networks so widely applicable. For feedforward networks, dropout is the most effective regularization technique, but dropout does not work well when applied naively to RNNs: the recurrence amplifies the dropout noise, which keeps corrupting the signal and hinders learning. As a result, RNNs are usually kept small in practice, since large RNNs tend to overfit.

The main contribution of this paper is to apply dropout only to the non-recurrent connections, as shown in the figure below: dropout is applied at the dashed arrows, but not at the solid (recurrent) arrows.
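As an aside (my illustration, not part of the paper): PyTorch's `torch.nn.LSTM` implements this same scheme through its `dropout` argument, which drops the outputs of each layer except the last, i.e. only the non-recurrent, between-layer connections. A minimal sketch:

```python
import torch
import torch.nn as nn

# Two stacked LSTM layers; dropout acts only on the connections
# between layers (the non-recurrent, "vertical" arrows), never on
# the recurrent h_{t-1}^l -> h_t^l transitions.
lstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=2,
               dropout=0.5, batch_first=True)

x = torch.randn(32, 20, 128)          # (batch, timesteps, features)
output, (h_n, c_n) = lstm(x)          # dropout is active in train mode
```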

1. LSTM units (long short-term memory units)

First, the RNN can be described with deterministic transitions from previous to current hidden states:

$$\mathrm{RNN}: h_t^{l-1},\, h_{t-1}^l \rightarrow h_t^l$$

For a classical RNN, this transition is

$$h_t^l = f\left(T_{n,n} h_t^{l-1} + T_{n,n} h_{t-1}^l\right), \quad f \in \{\mathrm{sigm}, \tanh\},$$

where $h_t^l \in \mathbb{R}^n$ is the hidden state of layer $l$ at timestep $t$, and $T_{n,m} : \mathbb{R}^n \to \mathbb{R}^m$ is an affine transform ($Wx + b$).
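As a concrete sketch of one such transition (the names `W_below`, `W_rec`, and `b` are my own, not the paper's):

```python
import numpy as np

def rnn_step(h_below, h_prev, W_below, W_rec, b):
    """One classical RNN transition:
    h_t^l = tanh(W_below @ h_t^{l-1} + W_rec @ h_{t-1}^l + b).

    h_below -- h_t^{l-1}, output of the layer below at timestep t
    h_prev  -- h_{t-1}^l, this layer's state at the previous timestep
    """
    return np.tanh(W_below @ h_below + W_rec @ h_prev + b)
```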
For details on the LSTM, see [Graves, "Generating sequences with recurrent neural networks", 2013].


A long short-term memory (LSTM) unit consists of an input gate i, an output gate o, a forget gate f, and a memory cell c.

The "long term" memory is stored in a vector of memory cells $c_t^l \in \mathbb{R}^n$.

In Graves's formulation, the equations are:

$$i_t = \mathrm{sigm}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \mathrm{sigm}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$o_t = \mathrm{sigm}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$
This paper works with a simplified version, where $\odot$ denotes the Hadamard product (elementwise multiplication, not the dot product):

$$\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix} = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix} T_{2n,4n} \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix}$$

$$c_t^l = f \odot c_{t-1}^l + i \odot g$$
$$h_t^l = o \odot \tanh(c_t^l)$$
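A minimal NumPy sketch of this simplified LSTM step (the single matrix `W` of shape `(4n, 2n)` stands in for $T_{2n,4n}$; all names are mine):

```python
import numpy as np

def sigm(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(h_below, h_prev, c_prev, W, b):
    """One LSTM transition following the simplified equations above."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_below, h_prev]) + b  # T_{2n,4n}
    i = sigm(z[:n])            # input gate
    f = sigm(z[n:2 * n])       # forget gate
    o = sigm(z[2 * n:3 * n])   # output gate
    g = np.tanh(z[3 * n:])     # candidate update
    c = f * c_prev + i * g     # Hadamard products
    h = o * np.tanh(c)
    return h, c
```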
The paper's modification is to apply the dropout operator $\mathbf{D}$ only to $h_t^{l-1}$, the non-recurrent input:

$$\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix} = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix} T_{2n,4n} \begin{pmatrix} \mathbf{D}(h_t^{l-1}) \\ h_{t-1}^l \end{pmatrix}$$

where $\mathbf{D}$ sets a random subset of its argument to zero.
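Relative to the `lstm_step` sketch above, this is a one-line change (again a sketch under my own conventions; the inverted-dropout scaling is a common implementation choice, not something the paper specifies):

```python
def dropout(x, p, rng):
    """Dropout operator D: zero each unit with probability p
    (inverted scaling, so no rescaling is needed at test time)."""
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

def regularized_lstm_step(h_below, h_prev, c_prev, W, b, p, rng):
    # Drop only the non-recurrent input h_t^{l-1}; the recurrent
    # state h_{t-1}^l and the cell c_{t-1}^l pass through untouched.
    return lstm_step(dropout(h_below, p, rng), h_prev, c_prev, W, b)
```

Here `rng` would be e.g. `np.random.default_rng(0)`; at test time the `dropout` call is simply skipped.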
The dropout operator corrupts the information carried by the units, forcing them to perform their intermediate computations more robustly. At the same time, we do not want to erase all the information in the units: it is crucial that they can remember events that occurred many timesteps in the past. The figure below shows how information from timestep t-2 can flow to the prediction at timestep t+2 with dropout applied this way.

We can see that the information is corrupted by the dropout operator exactly L + 1 times, where L is the number of layers, and this number is independent of the number of timesteps traversed by the information. For example, in a 2-layer LSTM, information entering at timestep t-2 and read out at timestep t+2 crosses dropout three times: input to layer 1, layer 1 to layer 2, and layer 2 to the output, no matter how long it sits in the recurrent state in between. Standard dropout, in contrast, perturbs the recurrent connections, which makes it difficult for the LSTM to learn to store information for long periods of time.







Reprinted from blog.csdn.net/ciyiquan5963/article/details/77938277