Neural Network Notes 1 - BP Backpropagation


BP Backpropagation

Basic Principle

Use the error observed at the output to estimate the error of the layer immediately before the output layer, then use that estimate to infer the error of the layer before it, and so on, propagating the error backward layer by layer to obtain error estimates for all the other layers.

These per-layer errors are then used to dynamically adjust the network's connection weights.
Core idea: gradient descent, as the sketch below illustrates.
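
Backpropagation supplies the gradients; gradient descent then moves each weight a small step against its gradient. Below is a minimal sketch on a hypothetical one-dimensional loss E(w) = (w − 3)², which is not part of the original derivation and only illustrates the update rule:

```python
# Minimal gradient-descent sketch on a hypothetical 1-D loss E(w) = (w - 3)^2.
# The loss, starting point, and learning rate are illustrative choices only.
def E(w):
    return (w - 3.0) ** 2

def dE_dw(w):
    return 2.0 * (w - 3.0)

w = 0.0        # initial weight
alpha = 0.1    # learning rate
for _ in range(50):
    w -= alpha * dE_dw(w)   # w <- w - alpha * dE/dw

print(w, E(w))  # w moves toward 3, where E is smallest
```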

Derivation

Input-layer variables: subscript i
Hidden-layer variables: subscript h
Output-layer variables: subscript j
Activation-function input is a, activation-function output is z, node error is δ
The predicted value is z, the target value is t

(Figure: BP backpropagation)
【Forward Pass】
$$a_h = \sum\limits_i w_{ih} x_i + \theta_h \qquad z_h = f\left( a_h \right)$$
$$a_j = \sum\limits_h w_{hj} z_h + \theta_j \qquad z_j = f\left( a_j \right)$$
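
As a rough sketch of this forward pass in NumPy, assuming f is the sigmoid function (the layer sizes, random initialization, and sample input below are illustrative choices, not from the original post):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

# Illustrative layer sizes: 3 inputs, 4 hidden units, 2 outputs.
n_in, n_hid, n_out = 3, 4, 2
W_ih = rng.normal(size=(n_in, n_hid))     # weights w_{ih}
theta_h = np.zeros(n_hid)                 # hidden biases theta_h
W_hj = rng.normal(size=(n_hid, n_out))    # weights w_{hj}
theta_j = np.zeros(n_out)                 # output biases theta_j

x = rng.normal(size=n_in)                 # one input sample

# Hidden layer: a_h = sum_i w_{ih} x_i + theta_h,  z_h = f(a_h)
a_h = x @ W_ih + theta_h
z_h = sigmoid(a_h)

# Output layer: a_j = sum_h w_{hj} z_h + theta_j,  z_j = f(a_j)
a_j = z_h @ W_hj + theta_j
z_j = sigmoid(a_j)
print(z_j)
```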

Loss function
$$E\left( W \right) = \frac{1}{2}\sum\limits_j \left( t_j - z_j \right)^2$$
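
Continuing the sketch, the squared-error loss for a hypothetical prediction and target pair is simply:

```python
import numpy as np

z_j = np.array([0.7, 0.2])   # predicted outputs z_j (illustrative values)
t = np.array([1.0, 0.0])     # target values t_j (illustrative values)

# E(W) = 1/2 * sum_j (t_j - z_j)^2
E = 0.5 * np.sum((t - z_j) ** 2)
print(E)   # 0.5 * (0.09 + 0.04) = 0.065
```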
【Backward Pass】

By the chain rule, each weight's gradient factors into a node error δ times the input feeding that weight.

Hidden layer → output layer
$$\frac{\partial E}{\partial w_{hj}} = \delta_j z_h \qquad \frac{\partial E}{\partial \theta_j} = \delta_j$$
$$\delta_j = \frac{\partial E}{\partial a_j} = \frac{\partial E}{\partial z_j}\frac{\partial z_j}{\partial a_j} = -\left( t_j - z_j \right) f'\left( a_j \right)$$
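
In NumPy, still assuming a sigmoid activation (so f'(a) = f(a)(1 − f(a))) and using illustrative placeholder values for the forward-pass quantities, the output-layer error and gradients look like:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_prime(a):          # f'(a) = f(a) * (1 - f(a)) for the sigmoid
    s = sigmoid(a)
    return s * (1.0 - s)

# Quantities produced by the forward pass (illustrative values).
a_j = np.array([0.8, -0.4])                  # output pre-activations a_j
z_j = sigmoid(a_j)                           # predictions z_j
z_h = np.array([0.5, 0.1, 0.9, 0.3])         # hidden outputs z_h
t = np.array([1.0, 0.0])                     # targets t_j

# delta_j = -(t_j - z_j) * f'(a_j)
delta_j = -(t - z_j) * sigmoid_prime(a_j)

dE_dW_hj = np.outer(z_h, delta_j)   # dE/dw_{hj} = delta_j * z_h
dE_dtheta_j = delta_j               # dE/dtheta_j = delta_j
print(dE_dW_hj, dE_dtheta_j)
```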
Input layer → hidden layer
$$\frac{\partial E}{\partial w_{ih}} = \delta_h x_i \qquad \frac{\partial E}{\partial \theta_h} = \delta_h$$
$$\delta_h = \frac{\partial E}{\partial a_h} = \sum\limits_j \frac{\partial E}{\partial a_j}\frac{\partial a_j}{\partial z_h}\frac{\partial z_h}{\partial a_h} = f'\left( a_h \right)\sum\limits_j \delta_j w_{hj}$$
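
The hidden-layer error collects the output-layer errors back through the weights w_{hj}. A sketch under the same sigmoid assumption, with placeholder values standing in for the earlier steps:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_prime(a):          # f'(a) = f(a) * (1 - f(a)) for the sigmoid
    s = sigmoid(a)
    return s * (1.0 - s)

# Quantities from the earlier steps (illustrative values).
a_h = np.array([0.2, -1.0, 0.7, 0.0])     # hidden pre-activations a_h
x = np.array([1.0, -0.5, 0.3])            # input sample x_i
W_hj = np.full((4, 2), 0.1)               # hidden-to-output weights w_{hj}
delta_j = np.array([-0.05, 0.08])         # output-layer errors delta_j

# delta_h = f'(a_h) * sum_j delta_j * w_{hj}
delta_h = sigmoid_prime(a_h) * (W_hj @ delta_j)

dE_dW_ih = np.outer(x, delta_h)   # dE/dw_{ih} = delta_h * x_i
dE_dtheta_h = delta_h             # dE/dtheta_h = delta_h
print(dE_dW_ih, dE_dtheta_h)
```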
【Weight Update】

$$\Delta w = \alpha \frac{\partial E}{\partial w} \qquad w^{t + 1} = w^t - \Delta w$$

$$\Delta \theta = \alpha \frac{\partial E}{\partial \theta} \qquad \theta^{t + 1} = \theta^t - \Delta \theta$$
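
Putting the pieces together, here is a minimal end-to-end sketch that trains a small network with exactly these update rules. The sigmoid activation, XOR-style toy data, layer sizes, learning rate, and epoch count are all illustrative assumptions, not specified in the original post:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

# Toy XOR-style data; inputs, targets, sizes, and learning rate are illustrative.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

n_in, n_hid, n_out = 2, 4, 1
W_ih = rng.normal(size=(n_in, n_hid)); theta_h = np.zeros(n_hid)
W_hj = rng.normal(size=(n_hid, n_out)); theta_j = np.zeros(n_out)
alpha = 0.5

for epoch in range(5000):
    for x, t in zip(X, T):
        # Forward pass
        a_h = x @ W_ih + theta_h; z_h = sigmoid(a_h)
        a_j = z_h @ W_hj + theta_j; z_j = sigmoid(a_j)

        # Backward pass (sigmoid, so f'(a) = z * (1 - z))
        delta_j = -(t - z_j) * z_j * (1.0 - z_j)
        delta_h = z_h * (1.0 - z_h) * (W_hj @ delta_j)

        # Updates: w <- w - alpha * dE/dw,  theta <- theta - alpha * dE/dtheta
        W_hj -= alpha * np.outer(z_h, delta_j); theta_j -= alpha * delta_j
        W_ih -= alpha * np.outer(x, delta_h);   theta_h -= alpha * delta_h

# Predictions should move toward [0, 1, 1, 0]; exact values depend on initialization.
print(sigmoid(sigmoid(X @ W_ih + theta_h) @ W_hj + theta_j).round(2))
```

In practice the learning rate α and the number of epochs would be tuned; this sketch only demonstrates that the derived gradients drive the loss down on a toy problem.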
