Batch Normalization in Brief
What is batch normalization?
Source: https://www.bilibili.com/video/av15997678/?p=34
Batch normalization applies a normalization step to the output of each layer, not just to the input data.
This keeps the data distribution within the sensitive range of the activation function, so the signal is passed forward more effectively.
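The per-layer normalization described above can be sketched with numpy. This is a minimal illustration, not the full training-time algorithm (it omits running statistics for inference); the function name and toy data are my own:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mu = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                    # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)  # roughly zero mean, unit variance
    return gamma * x_hat + beta            # learnable rescaling

# A toy batch of 4 samples with 3 features on very different scales.
x = np.array([[1.0, 20.0, 300.0],
              [2.0, 10.0, 100.0],
              [3.0, 30.0, 200.0],
              [4.0, 40.0, 400.0]])
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0))  # each feature ≈ 0
print(out.std(axis=0))   # each feature ≈ 1
```

After this step every feature has a comparable distribution, regardless of its original scale, which is what keeps the activations inside the useful range of the activation function.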
But normalization is not always beneficial, so we can let the network learn for itself whether to keep it, and compare results with and without normalization:
As shown in the figure, a final "de-normalization" step is appended after the normalization, with two parameters that are obtained by learning. If training finds that normalization does not have a positive impact on the result, its effect can be cancelled out by adjusting these two parameters.
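The cancellation described above can be verified directly: if learning drives the scale parameter toward the batch standard deviation and the shift parameter toward the batch mean, the original activations come back unchanged. A small sketch (the helper function is my own naming, mirroring the usual gamma/beta notation):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize over the batch, then apply the learnable scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

np.random.seed(0)
x = np.random.randn(8, 4) * 5.0 + 3.0  # activations with nonzero mean/scale

# Suppose learning pushes gamma toward the batch std and beta toward
# the batch mean: the scale-and-shift then exactly undoes normalization.
gamma = np.sqrt(x.var(axis=0) + 1e-5)
beta = x.mean(axis=0)
restored = batch_norm_forward(x, gamma, beta)
print(np.allclose(restored, x))  # True: normalization fully cancelled
```

This is why adding the two learnable parameters makes batch normalization safe to insert: at worst the network can learn to behave as if the layer were not there.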
The attached figure shows the effect of normalization:
A Zhihu answer gives a more in-depth explanation of the role of batch normalization; I quote its conclusion here:
In BN, activations are normalized so that their mean and variance stay consistent, which enlarges the scale of activations that would otherwise shrink. It can be regarded as a more effective local response normalization method.