Representation of cross entropy for two-class (sigmoid) and multi-class (softmax) output layers

The logistic (sigmoid) function maps the value coming from the previous layer into the range [0, 1], so it can serve as the activation function of the output layer.

But for multi-class classification problems you need the softmax function instead; the short reason is that softmax normalizes the whole output vector into a probability distribution over the classes.
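
For a concrete feel of the difference, here is a minimal NumPy sketch (not from the original post; the logit values are made up): sigmoid squashes each value independently into (0, 1), while softmax turns the whole vector into a single distribution that sums to 1.

```python
import numpy as np

def sigmoid(z):
    # Squashes each logit independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Turns the whole logit vector into a probability distribution.
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5])
print(sigmoid(logits))  # ~[0.881, 0.269, 0.622] -- does not sum to 1
print(softmax(logits))  # ~[0.786, 0.039, 0.175] -- sums to 1
```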

Preface

Cross-entropy loss is a commonly used loss function for classification tasks, but have you noticed that it takes a different form in the two-class and multi-class cases? This post records the difference between the two.

Two forms

In the multi-class case the loss is written as

$$L = -\sum_{i} t_i \log y_i$$

while in the two-class case it is written as

$$L = -\sum_{i} \big[\, t_i \log y_i + (1 - t_i) \log(1 - y_i) \,\big]$$

Both are cross-entropy loss functions, yet they look quite different. Why does the same cross-entropy loss take two different forms?

This is because the two cross-entropy loss functions correspond to different outputs of the last layer: the first form goes with a softmax output layer, and the second form goes with a sigmoid output layer.
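
To see this correspondence in practice, here is a small sketch assuming PyTorch is available (the tensors are invented for illustration). CrossEntropyLoss bundles a softmax with the first form, while BCEWithLogitsLoss bundles a sigmoid with the second form.

```python
import torch
import torch.nn as nn

# Multi-class head: CrossEntropyLoss = softmax + the first (multi-class) form.
logits = torch.tensor([[2.0, -1.0, 0.5]])  # one sample, three classes
target = torch.tensor([0])                 # index of the true class
multi_loss = nn.CrossEntropyLoss()(logits, target)

# Two-class / per-unit head: BCEWithLogitsLoss = sigmoid + the second form.
bin_logits = torch.tensor([0.3, -1.2])     # one logit per independent output
bin_target = torch.tensor([1.0, 0.0])      # binary target for each output
binary_loss = nn.BCEWithLogitsLoss()(bin_logits, bin_target)

print(multi_loss.item(), binary_loss.item())
```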

Cross Entropy in Information Theory

Let's first look at the form of cross entropy in information theory:

$$H(p, g) = -\sum_{x} p(x) \log g(x)$$

Cross entropy is used to describe the distance between two distributions. The purpose of neural network training is to make g(x) approach p(x).
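
A small numerical sketch of this definition (the two distributions are invented):

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])  # "true" distribution p(x)
g = np.array([0.7, 0.2, 0.1])    # model distribution g(x)

H = -np.sum(p * np.log(g))       # cross entropy H(p, g)
print(H)                         # ~1.156; it is minimized when g matches p
```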

Cross entropy of the softmax layer

What is g(x)? It is the output y of the last layer.
What is p(x)? It is our one-hot label. Substituting these into the definition of cross entropy, we get the first formula:

$$L = -\sum_{i} t_i \log y_i = -\log y_j$$

where j is the category that the sample x belongs to, i.e. the only position where the one-hot label is 1.
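
The collapse to a single term can be checked directly; the sketch below (with made-up logits) computes the full sum and the shortcut -log(y_j) and gets the same value.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5])
y = softmax(logits)            # output of the softmax layer, sums to 1
t = np.array([1.0, 0.0, 0.0])  # one-hot label, true class j = 0

full_definition = -np.sum(t * np.log(y))  # -sum_i t_i * log(y_i)
first_formula = -np.log(y[0])             # -log(y_j)
print(full_definition, first_formula)     # identical values
```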

Cross entropy with sigmoid as the output

If sigmoid is used as the output of the last layer, the output vector of that layer cannot be regarded as a single distribution, because its values do not add up to 1.

Instead, each neuron in the last layer should be regarded as its own distribution over two outcomes, with the corresponding target following a Bernoulli distribution (the value of the target is the probability of that class). The cross entropy of the i-th neuron is then:

$$L_i = -\big[\, t_i \log y_i + (1 - t_i) \log(1 - y_i) \,\big]$$

So the total cross-entropy loss function of the last layer is:

$$L = \sum_i L_i = -\sum_i \big[\, t_i \log y_i + (1 - t_i) \log(1 - y_i) \,\big]$$
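
A from-scratch sketch of this total loss (values made up): each output neuron contributes its own two-term cross entropy, and the layer's loss is the sum over neurons.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, -1.0, 0.5])  # one logit per output neuron
y = sigmoid(logits)                  # each value in (0, 1); no sum-to-1 constraint
t = np.array([1.0, 0.0, 1.0])        # binary target for each neuron

per_neuron = -(t * np.log(y) + (1 - t) * np.log(1 - y))  # L_i for each neuron
total = per_neuron.sum()                                 # loss of the whole layer
print(per_neuron, total)
```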

Origin blog.csdn.net/a493823882/article/details/104215395