Loss functions in PyTorch: the difference between CrossEntropyLoss and NLLLoss

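I have read many blog posts on this topic, and most of them start from entropy and information content. I recently ran into a problem while using these loss functions, so I wanted to find out exactly how PyTorch computes the cross-entropy loss, and the simplest way to do that is to start from the formulas and the code. The two functions are introduced directly below: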

1. NLLLoss

class torch.nn.NLLLoss(weight=None, size_average=True)

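Purpose: train a classifier with C classes
Parameters
- weight: optional; a 1-D tensor of length C (the number of classes) whose values are the per-class weights. This is very useful when the classes are imbalanced
- size_average: defaults to True, in which case the loss is averaged over the mini-batch; otherwise the per-sample losses are summed. (Recent PyTorch releases deprecate this flag in favor of the reduction argument.)
loss
- loss = nn.NLLLoss()
- loss(x, target) = -x[class], where class is the target index of the sample
Shapes of x and target
- x: (N, C), where N is the batch size and C is the number of classes
- target: a tensor of length N, with 0 <= target[i] <= C-1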
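
The selection rule loss(x, target) = -x[class], together with the effect of weight and size_average, can be sketched in plain Python. This is an illustrative re-implementation rather than PyTorch's own code: the name nll_loss_sketch and its arguments are hypothetical, each row of log_probs is assumed to already hold log-probabilities, and the weighted mean is assumed to divide by the sum of the weights of the selected classes, which is how PyTorch documents its mean reduction. The sample values are the LogSoftmax outputs printed later in this section.

```python
def nll_loss_sketch(log_probs, targets, weight=None, size_average=True):
    """Illustrative NLL loss over a batch (hypothetical helper, not PyTorch's code).

    log_probs: list of N rows, each with C log-probabilities
    targets:   list of N class indices, each in [0, C-1]
    weight:    optional list of C per-class weights
    """
    num_classes = len(log_probs[0])
    if weight is None:
        weight = [1.0] * num_classes

    total, weight_sum = 0.0, 0.0
    for row, cls in zip(log_probs, targets):
        if not 0 <= cls < num_classes:
            raise ValueError(f"target {cls} is outside [0, {num_classes - 1}]")
        total += -weight[cls] * row[cls]  # loss(x, class) = -weight[class] * x[class]
        weight_sum += weight[cls]

    # size_average=True: weighted mean over the batch; otherwise the plain sum
    return total / weight_sum if size_average else total


log_probs = [[-2.4029, -0.8432, -3.1688, -3.4668, -0.9016],
             [-5.9790, -0.8534, -1.4684, -1.9762, -1.5963],
             [-2.7078, -3.5222, -0.2015, -2.6018, -4.4122]]
targets = [1, 0, 4]

mean_loss = nll_loss_sketch(log_probs, targets)                    # averaged
sum_loss = nll_loss_sketch(log_probs, targets, size_average=False)  # summed
print(mean_loss, sum_loss)
```

With no weights the mean is simply the sum divided by the batch size, so the two results differ by a factor of N = 3.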

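import torch
import torch.nn as nn

m = nn.LogSoftmax(dim=1)  # log-softmax over the class dimension
s = nn.Softmax(dim=1)     # plain softmax, used below only for inspection
loss = nn.NLLLoss()

# input is of size N x C = 3 x 5
input = torch.randn(3, 5, requires_grad=True)
# each element in target has to have 0 <= value < C
target = torch.tensor([1, 0, 4])
output = loss(m(input), target)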

# raw input data
print('input:',input)
input: tensor([[-0.0771,  1.4827, -0.8429, -1.1410,  1.4243],
        [-2.8930,  2.2326,  1.6176,  1.1098,  1.4897],
        [-0.9037, -1.7182,  1.6025, -0.7977, -2.6082]], requires_grad=True)

# after softmax
print('softmax:',s(input))
softmax: tensor([[0.0905, 0.4304, 0.0421, 0.0312, 0.4059],
        [0.0025, 0.4259, 0.2303, 0.1386, 0.2026],
        [0.0667, 0.0295, 0.8175, 0.0741, 0.0121]], grad_fn=<SoftmaxBackward>)

# take the log of the softmax result
print('log:',torch.log(s(input)))
log: tensor([[-2.4029, -0.8432, -3.1688, -3.4668, -0.9016],
        [-5.9790, -0.8534, -1.4684, -1.9762, -1.5963],
        [-2.7078, -3.5222, -0.2015, -2.6018, -4.4122]], grad_fn=<LogBackward>)

# result of LogSoftmax (identical to the log of the softmax above)
print('input_log:',m(input))

input_log: tensor([[-2.4029, -0.8432, -3.1688, -3.4668, -0.9016],
        [-5.9790, -0.8534, -1.4684, -1.9762, -1.5963],
        [-2.7078, -3.5222, -0.2015, -2.6018, -4.4122]],

# result of the loss function
print('output:',output)
output: tensor(3.7448, grad_fn=<NllLossBackward>)
# negate the three selected values and average them: (0.8432 + 5.9790 + 4.4122) / 3 = 3.7448

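The data flow for this loss is: raw input -> softmax -> log -> loss computation. Note that NLLLoss itself applies neither the softmax nor the log; here both are done by the LogSoftmax module m before the loss is called. The loss then picks, for each sample, the entry of input_log at the target class, negates it, sums these values, and divides by the mini-batch size.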
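
As a check on this flow, the printed numbers can be reproduced without PyTorch. The sketch below is plain Python and only an illustration: the helper log_softmax is written here for the purpose and uses the usual log-sum-exp formulation, and the raw scores are copied from the printed input, so the result matches the reported loss only up to the four-decimal rounding of that printout.

```python
import math


def log_softmax(row):
    """Log-softmax of one row of raw scores, computed via log-sum-exp."""
    row_max = max(row)  # subtract the max for numerical stability
    log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
    return [v - log_sum_exp for v in row]


# Raw scores and targets copied from the printed example
raw_input = [[-0.0771, 1.4827, -0.8429, -1.1410, 1.4243],
             [-2.8930, 2.2326, 1.6176, 1.1098, 1.4897],
             [-0.9037, -1.7182, 1.6025, -0.7977, -2.6082]]
target = [1, 0, 4]

input_log = [log_softmax(row) for row in raw_input]         # softmax -> log
picked = [row[cls] for row, cls in zip(input_log, target)]  # entry at the target class
loss_value = -sum(picked) / len(picked)                     # negate, sum, divide by N
print(picked, loss_value)  # loss_value is approximately 3.7448
```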

One point worth noting: log in Python (math.log as well as torch.log) is the natural logarithm, $\log x = \log_e x = \ln x$
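
A quick way to confirm the base is to evaluate a few logarithms with the standard math module. This is only a small illustration; the variable names are chosen here for readability.

```python
import math

natural = math.log(math.e)     # natural logarithm: ln(e)
base_10 = math.log10(100.0)    # base-10 logarithm has its own function
explicit = math.log(8.0, 2.0)  # an optional second argument selects the base

print(natural, base_10, explicit)
```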

2. CrossEntropyLoss

class torch.nn.CrossEntropyLoss(weight=None, size_average=True)

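Purpose: train a multi-class classifier
Essence: combines LogSoftmax and NLLLoss in a single class
Parameters: same as above
Loss
- loss = nn.CrossEntropyLoss()
Shapes of x and target: same as above; x holds raw, unnormalized scores rather than log-probabilities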

import torch.nn as nn
import torch
import torch.nn.functional as F


loss_cross = nn.CrossEntropyLoss()

# input is of size N x C = 3 x 5
input = torch.randn(3, 5, requires_grad=True)
# each element in target has to have 0 <= value < C
target = torch.tensor([1, 0, 4])
loss = loss_cross(input,target)

# loss: tensor(3.7448, grad_fn=<NllLossBackward>) when input holds the same values as in section 1; torch.randn draws new values on every run
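
Why the two losses agree can be seen by writing LogSoftmax followed by NLLLoss as a single expression. The derivation below is a sketch for one sample with raw scores $x$ and target class $y$, using the rule loss(x, class) = -x[class] from section 1; the symbols $\ell$ and $L$ are introduced here only for illustration.

```latex
\ell(x, y)
  = -\log\frac{\exp(x_y)}{\sum_{j=0}^{C-1}\exp(x_j)}
  = -x_y + \log\sum_{j=0}^{C-1}\exp(x_j),
\qquad
L = \frac{1}{N}\sum_{n=1}^{N}\ell(x_n, y_n)
```

The first form is NLLLoss applied to the LogSoftmax output, and the second is the same quantity computed directly from the raw scores, which is what CrossEntropyLoss returns when averaging over the mini-batch.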

3. The difference between torch.nn and torch.nn.functional

torch.nn.functional.nll_loss(input, target, weight=None, size_average=True)
class torch.nn.NLLLoss(weight=None, size_average=True)

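The difference is simple: the functional version can be called directly, while the class version must first be instantiated and the resulting object is then called. Either way, the result is the same.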
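
The two calling styles can be compared side by side. The snippet below is a minimal sketch that assumes a reasonably recent PyTorch installation; the fixed seed and the variable names are choices made here so the comparison is repeatable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the comparison is repeatable

log_probs = F.log_softmax(torch.randn(3, 5), dim=1)
target = torch.tensor([1, 0, 4])

functional_loss = F.nll_loss(log_probs, target)  # functional form: call directly
criterion = nn.NLLLoss()                         # module form: construct an object first
module_loss = criterion(log_probs, target)       # then call the object

# The same pattern holds for cross-entropy on raw scores
logits = torch.randn(3, 5)
ce_functional = F.cross_entropy(logits, target)
ce_module = nn.CrossEntropyLoss()(logits, target)

print(functional_loss, module_loss, ce_functional, ce_module)
```

The module form is convenient when the loss is configured once (for example with class weights) and reused across a training loop, while the functional form suits one-off calls.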

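Summary:

	Difference	In common
CrossEntropyLoss	Applies LogSoftmax to the input internally	Same result
NLLLoss	Expects input that has already been through LogSoftmax	Same result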
