Computing second-order derivatives with torch.autograd.grad

1 Usage

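In PyTorch, torch.autograd.grad computes and returns the sum of gradients of the outputs with respect to the inputs. Its signature and the role of each parameter are as follows: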

torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, ..., allow_unused=None) $\longrightarrow$ Tuple[Tensor, ...]

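outputs (sequence of Tensor): the outputs of the function being differentiated.

inputs (sequence of Tensor): the inputs with respect to which the gradients are computed.

grad_outputs (sequence of Tensor): the "vector" in the vector-Jacobian product. It can be omitted when the output is a scalar.

retain_graph (bool, optional): if False, the computation graph is freed once the gradients have been computed. Defaults to the value of create_graph. It must be True whenever the same graph is differentiated again, as happens when computing second-order derivatives.

create_graph (bool, optional): if True, the graph of the gradient computation itself is constructed, so the returned gradients can be differentiated again. Set it to True when computing higher-order derivatives or when the gradients feed into further differentiable computation.

allow_unused (bool, optional): if False, passing an input that was not used to compute the outputs raises an error. If True, the gradient returned for such an input is None.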
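The parameters above can be illustrated with a minimal sketch. The tensors and values here (`v`, `a`, `unused`) are made up for illustration and do not come from the worked example in this post; only `grad_outputs`, `create_graph`, and `allow_unused` are exercised.

```python
import torch

# grad_outputs supplies the vector v in the vector-Jacobian product v^T J.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                                  # Jacobian dy/dx is diag(2x)
v = torch.tensor([1.0, 0.5, 0.0])           # illustrative weights
(vjp,) = torch.autograd.grad(y, x, grad_outputs=v)   # equals v * 2x

# create_graph=True keeps the returned gradient differentiable,
# so it can be passed to torch.autograd.grad a second time.
a = torch.tensor(2.0, requires_grad=True)
b = a ** 3
(db,) = torch.autograd.grad(b, a, create_graph=True)  # 3 * a^2
(d2b,) = torch.autograd.grad(db, a)                   # 6 * a

# allow_unused=True returns None for an input the output does not depend on,
# instead of raising an error.
unused = torch.tensor(1.0, requires_grad=True)
g_a, g_unused = torch.autograd.grad(a * 2, (a, unused), allow_unused=True)
```

Note that `torch.autograd.grad` always returns a tuple, even for a single input, which is why the results are unpacked with `(vjp,) = ...`.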

2 Worked example

The following is a concrete mathematical example in which a loss involving derivatives is differentiated analytically.

Given a vector ${\bf{x}}=(x_1,x_2)^{\top}$, define the vector ${\bf{y}}=(y_1,y_2)^{\top}=(x^2_1,x^2_2)^{\top}$. Averaging the elements of ${\bf{y}}$ gives the loss function $\mathrm{loss}_1$:

$\mathrm{loss}_1({\bf{x}})=\mathrm{mean}({\bf{y}})=\frac{x_1^2+x^2_2}{2}$

Next, differentiate each component of ${\bf{y}}$ with respect to ${\bf{x}}$, subtract the second gradient from the first, and average the elements of the result to obtain the loss function $\mathrm{loss}_2$:

$\left\{\begin{aligned}h_1({\bf{x}})&=\frac{\partial y_1}{\partial {\bf{x}}}=(2x_1,0)^{\top}\\h_2({\bf{x}})&=\frac{\partial y_2}{\partial {\bf{x}}}=(0,2x_2)^{\top}\end{aligned}\right.,\quad \mathrm{loss}_2({\bf{x}})=\mathrm{mean}(h_1({\bf{x}})-h_2({\bf{x}}))=\frac{2x_1-2x_2}{2}=x_1-x_2$

Adding $\mathrm{loss}_1$ and $\mathrm{loss}_2$ gives

$\mathrm{loss}({\bf{x}})=\mathrm{loss}_1({\bf{x}})+\mathrm{loss}_2({\bf{x}})=\frac{x_1^2+x_2^2}{2}+x_1-x_2$

Finally, the gradient of $\mathrm{loss}$ with respect to ${\bf{x}}$ is

$\frac{\partial {\mathrm{loss}}}{\partial{ {\bf{x}}}}=(x_1+1,x_2-1)^{\top}$
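As a sanity check on the closed form, the analytic gradient can be compared against a central finite difference in plain Python, with no autograd involved. This is only a sketch: the helper names `loss`, `analytic_grad`, and `numeric_grad` and the step size `eps` are illustrative choices, not part of the original post.

```python
def loss(x1, x2):
    # Closed form from the derivation: (x1^2 + x2^2) / 2 + x1 - x2
    return (x1 ** 2 + x2 ** 2) / 2 + x1 - x2


def analytic_grad(x1, x2):
    # Gradient derived by hand: (x1 + 1, x2 - 1)
    return (x1 + 1, x2 - 1)


def numeric_grad(x1, x2, eps=1e-6):
    # Central differences along each coordinate (eps is an assumed step size)
    d1 = (loss(x1 + eps, x2) - loss(x1 - eps, x2)) / (2 * eps)
    d2 = (loss(x1, x2 + eps) - loss(x1, x2 - eps)) / (2 * eps)
    return (d1, d2)


exact = analytic_grad(5.0, 7.0)
approx = numeric_grad(5.0, 7.0)
```

At $(5,7)^{\top}$ the two results should agree up to floating-point error.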

The corresponding PyTorch implementation is:

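import torch

x = torch.tensor([5.0, 7.0], requires_grad=True)
y = x**2

# loss1 = (x1^2 + x2^2) / 2
loss1 = torch.mean(y)

# create_graph=True keeps h1 and h2 differentiable with respect to x;
# torch.autograd.grad returns a tuple, hence the [0] below.
h1 = torch.autograd.grad(y[0], x, retain_graph = True, create_graph=True)
h2 = torch.autograd.grad(y[1], x, retain_graph = True, create_graph=True)
# loss2 = mean((2*x1, 0) - (0, 2*x2)) = x1 - x2
loss2 = torch.mean(h1[0] - h2[0])

loss = loss1 + loss2

# Differentiating loss backpropagates through h1 and h2 as well
result = torch.autograd.grad(loss, x)
print(result)  # (tensor([6., 6.]),)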

When ${\bf{x}}=(5,7)^{\top}$, the analytic gradient is $(x_1+1,x_2-1)^{\top}=(6,6)^{\top}$, and running the code also gives $(6, 6)$.
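To check that the agreement is not specific to $(5,7)^{\top}$, the same computation can be wrapped in a function and compared with the analytic gradient at a few random points. This is a sketch under assumptions: the function name `grad_of_loss`, the seed, and the number of trials are illustrative and not from the original post.

```python
import torch


def grad_of_loss(x):
    """Repeat the computation above for an arbitrary 2-element input."""
    x = x.clone().detach().requires_grad_(True)
    y = x ** 2
    loss1 = y.mean()
    # With create_graph=True, retain_graph defaults to True as well,
    # so the graph of y survives both calls.
    (h1,) = torch.autograd.grad(y[0], x, create_graph=True)
    (h2,) = torch.autograd.grad(y[1], x, create_graph=True)
    loss2 = (h1 - h2).mean()
    (g,) = torch.autograd.grad(loss1 + loss2, x)
    return g


torch.manual_seed(0)  # assumed seed, only for reproducibility
for _ in range(5):
    point = torch.randn(2)
    expected = point + torch.tensor([1.0, -1.0])  # (x1 + 1, x2 - 1)
    assert torch.allclose(grad_of_loss(point), expected)
```

The loop passes silently if the autograd result matches $(x_1+1,x_2-1)^{\top}$ at every sampled point.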
