torch.nn.Parameter is a special type of tensor in PyTorch, mainly used to store the parameters of a neural network. Autograd computes gradients for these parameters automatically, and an optimizer that has been given them updates them on each step. A tensor wrapped in torch.nn.Parameter and assigned as an attribute of a module is automatically added to that module's parameter list.
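As a minimal sketch of the gradient-and-update cycle described above, the snippet below wraps a single scalar in nn.Parameter, hands it to an SGD optimizer, and takes one step. The loss function, learning rate, and variable names are illustrative assumptions, not taken from any particular model.

```python
import torch
from torch import nn

# A single trainable scalar; wrapping it in nn.Parameter makes autograd track it.
w = nn.Parameter(torch.tensor(2.0))

# The optimizer only updates the parameters it is explicitly given.
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w - 5.0) ** 2             # d(loss)/dw = 2 * (w - 5) = -6 at w = 2
loss.backward()                   # autograd fills in w.grad
grad_before_step = w.grad.item()

optimizer.step()                  # w <- w - lr * grad = 2 - 0.1 * (-6) = 2.6
updated_value = w.item()
```

In a real model the list passed to the optimizer is usually model.parameters(), which is why registration in the module matters.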
torch.nn.Parameter is a subclass of torch.Tensor, and its main function is to be used as a trainable parameter in an nn.Module. The difference between it and torch.Tensor is that a tensor created with nn.Parameter and assigned to a module attribute is automatically treated as a trainable parameter of that module, that is, it is added to the parameters() iterator, while a normal tensor in the module that is not wrapped in nn.Parameter() does not appear in parameters().
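The difference can be checked directly. The following is a small sketch with a hypothetical module, Demo, that holds one nn.Parameter and one plain tensor; the class and attribute names are made up for illustration.

```python
import torch
from torch import nn

class Demo(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(3))  # registered as a parameter
        self.offset = torch.zeros(3)              # plain tensor attribute, not registered

    def forward(self, x):
        return x * self.scale + self.offset

demo = Demo()
# Only the nn.Parameter attribute shows up in the parameter iterator.
param_names = [name for name, _ in demo.named_parameters()]
print(param_names)
```

Because offset is not registered, an optimizer built from demo.parameters() would never update it, and it is also absent from demo.state_dict().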
Note that the default value of the requires_grad attribute of an nn.Parameter object is True, that is, it is trainable, which is the opposite of the default for a torch.Tensor object.
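A quick way to confirm the two defaults is to compare the requires_grad flags side by side. This is only a sketch; the variable names are arbitrary, and the third line shows that the default can be overridden through the requires_grad argument of nn.Parameter.

```python
import torch
from torch import nn

plain = torch.zeros(2, 2)                                      # ordinary tensor
param = nn.Parameter(torch.zeros(2, 2))                        # trainable by default
frozen = nn.Parameter(torch.zeros(2, 2), requires_grad=False)  # explicitly frozen

print(plain.requires_grad, param.requires_grad, frozen.requires_grad)
```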
The built-in nn.Module classes also use nn.Parameter to create the parameters of each module. Take nn.Linear as an example:
import math
import torch
import torch.nn.functional as F
from torch.nn import Module, init
from torch.nn.parameter import Parameter
class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(N, *, H_{in})` where :math:`*` means any number of
          additional dimensions and :math:`H_{in} = \text{in\_features}`
        - Output: :math:`(N, *, H_{out})` where all but the last dimension
          are the same shape as the input and :math:`H_{out} = \text{out\_features}`.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias: the learnable bias of the module of shape :math:`(\text{out\_features})`.
            If :attr:`bias` is ``True``, the values are initialized from
            :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
            :math:`k = \frac{1}{\text{in\_features}}`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 30])
    """
    __constants__ = ['in_features', 'out_features']

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )
It can be seen that when Linear is initialized, both weight and bias are created with torch.nn.Parameter():
self.weight = torch.nn.Parameter(torch.Tensor(out_features, in_features))
self.bias = torch.nn.Parameter(torch.Tensor(out_features))
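The same thing can be observed from the outside on the built-in layer. The sketch below, using the 20-in, 30-out sizes from the docstring example, lists the registered parameters of nn.Linear with and without a bias; the variable names are illustrative.

```python
import torch
from torch import nn

with_bias = nn.Linear(20, 30)
no_bias = nn.Linear(20, 30, bias=False)

# Both weight and bias are nn.Parameter objects registered on the module.
shapes = {name: tuple(p.shape) for name, p in with_bias.named_parameters()}
print(shapes)

# With bias=False, the bias attribute still exists but is None and is not trainable.
no_bias_names = [name for name, _ in no_bias.named_parameters()]
print(no_bias_names, no_bias.bias)
```

This matches the register_parameter('bias', None) branch in the listing above: the attribute still exists, so forward() can pass self.bias to F.linear unconditionally, but nothing is added to parameters().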