PyTorch Neural Networks

These are my notes on writing simple neural networks in PyTorch, taken after working through jcjohnson's pytorch-examples.

PyTorch Module

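In PyTorch, the nn package defines a set of Modules, which serve as the basic layers of a neural network. A Module takes input Variables and computes output Variables. (Since PyTorch 0.4, Variable has been merged into Tensor, so plain Tensors can be used wherever Variables appear in these notes.)
Modules can contain other Modules (for example, torch.nn.Sequential can contain torch.nn.Linear), forming a tree. The Sequential container, Linear layers, Convolution layers, and Dropout layers commonly used in neural networks are all subclasses of Module.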
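The tree structure described above can be inspected directly. The following is a minimal sketch, assuming a small illustrative model whose layer sizes (4, 3, 2) are made up for the example and do not come from the original notes:

```python
import torch

# A container Module holding three child Modules (sizes are illustrative).
model = torch.nn.Sequential(
    torch.nn.Linear(4, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 2),
)

# named_modules() walks the tree: the container itself comes first (with an
# empty name), followed by its children, which Sequential names by position.
module_names = [name for name, _ in model.named_modules()]
print(module_names)  # ['', '0', '1', '2']

# The container and every layer inside it are instances of torch.nn.Module.
all_modules = all(isinstance(m, torch.nn.Module) for m in model.modules())
print(all_modules)  # True
```
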

PyTorch nn

Here we use PyTorch's nn package to implement a two-layer neural network.
First, define the dimensions of the variables.

import torch
from torch.autograd import Variable

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

Next, create randomly initialized Tensors to hold the inputs and outputs, and wrap them in Variables.

# Create random Tensors to hold inputs and outputs, and wrap them in Variables.
x = Variable(torch.randn(N, D_in))
y = Variable(torch.randn(N, D_out), requires_grad=False)

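Place the network's Modules in a torch.nn.Sequential container, which applies them in order. Each Linear Module computes its output from its input with a linear function and holds its weight and bias parameters as internal Variables.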

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Variables for its weight and bias.
model = torch.nn.Sequential(
  torch.nn.Linear(D_in, H),
  torch.nn.ReLU(),
  torch.nn.Linear(H, D_out),
)
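To see what the Sequential container actually holds, a short sketch like the following lists each parameter and its shape. It rebuilds the same model with the dimensions used in these notes; the variable names shapes and out are only for illustration:

```python
import torch

N, D_in, H, D_out = 64, 1000, 100, 10

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# Each Linear layer stores a weight of shape (out_features, in_features)
# and a bias of shape (out_features,). ReLU has no parameters.
shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# A batch of N inputs produces a batch of N outputs of size D_out.
out = model(torch.randn(N, D_in))
print(tuple(out.shape))  # (64, 10)
```
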

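Next we need a loss function to measure how well the model fits the data. The nn package provides several common loss functions; this example uses mean squared error (MSE), configured to sum the squared errors rather than average them.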

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
# reduction='sum' replaces the deprecated size_average=False argument.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4
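In current PyTorch releases the summed form of the loss is selected with reduction='sum', which corresponds to the older size_average=False argument. The sketch below, using made-up random tensors, checks that this loss equals the sum of squared differences and that the default 'mean' reduction divides that sum by the number of elements:

```python
import torch

torch.manual_seed(0)
y_pred = torch.randn(64, 10)
y = torch.randn(64, 10)

# Summed squared error, as used in these notes.
loss_sum = torch.nn.MSELoss(reduction='sum')(y_pred, y)

# Averaged squared error, which is the default reduction.
loss_mean = torch.nn.MSELoss(reduction='mean')(y_pred, y)

# The same quantity computed by hand.
manual_sum = ((y_pred - y) ** 2).sum()

print(loss_sum.item(), manual_sum.item())
print(loss_mean.item(), (manual_sum / y.numel()).item())
```

Because the summed loss is N * D_out times larger than the averaged one, its gradients are larger by the same factor, so a learning rate tuned for one reduction does not carry over to the other unchanged.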

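With that in place, we can train the parameters. Each iteration first computes the output y_pred from the input, then computes the MSE between the predicted and true values. It then zeroes the parameter gradients, computes the gradients of the learnable parameters with a backward pass, and updates each parameter by its gradient scaled by the learning rate.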

for t in range(500):
  # Forward pass: compute predicted y by passing x to the model. Module objects
  # override the __call__ operator so you can call them like functions. When
  # doing so you pass a Variable of input data to the Module and it produces
  # a Variable of output data.
  y_pred = model(x)

  # Compute and print loss. We pass Variables containing the predicted and true
  # values of y, and the loss function returns a Variable containing the loss.
  loss = loss_fn(y_pred, y)
  print(t, loss.item())

  # Zero the gradients before running the backward pass.
  model.zero_grad()

  # Backward pass: compute gradient of the loss with respect to all the learnable
  # parameters of the model. Internally, the parameters of each Module are stored
  # in Variables with requires_grad=True, so this call will compute gradients for
  # all learnable parameters in the model.
  loss.backward()

  # Update the weights using gradient descent. Each parameter is a Variable, so
  # we can access its data and gradients like we did before.
  for param in model.parameters():
    param.data -= learning_rate * param.grad.data
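The reason the gradients are zeroed before every backward pass is that backward() adds to the existing .grad buffers instead of overwriting them. A minimal sketch of that behavior, using a tiny hypothetical layer that is not part of the model above:

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(3, 1)  # illustrative layer, sizes chosen arbitrarily
x = torch.randn(5, 3)

# First backward pass: .grad holds the gradient of the scalar output sum.
layer(x).sum().backward()
first = layer.weight.grad.clone()

# Second backward pass without zeroing: the new gradient is added to the old.
layer(x).sum().backward()
accumulated = layer.weight.grad.clone()

# After zero_grad(), a backward pass yields the plain gradient again.
layer.zero_grad()
layer(x).sum().backward()
fresh = layer.weight.grad.clone()

print(torch.allclose(accumulated, 2 * first))  # True
print(torch.allclose(fresh, first))            # True
```
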

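The code above updates the learnable parameters by hand using the learning rate. PyTorch's optim package implements a number of optimization algorithms behind a common interface. The code below uses the Adam algorithm from optim to optimize the model.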

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
  # Forward pass: compute predicted y by passing x to the model.
  y_pred = model(x)

  # Compute and print loss.
  loss = loss_fn(y_pred, y)
  print(t, loss.item())

  # Before the backward pass, use the optimizer object to zero all of the
  # gradients for the variables it will update (which are the learnable weights
  # of the model)
  optimizer.zero_grad()

  # Backward pass: compute gradient of the loss with respect to model parameters
  loss.backward()

  # Calling the step function on an Optimizer makes an update to its parameters
  optimizer.step()
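To make the link between the hand-written update and an optimizer concrete, the sketch below compares one manual gradient-descent step with one step of torch.optim.SGD on two identical copies of a small model. It deliberately uses plain SGD rather than Adam, because Adam rescales each update with running gradient statistics and would not match the manual rule. The model size, data, and learning rate are assumptions made for the example:

```python
import copy
import torch

torch.manual_seed(0)
x = torch.randn(8, 4)
y = torch.randn(8, 2)
lr = 1e-2

model_a = torch.nn.Linear(4, 2)
model_b = copy.deepcopy(model_a)  # identical starting weights
loss_fn = torch.nn.MSELoss(reduction='sum')

# Manual update: param <- param - lr * grad, done outside of autograd.
loss_fn(model_a(x), y).backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= lr * p.grad

# The same step expressed through the optimizer interface.
optimizer = torch.optim.SGD(model_b.parameters(), lr=lr)
optimizer.zero_grad()
loss_fn(model_b(x), y).backward()
optimizer.step()

same = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same)  # True
```
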

Custom nn Modules

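Sometimes we want to define layers more complex than the existing ones. In that case we can write a subclass of nn.Module and define its forward method, which receives input Variables, performs arbitrary operations defined on Modules or Variables, and returns output Variables.
Here is an example of a custom Module.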

import torch
from torch.autograd import Variable

class TwoLayerNet(torch.nn.Module):
  def __init__(self, D_in, H, D_out):
    """
    In the constructor we instantiate two nn.Linear modules and assign them as
    member variables.
    """
    super(TwoLayerNet, self).__init__()
    self.linear1 = torch.nn.Linear(D_in, H)
    self.linear2 = torch.nn.Linear(H, D_out)

  def forward(self, x):
    """
    In the forward function we accept a Variable of input data and we must return
    a Variable of output data. We can use Modules defined in the constructor as
    well as arbitrary operators on Variables.
    """
    h_relu = self.linear1(x).clamp(min=0)
    y_pred = self.linear2(h_relu)
    return y_pred

TwoLayerNet holds two Linear layers, and its forward method applies ordinary operations defined on Linear and Variable.
TwoLayerNet can then be used to construct the model.

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs, and wrap them in Variables
x = Variable(torch.randn(N, D_in))
y = Variable(torch.randn(N, D_out), requires_grad=False)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
  # Forward pass: Compute predicted y by passing x to the model
  y_pred = model(x)

  # Compute and print loss
  loss = criterion(y_pred, y)
  print(t, loss.item())

  # Zero gradients, perform a backward pass, and update the weights.
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()
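Two details of the custom Module are easy to verify. Assigning an nn.Linear to an attribute in the constructor is enough for model.parameters() to find its weights, and clamp(min=0) in forward computes the same thing as a ReLU. The sketch below restates TwoLayerNet in compact form so that it runs on its own; the random input is an assumption for the example:

```python
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        # Submodules assigned as attributes are registered automatically.
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # clamp(min=0) zeroes negative activations, exactly like ReLU.
        return self.linear2(self.linear1(x).clamp(min=0))

torch.manual_seed(0)
model = TwoLayerNet(1000, 100, 10)

# The parameters of both Linear layers are visible to an optimizer.
param_names = [name for name, _ in model.named_parameters()]
print(param_names)

x = torch.randn(64, 1000)
hidden = model.linear1(x)
same_as_relu = torch.equal(hidden.clamp(min=0), torch.relu(hidden))
print(same_as_relu)  # True
```
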
