Getting Started with PyTorch via the LeNet-5 CNN Model

1. PyTorch Setup

It is best to give PyTorch its own environment, separate from the environments of other deep learning frameworks. For installation details, the official website is the most complete reference:
https://pytorch.org/get-started/previous-versions/
Note: install the torchvision package along with it.

1.1 PyTorch Features

  1. Strong GPU acceleration;
  2. Dynamically defined computation graphs;
  3. Great flexibility and speed.

1.2 Testing the PyTorch Installation

For installation, follow the guide on the PyTorch website.
Test:

import torch
torch.cuda.is_available()        # True if a CUDA GPU can be used
torch.cuda.get_device_name(0)    # name of GPU 0
torch.rand(3, 3).cuda()          # create a tensor and move it to the GPU

2. Complete Code

The complete code is adapted from two sources; thanks to both authors:

  • Model and training
    GitHub link: https://github.com/ChawDoe/LeNet5-MNIST-PyTorch
  • Testing
    https://blog.csdn.net/u014453898/article/details/90707987

2.1 LeNet Model

Forward propagation to predict the classification result.

#model.py
from torch.nn import Module
from torch import nn


class Model(Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)   # 1 input channel, 6 output channels, 5x5 kernel
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(256, 120)    # 256 = 16 channels * 4 * 4 feature maps for a 28x28 input
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(120, 84)
        self.relu4 = nn.ReLU()
        self.fc3 = nn.Linear(84, 10)
        self.relu5 = nn.ReLU()

    def forward(self, x):
        y = self.conv1(x)
        y = self.relu1(y)
        y = self.pool1(y)
        y = self.conv2(y)
        y = self.relu2(y)
        y = self.pool2(y)
        y = y.view(y.shape[0], -1)
        y = self.fc1(y)
        y = self.relu3(y)
        y = self.fc2(y)
        y = self.relu4(y)
        y = self.fc3(y)
        y = self.relu5(y)
        return y
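
A quick dimension check (a minimal sketch, not part of the original sources): feeding one dummy 28x28 image through the model confirms that the flattened feature size is 256 (16 channels of 4x4 maps) and that the output holds 10 class scores.

#check_model.py (illustrative)
import torch
from model import Model

model = Model()
dummy = torch.zeros(1, 1, 28, 28)   # one MNIST-sized grayscale image
print(model(dummy).shape)           # torch.Size([1, 10])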

2.2 Training

  • Backpropagation, iteratively updating the parameters
  • After each epoch, measure the accuracy on the test set
  • Save the model at that accuracy to a .pt file
#train.py
from model import Model
import torch
from torchvision.datasets import mnist
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor

if __name__ == '__main__':
    batch_size = 256
    train_dataset = mnist.MNIST(root="../src/pytorch/LeNet5-MNIST-PyTorch-master/train",
                                train=True,
                                transform=ToTensor(),
                                download=True)
    test_dataset = mnist.MNIST(root='../src/pytorch/LeNet5-MNIST-PyTorch-master/test',
                               train=False,
                               transform=ToTensor(),
                               download=True)
    train_loader = DataLoader(train_dataset, batch_size=batch_size)
    test_loader = DataLoader(test_dataset, batch_size=batch_size)
    model = Model()
    sgd = SGD(model.parameters(), lr=1e-1)
    cross_error = CrossEntropyLoss()
    epoch = 100

    for _epoch in range(epoch):
        #Backpropagation: iteratively update the parameters
        for idx, (train_x, train_label) in enumerate(train_loader):
            sgd.zero_grad()
            predict_y = model(train_x.float())
            _error = cross_error(predict_y, train_label.long())
            if idx % 10 == 0:
                print('idx: {}, _error: {}'.format(idx, _error))
            _error.backward()
            sgd.step()

        correct = 0
        _sum = 0
        #Measure accuracy on the test set after this epoch
        for idx, (test_x, test_label) in enumerate(test_loader):
            predict_y = model(test_x.float()).detach()
            predict_ys = torch.argmax(predict_y, dim=-1)
            correct += (predict_ys == test_label).sum().item()
            _sum += test_label.shape[0]

        accuracy = correct / _sum
        print('accuracy: {:.2f}'.format(accuracy))
        #Save the model at this accuracy to a .pt file
        torch.save(model, '../src/pytorch/LeNet5-MNIST-PyTorch-master/models/mnist_{:.2f}.pt'.format(accuracy))

2.3 Testing

  • Load the model file (optionally choose CPU or GPU with torch.device() and to())
    model = torch.load('../src/pytorch/LeNet5-MNIST-PyTorch-master/models/mnist_0.90.pt')
    Optional:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)
  • Preprocess the test image
    The image must be converted into the data format the model expects. PyTorch provides transform classes for this, e.g. torchvision.transforms.ToTensor and torchvision.transforms.Normalize.
    (1) ToTensor converts PIL image data into a PyTorch Tensor.
    (2) Normalize standardizes the resulting tensor with a mean and standard deviation.
    The dimensions must also match the model. The unsqueeze method is used here: the transformed image is (1, 28, 28) and needs an extra batch_size dimension, giving (1, 1, 28, 28).
  • Feed the processed image into the trained model to get the classification result
    The model output is a PyTorch tensor and can be converted to numpy. Optional operations:
    (1) torch.nn.functional.softmax
    (2) torch.autograd.Variable
    (3) torch.Tensor.numpy
#test.py
import torch
import cv2
from PIL import Image
import os
import torch.nn.functional as F
from torch.autograd import Variable
from torchvision import datasets, transforms
import numpy as np
 
if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = torch.load('../src/pytorch/LeNet5-MNIST-PyTorch-master/models/mnist_0.99.pt') #load the model once
    model = model.to(device)
    model.eval()    #switch the model to evaluation mode

    test_path = "../datasets/MNIST_PNG_Data/test/"
    for num in range(0, 10):
        file_path = os.path.join(test_path, str(num))
        print('file_path: ', file_path)
        acc_count = 0
        sample_count = 0
        for file in os.listdir(file_path):
            file_name = os.path.join(file_path, file)
        
            # img = cv2.imread("00030.png")  #read the image to classify
            img = Image.open(file_name)
            trans = transforms.Compose(
                [
                    transforms.ToTensor(),
                    transforms.Normalize((0.1307,), (0.3081,))
                ])
        
            # img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  #convert to grayscale, since MNIST images are grayscale
            img = trans(img)
            img = img.to(device)
            img = img.unsqueeze(0)  #add one dimension: the saved model expects 4-D input [batch_size, channels, height, width], while a plain image is 3-D [channels, height, width]
            #after unsqueeze the shape is [1, 1, 28, 28]
            output = model(img)
            #output:  tensor([[ 0.0000, 32.1585, 41.3743,  0.0000,  0.0000,  2.3812,  0.0000,  7.8895,
            #   0.0000,  0.0000]], grad_fn=<ReluBackward0>)
            prob = F.softmax(output, dim=1)
            #prob after output softmax:  tensor([[1.0748e-18, 9.9439e-05, 9.9990e-01, 1.0748e-18, 1.0748e-18, 1.1626e- 
            #   17,1.0748e-18, 2.8688e-15, 1.0748e-18, 1.0748e-18]],grad_fn=<SoftmaxBackward>)
            prob = Variable(prob)
            #prob after variable prob:  tensor([[1.0748e-18, 9.9439e-05, 9.9990e-01, 1.0748e-18, 1.0748e-18, 1.1626e-
            #   17,1.0748e-18, 2.8688e-15, 1.0748e-18, 1.0748e-18]])
            prob = prob.cpu().numpy()  #parameters of a model trained on the GPU live on the GPU; move the tensor to the CPU first, then convert it to numpy
            # print(prob)  #prob holds the probabilities of the 10 classes
            pred = np.argmax(prob) #pick the class with the highest probability
            sample_count += 1
            if pred.item() == num:
                acc_count += 1
        print("{} count: {}, classify error count: {}, accuracy: {:.6f}".format(num, sample_count,acc_count, acc_count/sample_count))

Printed test results: the test pass runs very quickly, which demonstrates the speed highlighted above and is a clear contrast with TensorFlow in this respect.

file_path:  ../datasets/MNIST_PNG_Data/test/0
0 count: 100, correct count: 100, accuracy: 1.000000
file_path:  ../datasets/MNIST_PNG_Data/test/1
1 count: 100, correct count: 99, accuracy: 0.990000
file_path:  ../datasets/MNIST_PNG_Data/test/2
2 count: 1032, correct count: 1028, accuracy: 0.996124
file_path:  ../datasets/MNIST_PNG_Data/test/3
3 count: 1010, correct count: 997, accuracy: 0.987129
file_path:  ../datasets/MNIST_PNG_Data/test/4
4 count: 982, correct count: 974, accuracy: 0.991853
file_path:  ../datasets/MNIST_PNG_Data/test/5
5 count: 892, correct count: 881, accuracy: 0.987668
file_path:  ../datasets/MNIST_PNG_Data/test/6
6 count: 958, correct count: 946, accuracy: 0.987474
file_path:  ../datasets/MNIST_PNG_Data/test/7
7 count: 1028, correct count: 992, accuracy: 0.964981
file_path:  ../datasets/MNIST_PNG_Data/test/8
8 count: 974, correct count: 963, accuracy: 0.988706
file_path:  ../datasets/MNIST_PNG_Data/test/9
9 count: 1009, correct count: 984, accuracy: 0.975223

3. Understanding PyTorch

Using the MNIST LeNet-5 example above, we can study PyTorch's basic syntax and related APIs.
When training a neural network, the typical PyTorch steps are:
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()

  • Zero the model gradients with zero_grad()
  • Backpropagate the loss with backward()
  • Update the model parameters with step()

3.1 CNN APIs

This model uses the 2D convolution Conv2d, the ReLU activation, the MaxPool2d pooling layer, the fully connected Linear layer, and so on. Note also that the Module class is the base class for all neural network modules:

    r"""Base class for all neural network modules.

    Your models should also subclass this class.

    Modules can also contain other Modules, allowing to nest them in
    a tree structure. You can assign the submodules as regular attributes::

        import torch.nn as nn
        import torch.nn.functional as F

        class Model(nn.Module):
            def __init__(self):
                super(Model, self).__init__()
                self.conv1 = nn.Conv2d(1, 20, 5)
                self.conv2 = nn.Conv2d(20, 20, 5)

            def forward(self, x):
                x = F.relu(self.conv1(x))
                return F.relu(self.conv2(x))

    Submodules assigned in this way will be registered, and will have their
    parameters converted too when you call :meth:`to`, etc.
    """

3.1.1 2D Convolution: Conv2d

One difference between the PyTorch and TensorFlow 2D convolution APIs:
in PyTorch the layer itself creates (and randomly initializes) the weight and bias parameters, whereas with TensorFlow's low-level API the developer has to define weight and bias explicitly.

  • Example
from __future__ import print_function
from torch import nn
import torch

x = torch.tensor([
    [1, 1, 3],
    [1, 1, 4],
    [1, -1, 0]
], dtype=torch.float)
print("x: \n", x)

x = x.reshape(1, 1, 3, 3)

conv = nn.Conv2d(1,1,2)

print(conv(x))
  • Output
x:
 tensor([[ 1.,  1.,  3.],
        [ 1.,  1.,  4.],
        [ 1., -1.,  0.]])
peter self.weight:
 Parameter containing:
tensor([[[[ 0.4257,  0.0517],
          [-0.3757, -0.2100]]]], requires_grad=True)
peter self.bias:
 Parameter containing:
tensor([0.1355], requires_grad=True)
tensor([[[[ 0.0272, -0.4993],
          [ 0.4472,  1.1438]]]], grad_fn=<MkldnnConvolutionBackward>)

The weight and bias lines in the output come from print statements added to the PyTorch source code (see the appendix below). The first element of the convolution checks out: 0.4257 + 0.0517 - 0.3757 - 0.2100 + bias 0.1355 = 0.0272.
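
As a side note (a minimal sketch using nn.Conv2d's public attributes, not part of the original post), the same weight and bias can also be inspected without modifying the PyTorch source, because every Conv2d layer exposes them as conv.weight and conv.bias:

from torch import nn

conv = nn.Conv2d(1, 1, 2)
print("weight:\n", conv.weight)   # Parameter of shape (1, 1, 2, 2)
print("bias:\n", conv.bias)       # Parameter of shape (1,)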

  • Appendix: source code modification
    Modify \lib\site-packages\torch\nn\modules\conv.py as follows:
from __future__ import print_function #import this at the top of the file
#print weight and bias inside the following two methods of the Conv2d class
    def _conv_forward(self, input, weight):
        print("peter self.bias: \n", self.bias)
        if self.padding_mode != 'zeros':
            return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
                            weight, self.bias, self.stride,
                            _pair(0), self.dilation, self.groups)
        return F.conv2d(input, weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

    def forward(self, input):
        print("peter self.weight: \n", self.weight)
        return self._conv_forward(input, self.weight)

Reason for the modification: Conv2d inherits from the _ConvNd class, whose constructor defines the weight and bias attributes. The relevant snippet:

        if transposed:
            self.weight = Parameter(torch.Tensor(
                in_channels, out_channels // groups, *kernel_size))
        else:
            self.weight = Parameter(torch.Tensor(
                out_channels, in_channels // groups, *kernel_size))
        if bias:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)

3.1.2 Activation Function: ReLU

ReLU is simply max(0, x), so it needs little introduction; the example from the source code docs is enough.

m = nn.ReLU()
input = torch.randn(2)
print("input: \n", input)
output = m(input)
print("\n relu output: \n", output)
PS H:\work\src> & D:/Anaconda/envs/pytorch36/python.exe h:/work/src/pytorch/basic/net.py
input:
 tensor([ 0.2909, -1.8363])

 relu output:
 tensor([0.2909, 0.0000])

3.1.3 Pooling Layer: MaxPool2d

Example: kernel_size 2x2, stride 1

x = torch.tensor([
    [1, 1, 3, 2],
    [1, 1, 4, 3],
    [1, -1, 0, -2],
    [2, -2, -1, 4]
], dtype=torch.float)
print("x: \n", x)
x = x.reshape(1, 1, 4, 4)
conv = nn.Conv2d(1,1,2)
convx = conv(x)
print(convx)
maxpool = nn.MaxPool2d(2, padding = 0, stride=1)
print("the maxpoll2d of input convx: \n", maxpool(convx))


3.1.4 Fully Connected Layer: Linear

The input is x, the output is y, the weight is A, and the bias is b; Linear performs the matrix operation y = xA^T + b.

# 3. Linear
x = torch.tensor([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],

], dtype=torch.float)
x = x.reshape(1, 1, 3, 3)
print("x: \n", x)
linear = nn.Linear(3,2)
print("linear: \n",linear(x))


3.1.5 The Module Base Class

It has many attributes and methods.

  • The parameters method
    Returns an iterator over module parameters. The related named_parameters method yields both the name of each parameter and the parameter itself.
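
For example (a minimal sketch using the Model class from model.py above), listing the registered parameters and their shapes:

model = Model()
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
# conv1.weight (6, 1, 5, 5)
# conv1.bias (6,)
# ...
# fc3.weight (10, 84)
# fc3.bias (10,)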

3.2 Basic APIs

For example view, which changes a Tensor's shape, among others.

3.2.1 Changing a Tensor's Shape with view

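A minimal sketch of view (not from the original post): view returns a tensor that shares the same data but has a different shape; the total number of elements must stay the same. The LeNet forward pass uses it as y.view(y.shape[0], -1) to flatten the feature maps before the fully connected layers.

import torch

x = torch.arange(16)
print(x.view(4, 4).shape)    # torch.Size([4, 4])
print(x.view(2, -1).shape)   # torch.Size([2, 8]); -1 lets PyTorch infer that dimension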

3.2.2 torch.optim.SGD

The interface for stochastic gradient descent; momentum-based updates are optional.
Its arguments are listed below; the required ones are the iterable of model parameters (params) and the learning rate (lr).

    Args:
        params (iterable): iterable of parameters to optimize or dicts defining
            parameter groups
        lr (float): learning rate
        momentum (float, optional): momentum factor (default: 0)
        weight_decay (float, optional): weight decay (L2 penalty) (default: 0)
        dampening (float, optional): dampening for momentum (default: 0)
        nesterov (bool, optional): enables Nesterov momentum (default: False)

Basic usage example:
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
model is an instance of the model class and optimizer is an SGD object: call the zero_grad method to clear the gradients, run the forward pass and call backward() on the loss, and finally call the step method to update the parameters.

3.2.3 torch.nn.CrossEntropyLoss

Principle: the Softmax function combined with the cross-entropy loss, i.e. for raw scores x and target class class:
loss(x, class) = -log( exp(x[class]) / sum_j exp(x[j]) ) = -x[class] + log( sum_j exp(x[j]) )
The loss is differentiated with respect to the model parameters, and the values that minimize it are found via backpropagation (backward).
Basic usage example:
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randint(5, (3,), dtype=torch.int64)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
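
A small numeric check (illustrative, not from the original post) that cross_entropy on raw logits equals the negative log-softmax of the target class:

>>> logits = torch.tensor([[2.0, 0.5, 0.1]])
>>> target = torch.tensor([0])
>>> torch.allclose(F.cross_entropy(logits, target), -torch.log_softmax(logits, dim=1)[0, 0])
True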

3.2.4 Saving a Model: torch.save

Saves an object, such as a model object including its parameters, to a file; by convention the file extension is .pt or .pth. As the signature below shows, the serialization is pickle-based.

def save(obj, f, pickle_module=pickle, pickle_protocol=DEFAULT_PROTOCOL, _use_new_zipfile_serialization=False)

After training, the updated parameters live in the model object model, which is then saved:
torch.save(model, '../src/pytorch/LeNet5-MNIST-PyTorch-master/models/mnist_{:.2f}.pt'.format(accuracy))
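
Saving the whole model object like this pickles the model class as well. A common alternative (a sketch, not used in the original code; the file name lenet_state.pt is just an illustration) is to save only the state_dict and rebuild the model before loading it:

# save only the parameters
torch.save(model.state_dict(), 'lenet_state.pt')

# load them back into a freshly constructed model (Model is the class from model.py)
model = Model()
model.load_state_dict(torch.load('lenet_state.pt'))
model.eval()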

3.2.5 torch.Tensor.detach

The official explanation says it well:
Returns a new Tensor, detached from the current graph. The result will never require gradient.
When evaluating the model, it is best to detach the predictions from the current computation graph:
predict_y = model(test_x.float()).detach()
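
A tiny illustration (not from the original post) of what detach does:

x = torch.ones(2, requires_grad=True)
y = x * 2
print(y.requires_grad)            # True: y is part of the autograd graph
print(y.detach().requires_grad)   # False: the detached tensor is not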


Reposted from blog.csdn.net/duanyuwangyuyan/article/details/109114931