[PyTorch Programming] Getting Started with PyTorch: Basic Concepts and a First Hands-On Example

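This article is part of the PyTorch Programming series on my blog. For Python environment setup, see "[Learning Python] Installing Anaconda and Managing Python Environments on Windows 10" or "[Learning Python] Installing Anaconda and Managing Python Environments from the Terminal".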

Author: Chen Yirong (陈艺荣)
Code environment: Python 3.6, PyTorch 1.4.0, Jupyter Notebook

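Environment Setup

Before setting up a PyTorch environment, pin down your development conditions:

  • Operating system: Linux, Mac, Windows
  • Package manager: Conda, Pip, …
  • Language: Python, C++, Java
  • Compute resources: whether a GPU is available, the GPU model, the NVIDIA driver version, and the CUDA version

These conditions differ from person to person.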

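In my case, I work on a server running Ubuntu 18.04 with Conda as the package manager. The GPU is a GeForce RTX 2080 Ti, the NVIDIA driver version is 495.46, and the highest CUDA version that driver supports is 11.5.

The operating system and the GPU model are usually fixed. The GPU model determines which NVIDIA drivers can be used, and the driver version determines which CUDA versions are supported. You can look up a suitable driver at https://www.nvidia.cn/geforce/drivers/ and follow "[Learning Python] Installing CUDA and cuDNN from Scratch on Ubuntu 18.04" to configure it.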
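
The last link in that chain (driver version to usable CUDA version) comes down to a version comparison. Here is a minimal sketch, assuming version strings of the form "major.minor". The helper names parse_version and cudatoolkit_is_supported are hypothetical and are not part of any NVIDIA or PyTorch tooling:

```python
def parse_version(text):
    """Turn a 'major.minor' string such as '11.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cudatoolkit_is_supported(driver_max_cuda, requested_cuda):
    """Return True if the requested CUDA toolkit version does not exceed
    the highest CUDA version the installed driver supports."""
    return parse_version(requested_cuda) <= parse_version(driver_max_cuda)


# Values from the setup described above: the driver supports up to CUDA 11.5.
print(cudatoolkit_is_supported("11.5", "10.1"))  # True
print(cudatoolkit_is_supported("11.5", "10.2"))  # True
print(cudatoolkit_is_supported("10.1", "11.5"))  # False
```

Comparing integer tuples rather than raw strings avoids the pitfall that "10.2" sorts before "9.0" as text.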

Once the driver environment is configured, PyTorch can be installed with the package manager. The install commands look like this:

  • Installing PyTorch 1.4.0 on Ubuntu 18.04 with CUDA 10.1
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
  • Installing PyTorch 1.6.0 on Ubuntu 18.04 with CUDA 10.2
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch

These commands are listed on the PyTorch website; see https://pytorch.org/get-started/previous-versions/

Checking the PyTorch version used by the current conda environment

import torch
print(torch.__version__)  # note the double underscores on each side
1.4.0

A Quick Tour of PyTorch

Importing the packages

torch is the top-level package. The imports used in this article are:

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

Downloading the FashionMNIST dataset

The dataset is documented at https://pytorch.org/docs/1.4.0/torchvision/datasets.html#fashion-mnist
Running the code below creates a directory named data in the directory that contains this file, and then a FashionMNIST directory inside data, so the dataset is stored in:

./data/FashionMNIST

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",      # root directory in which the downloaded dataset is stored
    train=True,       # use the training split
    download=True,    # download into root unless the files are already there
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",      # root directory in which the downloaded dataset is stored
    train=False,      # use the test split
    download=True,
    transform=ToTensor(),
)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Processing...
Done!

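Inspecting the data dimensions

torch.utils.data.DataLoader is the data-loading class. Its signature is:

DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
       batch_sampler=None, num_workers=0, collate_fn=None,
       pin_memory=False, drop_last=False, timeout=0,
       worker_init_fn=None)

Here dataset is an instance of torch.utils.data.Dataset or one of its subclasses, and batch_size is the number of samples loaded per batch.

Before training a neural network, create the data loader objects. The code below initializes two of them: train_dataloader and test_dataloader.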

batch_size = 64

# Create the data loaders for the training set and the test set
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break
Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
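
To make the effect of batch_size concrete, here is a plain-Python sketch of how a sequential, unshuffled loader splits a dataset into index ranges. It is an illustration of the idea and not DataLoader's actual implementation; iter_batches is a hypothetical helper. The sizes assumed are the 60,000 training images and 10,000 test images of FashionMNIST.

```python
import math


def iter_batches(num_samples, batch_size, drop_last=False):
    """Yield (start, stop) index ranges, one per batch, in order (no shuffling)."""
    for start in range(0, num_samples, batch_size):
        stop = min(start + batch_size, num_samples)
        if drop_last and stop - start < batch_size:
            break  # skip the final, smaller batch
        yield start, stop


train_batches = list(iter_batches(60000, 64))
test_batches = list(iter_batches(10000, 64))

print(len(train_batches), math.ceil(60000 / 64))           # 938 938
print(train_batches[-1][1] - train_batches[-1][0])         # 32 samples in the last training batch
print(len(test_batches))                                   # 157
print(len(list(iter_batches(60000, 64, drop_last=True))))  # 937
```

With batch_size=64 the last training batch holds only 32 samples, which is why code that processes batches should read the batch size from the data rather than assume 64.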

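Defining the neural network

Defining a neural network is not complicated: in essence you create a class that inherits from torch.nn.Module, which is why torch.nn.Module is called the base class for all neural network modules. The general form is:

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        # Create the model's submodules (layers) here

    def forward(self, x):
        # Forward pass: compute the output from x using the submodules
        logits = ...  # placeholder in this template
        return logits

A concrete version for FashionMNIST:

# Use torch.cuda.is_available() to check whether a GPU is usable and pick the device accordingly
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

# Define the network class; it must inherit from nn.Module
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten() # flattens each sample to 1-D; same as nn.Flatten(start_dim=1, end_dim=-1)
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512), # linear layer: input [batch_size, 28*28], output [batch_size, 512]
            nn.ReLU(), # activation function
            nn.Linear(512, 512), # linear layer: input [batch_size, 512], output [batch_size, 512]
            nn.ReLU(), # activation function
            nn.Linear(512, 10) # linear layer: input [batch_size, 512], output [batch_size, 10]
        )

    def forward(self, x):
        x = self.flatten(x) # flatten from the second dimension on: [64, 1, 28, 28] -> [64, 1*28*28]
        logits = self.linear_relu_stack(x)
        return logits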

model = NeuralNetwork().to(device)
print(model)
Using cuda device
NeuralNetwork(
  (flatten): Flatten()
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
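
The shape comments in the model can be checked by hand. The following sketch traces tensor shapes through the layers using plain tuples; flatten_shape and linear_shape are hypothetical helpers written for this illustration and only mimic the shape rules of nn.Flatten and nn.Linear.

```python
from functools import reduce
from operator import mul


def flatten_shape(shape, start_dim=1):
    """Shape after flattening all dimensions from start_dim onward into one."""
    flat = reduce(mul, shape[start_dim:], 1)
    return tuple(shape[:start_dim]) + (flat,)


def linear_shape(shape, in_features, out_features):
    """Shape after a linear layer, which only changes the last dimension."""
    assert shape[-1] == in_features, "input width must match in_features"
    return tuple(shape[:-1]) + (out_features,)


shape = (64, 1, 28, 28)            # one batch: [N, C, H, W]
shape = flatten_shape(shape)       # (64, 784)
for in_f, out_f in [(784, 512), (512, 512), (512, 10)]:
    shape = linear_shape(shape, in_f, out_f)  # ReLU leaves the shape unchanged
print(shape)  # (64, 10)
```

Note that flatten_shape((1, 28, 28)) gives (1, 784): a single image of shape [1, 28, 28] can be fed to this model without adding a batch dimension, because its channel dimension of size 1 ends up playing the role of the batch dimension.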

Counting total and trainable parameters

def count_trainable_parameters(model):
    '''Return the number of parameters that require gradients.
    Usage: print(f'The model has {count_trainable_parameters(model):,} trainable parameters')
    '''
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def count_total_parameters(model):
    '''Return the total number of parameters in the model.
    Usage: print(f'The model has {count_total_parameters(model):,} total parameters')
    '''
    return sum(p.numel() for p in model.parameters())

total_params = count_total_parameters(model)
print(f'{total_params:,} total parameters.')
total_trainable_params = count_trainable_parameters(model)
print(f'{total_trainable_params:,} total trainable parameters.')
669,706 total parameters.
669,706 total trainable parameters.
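
The figure 669,706 can be reproduced by hand: each linear layer holds an out_features x in_features weight matrix plus one bias per output feature. A short check in plain Python (linear_param_count is a hypothetical helper for this illustration):

```python
def linear_param_count(in_features, out_features, bias=True):
    """Weights (out x in) plus one bias per output feature."""
    return in_features * out_features + (out_features if bias else 0)


layers = [(28 * 28, 512), (512, 512), (512, 10)]
per_layer = [linear_param_count(i, o) for i, o in layers]
print(per_layer)               # [401920, 262656, 5130]
print(f"{sum(per_layer):,}")   # 669,706
```

Flatten and ReLU contribute nothing because they have no learnable parameters, and the two counts are equal because no parameter has been frozen.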

Defining the loss function and the optimizer

loss_fn = nn.CrossEntropyLoss()  # combines nn.LogSoftmax() and nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3) # torch.optim.SGD optimizer with a fixed learning rate of 0.001
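
To see what "LogSoftmax followed by NLLLoss" means for a single sample, here is a plain-Python sketch using only the math module. It illustrates the formula and is not PyTorch's implementation; note that nn.CrossEntropyLoss additionally averages over the batch by default.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross-entropy for one sample: NLL applied to the log-softmax output."""
    return nll(log_softmax(logits), target)


uniform_logits = [0.0] * 10   # equal scores for all 10 classes
print(round(cross_entropy(uniform_logits, 3), 4))   # 2.3026, i.e. ln(10)
```

A model that spreads its belief evenly over 10 classes has a loss of ln(10) ≈ 2.3026, which is why an untrained 10-class classifier typically starts with a loss near 2.3.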

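Defining the training and test procedures

Training needs the training data loader, the model, the loss function, and the optimizer. The procedure can be summarized as:

  • Load and read a batch of data
  • Run the model on the batch
  • Compute the loss with the loss function
  • Reset the gradients to zero, backpropagate the loss, and update all parameters

The test procedure is similar but omits the last step: it still computes the loss, but runs under torch.no_grad() and performs no backpropagation or parameter update.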
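
The four training steps can be illustrated without PyTorch on a toy problem. The sketch below fits a single weight w so that w * x matches y = 2x, with the gradient derived by hand. It is an assumption-laden simplification: it uses mean squared error instead of cross-entropy, the whole dataset instead of mini-batches, and a learning rate chosen for this toy problem.

```python
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # generated by y = 2x, so the ideal weight is 2.0

w = 0.0                # the single trainable parameter
lr = 0.05              # learning rate (chosen for this toy problem)

for step in range(50):
    # 1. forward pass
    preds = [w * x for x in xs]
    # 2. mean squared error loss
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    # 3. gradient of the loss w.r.t. w, computed from scratch each step
    #    (the role of zero_grad() followed by backward())
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    # 4. gradient-descent update (the role of optimizer.step())
    w -= lr * grad

print(round(w, 4))  # 2.0
```

Computing the gradient from scratch in every iteration is the manual counterpart of resetting the gradients: PyTorch accumulates gradients across backward() calls, so they must be zeroed before each new backward pass.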

# Training procedure
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

# Test procedure
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
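
The accuracy line in test() compares the index of the largest logit in each row with the label and counts the matches. The same logic in plain Python, on made-up logits for a hypothetical 3-class problem:

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(logits_batch, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = sum(1 for row, label in zip(logits_batch, labels) if argmax(row) == label)
    return correct / len(labels)


logits_batch = [
    [0.1, 2.0, -1.0],   # predicted class 1
    [1.5, 0.2, 0.3],    # predicted class 0
    [0.0, 0.1, 0.9],    # predicted class 2
    [0.4, 0.3, 0.2],    # predicted class 0
]
labels = [1, 0, 1, 0]
print(accuracy(logits_batch, labels))  # 0.75
```

In test(), the number of correct predictions is divided by the number of samples, while the summed loss is divided by the number of batches, since each loss_fn call already returns a per-batch mean.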

Training the model

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")
Epoch 1
-------------------------------
loss: 2.312360  [    0/60000]
loss: 2.291506  [ 6400/60000]
loss: 2.271516  [12800/60000]
loss: 2.252658  [19200/60000]
loss: 2.244356  [25600/60000]
loss: 2.222664  [32000/60000]
loss: 2.228115  [38400/60000]
loss: 2.203855  [44800/60000]
loss: 2.192679  [51200/60000]
loss: 2.153655  [57600/60000]
Test Error: 
 Accuracy: 38.2%, Avg loss: 2.149532 

Epoch 2
-------------------------------
loss: 2.169748  [    0/60000]
loss: 2.151293  [ 6400/60000]
loss: 2.094591  [12800/60000]
loss: 2.099757  [19200/60000]
loss: 2.060622  [25600/60000]
loss: 2.005261  [32000/60000]
loss: 2.027278  [38400/60000]
loss: 1.956725  [44800/60000]
loss: 1.939770  [51200/60000]
loss: 1.872965  [57600/60000]
Test Error: 
 Accuracy: 57.6%, Avg loss: 1.870182 

Epoch 3
-------------------------------
loss: 1.906271  [    0/60000]
loss: 1.869101  [ 6400/60000]
loss: 1.755461  [12800/60000]
loss: 1.784045  [19200/60000]
loss: 1.698693  [25600/60000]
loss: 1.647050  [32000/60000]
loss: 1.658152  [38400/60000]
loss: 1.569280  [44800/60000]
loss: 1.572960  [51200/60000]
loss: 1.469435  [57600/60000]
Test Error: 
 Accuracy: 63.3%, Avg loss: 1.492781 

Epoch 4
-------------------------------
loss: 1.561221  [    0/60000]
loss: 1.521496  [ 6400/60000]
loss: 1.376429  [12800/60000]
loss: 1.439268  [19200/60000]
loss: 1.347573  [25600/60000]
loss: 1.329434  [32000/60000]
loss: 1.343370  [38400/60000]
loss: 1.273368  [44800/60000]
loss: 1.297012  [51200/60000]
loss: 1.199990  [57600/60000]
Test Error: 
 Accuracy: 64.4%, Avg loss: 1.230259 

Epoch 5
-------------------------------
loss: 1.306296  [    0/60000]
loss: 1.285495  [ 6400/60000]
loss: 1.123326  [12800/60000]
loss: 1.221825  [19200/60000]
loss: 1.122804  [25600/60000]
loss: 1.131946  [32000/60000]
loss: 1.158367  [38400/60000]
loss: 1.095908  [44800/60000]
loss: 1.125929  [51200/60000]
loss: 1.049075  [57600/60000]
Test Error: 
 Accuracy: 65.0%, Avg loss: 1.070740 

Done!

Saving the model

import os

SAVE_PATH = "./about_pytorch_model"
if not os.path.exists(SAVE_PATH):
    os.makedirs(SAVE_PATH)

torch.save(model.state_dict(), os.path.join(SAVE_PATH,"model.pth"))
print("Saved PyTorch Model State to model.pth")
Saved PyTorch Model State to model.pth

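Loading the model

model = NeuralNetwork()
# map_location="cpu" lets a checkpoint saved from a GPU model load into this CPU model
model.load_state_dict(torch.load(os.path.join(SAVE_PATH, "model.pth"), map_location="cpu"))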
<All keys matched successfully>

Testing the model

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
Predicted: "Ankle boot", Actual: "Ankle boot"

Reposted from blog.csdn.net/m0_37201243/article/details/123586440