PyTorch Learning Notes (3): Transfer Learning

1. Overview: In practice, very few people train an entire large neural network from scratch, because it is hard for an individual to obtain a large enough dataset, and even a network trained from scratch may not be satisfactory. It is therefore common to pretrain a ConvNet on a very large dataset; the pretrained ConvNet can then be used either as an initialization or as a feature extractor. Several transfer learning approaches are introduced below.


    1.1 ConvNet as a fixed feature extractor: download a ConvNet that has already been pretrained on ImageNet or another large dataset, remove its last fully connected layer, and add the fully connected layer(s) you need. This is done because the original network's output may not match your task; you only need to change the output dimension of the final fully connected layer. The concrete implementation is given later.

    1.2 Fine-tuning the ConvNet: you can selectively freeze the first few layers of the pretrained ConvNet, or keep all layers trainable, and then update and optimize the network's weights on your own dataset.

2. How do you decide between fine-tuning and using the ConvNet only as a fixed feature extractor?

    2.1 If the new dataset is small and similar to the original dataset, using the ConvNet as a fixed feature extractor is the better choice.

    2.2 If the new dataset is large and similar to the original dataset, fine-tuning is worth trying.

    2.3 If the new dataset is small but very different from the original dataset, the best choice is usually to keep the convolutional layers fixed and train only a linear classifier on their features (see the sketch after this list).

    2.4 If the new dataset is large and very different from the original dataset, go ahead and fine-tune to improve the network's fit to the new data.
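Below is a minimal sketch (not from the original post) of option 2.3: freeze the pretrained convolutional layers and train only a linear classifier on features taken from an earlier stage of the network. The cut-off point (after layer3 of resnet18), the pooling step and the 2-class output are assumptions made for illustration.

import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(pretrained=True)
# Keep everything up to and including layer3 (256-channel feature maps); drop layer4, avgpool and fc.
feature_extractor = nn.Sequential(*list(backbone.children())[:-3])
for p in feature_extractor.parameters():
    p.requires_grad = False          # the pretrained backbone stays frozen

pool = nn.AdaptiveAvgPool2d(1)       # collapse the spatial feature map to one vector per image
classifier = nn.Linear(256, 2)       # 256-d features from layer3; 2 target classes assumed

def extract_and_classify(x):
    with torch.no_grad():                              # no gradients through the frozen backbone
        feats = pool(feature_extractor(x)).flatten(1)  # e.g. (N, 256, 14, 14) for 224x224 inputs -> (N, 256)
    return classifier(feats)

Only classifier.parameters() would be passed to the optimizer in this setup; the backbone is used purely to compute features.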

3. Retraining and optimizing the model:
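The train_model function below relies on a few names that are never defined in this excerpt: dataloaders, dataset_sizes and device. A minimal setup sketch that provides them is shown here; the folder layout (data/train and data/val), the batch size and the number of workers are assumptions made for illustration, while the normalization statistics are the standard ImageNet values.

import os
import torch
from torchvision import datasets, transforms

data_dir = 'data'  # assumed layout: data/train/<class>/*.jpg and data/val/<class>/*.jpg
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]),
}
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")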

 
 
import copy
import time

import torch


def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    # Arguments: model -- the model to train; criterion -- the loss function;
    # optimizer -- the optimizer; scheduler -- the learning-rate scheduler.
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())  # keep a copy of the current best model weights
    best_acc = 0.0
    
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:       # the training and validation phases are handled differently below
            if phase == 'train':
                scheduler.step()   # apply learning-rate decay (note: PyTorch 1.1+ recommends calling scheduler.step() after the epoch's optimizer steps)
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':  # update parameters only in the training phase; no optimization for validation
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)  # accumulate the summed loss over all batches
                running_corrects += torch.sum(preds == labels.data)  # accumulate the number of correct predictions

            epoch_loss = running_loss / dataset_sizes[phase]  # divide by the number of samples to get the average loss
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc  # record the best validation accuracy seen so far
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))  # report the best validation accuracy

    # load best model weights
    model.load_state_dict(best_model_wts)  # restore the weights from the epoch with the best validation accuracy
    return model

4. Fine-tuning: resnet18 is used as the example to fine-tune.


import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)  # replace the final fully connected layer with a 2-class output; all earlier layers stay trainable

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)  # learning-rate decay: lr = initial_lr * gamma ** (epoch // step_size)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
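As a quick sanity check (an illustration, not part of the original post), the StepLR schedule above can be simulated with a throwaway optimizer: with lr=0.001, step_size=7 and gamma=0.1 the learning rate is 0.001 for epochs 0-6, 1e-4 for epochs 7-13, 1e-5 for epochs 14-20 and 1e-6 for epochs 21-24.

import torch
import torch.optim as optim
from torch.optim import lr_scheduler

dummy = [torch.zeros(1, requires_grad=True)]  # a throwaway parameter just to construct an optimizer
opt = optim.SGD(dummy, lr=0.001, momentum=0.9)
sched = lr_scheduler.StepLR(opt, step_size=7, gamma=0.1)
for epoch in range(25):
    opt.step()                     # no-op here (no gradients), but keeps the recommended call order
    sched.step()
print(sched.get_last_lr())         # roughly [1e-06] after 25 epochs (get_last_lr requires PyTorch 1.4+)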

5. ConvNet as a fixed feature extractor: train only the final layer, freezing the earlier layers of resnet18 by setting requires_grad = False.

import torchvision

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that only parameters of final layer are being optimized as
# opposed to before.
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
model_conv = train_model(model_conv, criterion, optimizer_conv,
                         exp_lr_scheduler, num_epochs=25)
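Once training finishes, the returned model can be used for prediction. The following is a minimal inference sketch, not from the original post: the image path is a placeholder, and data_transforms refers to the validation transforms from the setup sketch in section 3.

from PIL import Image

img = Image.open('some_image.jpg').convert('RGB')          # placeholder path
x = data_transforms['val'](img).unsqueeze(0).to(device)    # same preprocessing as the validation set
model_conv.eval()                                          # put batchnorm/dropout layers in eval mode
with torch.no_grad():
    pred = model_conv(x).argmax(dim=1).item()
print('predicted class index:', pred)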
 


Reposted from blog.csdn.net/qq_40103460/article/details/80341727