PyTorch Learning Notes (3): Transfer Learning

1. Overview: In practice, very few people train an entire large neural network from scratch, because it is hard for an individual to obtain a large enough dataset, and even a network trained from scratch may not be satisfactory. It is therefore common to pretrain a ConvNet on a very large dataset; the pretrained ConvNet can then be used either as an initialization or as a feature extractor. Several transfer learning approaches are introduced below.


    1.1 ConvNet as a fixed feature extractor: download a ConvNet that has already been pretrained on ImageNet or another large dataset, remove its last fully connected layer, and add the fully connected layer(s) you need. This is done because the original network's output may not match your task; you only need to change the output dimension of the final fully connected layer. The concrete implementation is given later.

    1.2 Fine-tuning the ConvNet: you can selectively freeze the first few layers of the pretrained ConvNet, or keep all layers trainable, and then update and optimize the network's weights on your own dataset.

2. How do you decide between fine-tuning and using the ConvNet only as a fixed feature extractor?

    2.1 If the new dataset is small and similar to the original dataset, using the ConvNet as a fixed feature extractor is the better choice.

    2.2 If the new dataset is large and similar to the original dataset, fine-tuning is worth trying.

    2.3 If the new dataset is small but very different from the original dataset, the best choice is usually to keep the convolutional layers fixed and train only a linear classifier on their features (see the sketch after this list).

    2.4 If the new dataset is large and very different from the original dataset, go ahead and fine-tune to improve the network's fit to the new data.
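Below is a minimal sketch (not from the original post) of option 2.3: freeze the pretrained convolutional layers and train only a linear classifier on features taken from an earlier stage of the network. The cut-off point (after layer3 of resnet18), the pooling step and the 2-class output are assumptions made for illustration.

import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(pretrained=True)
# Keep everything up to and including layer3 (256-channel feature maps); drop layer4, avgpool and fc.
feature_extractor = nn.Sequential(*list(backbone.children())[:-3])
for p in feature_extractor.parameters():
    p.requires_grad = False          # the pretrained backbone stays frozen

pool = nn.AdaptiveAvgPool2d(1)       # collapse the spatial feature map to one vector per image
classifier = nn.Linear(256, 2)       # 256-d features from layer3; 2 target classes assumed

def extract_and_classify(x):
    with torch.no_grad():                              # no gradients through the frozen backbone
        feats = pool(feature_extractor(x)).flatten(1)  # e.g. (N, 256, 14, 14) for 224x224 inputs -> (N, 256)
    return classifier(feats)

Only classifier.parameters() would be passed to the optimizer in this setup; the backbone is used purely to compute features.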

3. Retraining and optimizing the model:
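The train_model function below relies on a few names that are never defined in this excerpt: dataloaders, dataset_sizes and device. A minimal setup sketch that provides them is shown here; the folder layout (data/train and data/val), the batch size and the number of workers are assumptions made for illustration, while the normalization statistics are the standard ImageNet values.

import os
import torch
from torchvision import datasets, transforms

data_dir = 'data'  # assumed layout: data/train/<class>/*.jpg and data/val/<class>/*.jpg
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]),
}
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")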

 
 
import copy
import time

import torch


def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    # Arguments: model -- the model to train; criterion -- the loss function;
    # optimizer -- the optimizer; scheduler -- the learning-rate scheduler.
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())  # keep a copy of the current best model weights
    best_acc = 0.0
    
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:       # the training and validation phases are handled differently below
            if phase == 'train':
                scheduler.step()   # apply learning-rate decay (note: PyTorch 1.1+ recommends calling scheduler.step() after the epoch's optimizer steps)
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':  # update parameters only in the training phase; no optimization for validation
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)  # accumulate the summed loss over all batches
                running_corrects += torch.sum(preds == labels.data)  # accumulate the number of correct predictions

            epoch_loss = running_loss / dataset_sizes[phase]  # divide by the number of samples to get the average loss
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc  # record the best validation accuracy seen so far
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))  # report the best validation accuracy

    # load best model weights
    model.load_state_dict(best_model_wts)  # restore the weights from the epoch with the best validation accuracy
    return model

4. Fine-tuning: resnet18 is used as the example to fine-tune.


import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)  # replace the final fully connected layer with a 2-class output; all earlier layers stay trainable

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)  # learning-rate decay: lr = initial_lr * gamma ** (epoch // step_size)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
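As a quick sanity check (an illustration, not part of the original post), the StepLR schedule above can be simulated with a throwaway optimizer: with lr=0.001, step_size=7 and gamma=0.1 the learning rate is 0.001 for epochs 0-6, 1e-4 for epochs 7-13, 1e-5 for epochs 14-20 and 1e-6 for epochs 21-24.

import torch
import torch.optim as optim
from torch.optim import lr_scheduler

dummy = [torch.zeros(1, requires_grad=True)]  # a throwaway parameter just to construct an optimizer
opt = optim.SGD(dummy, lr=0.001, momentum=0.9)
sched = lr_scheduler.StepLR(opt, step_size=7, gamma=0.1)
for epoch in range(25):
    opt.step()                     # no-op here (no gradients), but keeps the recommended call order
    sched.step()
print(sched.get_last_lr())         # roughly [1e-06] after 25 epochs (get_last_lr requires PyTorch 1.4+)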

5. ConvNet as a fixed feature extractor: train only the final layer, freezing the earlier layers of resnet18 by setting requires_grad = False.

import torchvision

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that only parameters of final layer are being optimized as
# opposed to before.
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
model_conv = train_model(model_conv, criterion, optimizer_conv,
                         exp_lr_scheduler, num_epochs=25)
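Once training finishes, the returned model can be used for prediction. The following is a minimal inference sketch, not from the original post: the image path is a placeholder, and data_transforms refers to the validation transforms from the setup sketch in section 3.

from PIL import Image

img = Image.open('some_image.jpg').convert('RGB')          # placeholder path
x = data_transforms['val'](img).unsqueeze(0).to(device)    # same preprocessing as the validation set
model_conv.eval()                                          # put batchnorm/dropout layers in eval mode
with torch.no_grad():
    pred = model_conv(x).argmax(dim=1).item()
print('predicted class index:', pred)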
 


Reposted from blog.csdn.net/qq_40103460/article/details/80341727