The first batches of an epoch train normally, but the last batch has too few samples and raises an error

Table of contents

Problem:

Solution 1:

Solution 2:


Problem:

During model training, the first batches of an epoch train normally and print the loss, but an error is raised on the last batch. The most likely cause is that the dataset size is not divisible by the batch size, so the final batch is smaller than the others, which breaks any code that assumes a fixed batch size.
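A minimal sketch of how this happens (the tensors and batch size are made up for illustration): 10 samples with a batch size of 4 give batches of 4, 4, and 2, and any code that hard-codes the batch size, here a view(batch_size, -1), fails on the final batch of 2.

import torch
from torch.utils.data import TensorDataset, DataLoader

batch_size = 4
dataset = TensorDataset(torch.randn(10, 3), torch.randint(0, 2, (10,)))
loader = DataLoader(dataset, batch_size=batch_size)

for step, (x, y) in enumerate(loader):
    # works for the first two batches, raises a shape error on the last batch of 2
    x = x.view(batch_size, -1)
    print(step, x.shape)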


Solution 1:

Manually adjust the batch size (or the amount of data) so that num_data / batch_size is an integer, i.e. the dataset size is evenly divisible by the batch size and every batch in the epoch is full.
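A minimal sketch of this check, reusing the toy dataset from above: verify that len(dataset) is divisible by the chosen batch size before building the DataLoader.

import torch
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(torch.randn(10, 3), torch.randint(0, 2, (10,)))
batch_size = 5                       # 10 % 5 == 0, so every batch is full
assert len(dataset) % batch_size == 0, "pick a batch_size that divides the dataset size"
loader = DataLoader(dataset, batch_size=batch_size)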

Solution 2:

Drop the last incomplete batch so it does not participate in training. To do this, set the drop_last parameter to True when defining the DataLoader.

torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=None, sampler=None, batch_sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None, multiprocessing_context=None, generator=None, *, prefetch_factor=None, persistent_workers=False, pin_memory_device='')


# drop_last: set to True to drop the incomplete final batch when the dataset size is not divisible by batch_size; defaults to False

drop_last (bool, optional) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)
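A minimal usage sketch with the same toy data as above: with drop_last=True the two leftover samples (10 % 4) are discarded, so every batch delivered to the training loop has exactly batch_size samples.

import torch
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(torch.randn(10, 3), torch.randint(0, 2, (10,)))
# the 2 leftover samples are dropped, so each batch has exactly 4 samples
loader = DataLoader(dataset, batch_size=4, shuffle=True, drop_last=True)
for x, y in loader:
    print(x.shape)   # always torch.Size([4, 3])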

See the official documentation for the full definition: torch.utils.data.DataLoader



Source: blog.csdn.net/qq_38308388/article/details/131041955