The meaning of few-shot learning (Few-Shot Learning) training parameters

1. Conventional parameters

1.1 epoch

       This means that all training data must be run through. Suppose there are 6400 samples. During the training process, an epoch will be considered only after all 6400 samples have been run. Generally, experiments require training for many epochs and will not stop until the LOSS is stable.

1.2 batch_size

        The Chinese name is the batch size. The previous 6400 samples were sent in. If a sample is sent in, the weight of the network will be updated, which is online learning. Correspondingly, we can send the data of an epoch into the network in batches, which can speed up the training time. How many samples are sent each time is batch_size. Assuming that 6400 samples are divided into 50 times and sent in, then 128 samples will be sent in each time, that is, batch_size=128.

1.3 iteration

        All the data of an epoch is divided into many batches. The number of batches is iteration. According to the above assumption, iteration=50. After each iteration, the parameters are updated once, which can be understood as the number of batches.

        Total number of samples=batch_size * iteration

2. Small sample training parameters

2.1 episode

        (Equivalent to batch in traditional training) Small sample learning will be divided into many training times during training, and each time some categories will be randomly selected from all categories to train (sampling without replacement). Generally, an epoch contains multiple episodes. An episode is the process of selecting support set and query set categories at one time, that is, training the model once with certain selected categories, and then selecting several other categories to train the model in the next episode. The episode is internally composed of two parts: support set and query set.

2.1.1 support set

        When building an episode, each episode will randomly select some categories from the full amount of data, such as C categories; after selecting C categories, K samples will be randomly selected in each category, which constitutes the support set. This contains a total of C * K samples. Generally speaking, the model will be trained on this. The mission constructed in this way is C-way K-shot.

2.1.2 query set

        Similar to the support set, D categories will be selected from the remaining data set samples (note that the categories here are selected from the corresponding spport set categories, D≤C) and some samples will be sampled as the query set. Generally speaking, the model is in the support set After the set is trained once, the loss will be obtained on the query set.

2.2 task

        An episode contains a support set and a query set. An episode corresponds to allowing the model to have better prediction performance on the query set after learning on the support set. Therefore, an episode corresponds to a task.

2.3 meta trainig set

        Generally speaking, according to the size of the training data, multiple training episodes can be constructed, and these episodes can be called meta-training sets.

2.4 meta test set

        Because a model has been trained on several tasks (that is, several episodes) in the meta training set, we hope that the model can also perform better on some new tasks. Therefore, the meta test set is completely consistent with the meta training set in terms of data structure, and also includes a support set and a query set. The basic idea is to hope that the model can quickly grasp the essence of the problem based on the support set under new tasks, so that it can quickly Better results have been achieved on the query set in the meta test set.

3. Summary

        Generally speaking, the basic unit of this type of meta-learning is a task or an episode, which needs to be distinguished from ordinary training methods. The figure below is a 5-way 1-shot image classification problem. Each row is a task and a training episode. There are 5 categories in the support set, each category has 1 sample, and there are two in the test set. categories (the categories of the test set must be a subset of the corresponding training set), and one sample for each category. When testing the model training effect, it is hoped that the model can achieve better results on the tasks in the meta test set.

Guess you like

Origin blog.csdn.net/qq_31112205/article/details/128617043