After calculating the loss in the previous step, it is time to update the EMA generator:
1: netEMA is an exponential-moving-average (EMA) copy of the model's generator:
traverse the state_dict of the generator and, for each key, blend the stored EMA value toward the current generator value: the EMA weight is multiplied by EMA_decay, and the generator's weight contributes the remaining (1 - EMA_decay).
Then, based on the current iteration count, the running statistics of netEMA are refreshed whenever the iteration is a multiple of one of the print/FID/checkpoint frequencies (1000, 2500, and 10000 steps): batches are fed forward through netEMA while counting them in num_upd.
When num_upd exceeds 50, the loop breaks and the EMA update is complete.
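A minimal sketch of the weight-blending part of this update, assuming netG and netEMA are two modules with identical state_dicts and EMA_decay is a scalar (the function name update_ema is my own):

```python
import copy

import torch
import torch.nn as nn

def update_ema(netG, netEMA, decay=0.9999):
    """Blend the EMA generator's weights toward the current generator.

    For every key in the state_dict:
        ema_w = decay * ema_w + (1 - decay) * current_w
    """
    with torch.no_grad():
        for key in netG.state_dict():
            netEMA.state_dict()[key].data.copy_(
                netEMA.state_dict()[key].data * decay
                + netG.state_dict()[key].data * (1 - decay)
            )
```

With a decay close to 1, netEMA changes very slowly, which smooths out the noise of individual optimizer steps.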
Then colorize the image:
def visualize_batch(self, model, image, label, cur_iter):
    self.save_images(label, "label", cur_iter, is_label=True)
    self.save_images(image, "real", cur_iter)
    with torch.no_grad():
        model.eval()
        fake = model.netG(label)
        self.save_images(fake, "fake", cur_iter)
        model.train()
        if not self.opt.no_EMA:
            model.eval()
            fake = model.netEMA(label)
            self.save_images(fake, "fake_ema", cur_iter)
            model.train()
First, color the label:
the label, i.e. the batch, has size (5, 35, 256, 512) after one-hot encoding.
Then len(batch) = 5; take the tensor corresponding to the first sample of the batch.
Coloring:
first obtain the cmap:
the labels include unlabeled pixels and noise, giving 36 categories in total, so the else branch executes.
First generate an all-zero array of size (36, 3) -> then traverse each category, initializing r = g = b = 0 -> id = 1 -> then loop 7 times, first converting the id to a binary string.
def uint82bin(n, count=8):
"""returns the binary of integer n, count refers to amount of bits"""
return ''.join([str((n >> y) & 1) for y in range(count - 1, -1, -1)])
#y = 7,6,5,4,3,2,1,0
The values of y are 7, 6, 5, 4, 3, 2, 1, 0 respectively.
With n = 1: shifting n right by y = 7, 6, ..., 1 bits gives 0, so those positions contribute '0'; only when y = 0 is there no shift and (n >> 0) & 1 = 1. The function therefore returns the string '00000001'.
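The function can be sanity-checked directly (it is repeated here so the snippet runs on its own):

```python
def uint82bin(n, count=8):
    """Return the binary string of integer n, using `count` bits."""
    return ''.join([str((n >> y) & 1) for y in range(count - 1, -1, -1)])

print(uint82bin(1))    # '00000001', as walked through above
print(uint82bin(128))  # '10000000'
```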
Shift Operation
Take the last one, two, and three characters of str_id (str_id[-1], str_id[-2], str_id[-3]), i.e. '1', '0', '0'. Shift each of them left by 7 - j = 7 bits: 1 shifted left by 7 becomes 1000 0000 in binary, i.e. 128, while 0 stays 0 after shifting, so r = 128 and g = b = 0. Finally id = 1 is shifted right by 3 bits and becomes 0.
In the remaining iterations of the j loop, id = 0. Inside uint82bin (which walks all 8 bit positions), 0 stays 0 no matter how far it is shifted, and 0 & 1 = 0, so the output is '00000000'. Then r = 128 ^ 0 = 128.
Because 128 = 1000 0000 and 0 = 0000 0000, and 1 XOR 0 = 1 while 0 XOR 0 = 0, 128 ^ 0 = 128. After the 7 inner iterations, r is written into row 0, column 0 of the cmap, g into row 0, column 1, and b into row 0, column 2. The outer for loop runs 36 times in this way, and the cmap is completely filled.
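Putting the loop together, here is a sketch of the generic colormap branch under the assumptions of this walkthrough (the function name and the id = i + 1 initialization follow the description above; the widely used pix2pixHD-style implementation is very close to this):

```python
import numpy as np

def uint82bin(n, count=8):
    """Return the binary string of integer n, using `count` bits."""
    return ''.join([str((n >> y) & 1) for y in range(count - 1, -1, -1)])

def labelcolormap(N):
    """Spread the bits of each class id across the R, G, B channels
    so that nearby class ids still get clearly distinct colors."""
    cmap = np.zeros((N, 3), dtype=np.uint8)
    for i in range(N):
        r, g, b = 0, 0, 0
        id = i + 1
        for j in range(7):
            str_id = uint82bin(id)
            # fold the last three bits of id into r, g, b
            r = r ^ (int(str_id[-1]) << (7 - j))
            g = g ^ (int(str_id[-2]) << (7 - j))
            b = b ^ (int(str_id[-3]) << (7 - j))
            id = id >> 3  # consume three bits per inner step
        cmap[i] = (r, g, b)
    return cmap
```

For the first category (id = 1) this produces exactly the (128, 0, 0) row derived above.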
Convert the cmap to a tensor. Generate a (3, 256, 512) tensor filled with zeros as the output color image. At the same time, find the per-pixel category of one sample of the label batch:
tens changes from (35, 256, 512) to (1, 256, 512) by taking, for each pixel, the index of its class along the channel dimension.
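The reduction from (35, 256, 512) to (1, 256, 512) is just an argmax over the channel (class) dimension with keepdim; a tiny sketch:

```python
import torch

# a toy one-hot label map: 35 classes, 4x4 spatial size
tens = torch.zeros(35, 4, 4)
tens[3] = 1.0  # pretend every pixel belongs to class 3
lab = torch.argmax(tens, dim=0, keepdim=True)
print(lab.shape)  # torch.Size([1, 4, 4])
```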
len(cmap) = 36. Starting with label = 0: tens[0] has size (256, 512), and label == tens[0] yields a boolean mask that is True where the pixel's class in tens equals 0 and False elsewhere.
color_image[0] is the R channel of color_image; cmap[label][0] is row 0, column 0 of the cmap, i.e. 128, and every pixel selected by the mask is set to 128. The G and B channels are handled the same way. This loops 36 times to color each category, and finally the filled color map is output.
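The masking loop just described can be sketched as follows (a simplified stand-in for the real Colorize class; the names are illustrative):

```python
import numpy as np
import torch

def colorize(gray_image, cmap):
    """Paint a (1, H, W) class-index map using an (N, 3) colormap.

    For every class id, build a boolean mask of the pixels belonging
    to that class and write the class color into the R, G, B planes.
    """
    cmap = torch.from_numpy(cmap)
    _, h, w = gray_image.shape
    color_image = torch.zeros(3, h, w, dtype=torch.uint8)
    for label in range(len(cmap)):
        mask = (gray_image[0] == label)  # (H, W) boolean mask
        for c in range(3):               # R, G, B planes
            color_image[c][mask] = cmap[label][c]
    return color_image
```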
Finally, the label image is transposed so that cv2 can store it.
The remaining four images of the batch are processed the same way; the five images are tiled onto one canvas and saved to the specified location.
The next step is to process the real image:
clamp tens so that values below 0 become 0 and values above 1 become 1, then transpose it into (h, w, c) format.
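A sketch of this post-processing step, assuming tens is a (C, H, W) tensor (the helper name tens_to_im is illustrative):

```python
import numpy as np
import torch

def tens_to_im(tens):
    """Clamp a (C, H, W) tensor into [0, 1], then move the channel
    axis last so the result has (h, w, c) layout for saving."""
    out = torch.clamp(tens, 0.0, 1.0)
    return np.transpose(out.detach().cpu().numpy(), (1, 2, 0))
```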
During eval, the label is fed into the generator to produce a fake image of size (5, 3, 256, 512), and the generated fake images are saved the same way.
netEMA is a deep copy of the generator.
The next step is to calculate the time required to train one batch:
the elapsed time, the current epoch out of the total epochs, and the current iteration are written to the progress.txt file and printed.
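A minimal sketch of such a logger; the function name print_progress and the message format are invented here, since the original helper is not shown:

```python
import time

def print_progress(start_time, epoch, num_epochs, cur_iter,
                   progress_path="progress.txt"):
    """Append elapsed-time and progress counters to progress.txt
    and echo the same line to stdout."""
    avg_per_iter = (time.time() - start_time) / max(cur_iter, 1)
    message = ("[epoch %d/%d, iter %d] avg time per iter: %.3fs"
               % (epoch, num_epochs, cur_iter, avg_per_iter))
    with open(progress_path, "a") as f:
        f.write(message + "\n")
    print(message)
    return message
```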
Next step:
save the weights, controlled by the latest and best flags.
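A sketch of latest/best checkpointing, with assumed attribute names (netG, netEMA) and an invented helper name:

```python
import os

import torch

def save_networks(model, cur_iter, checkpoint_dir, latest=False, best=False):
    """Save generator (and EMA generator, if present) weights,
    tagging the files as 'latest', 'best', or by iteration number."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    tag = "latest" if latest else "best" if best else str(cur_iter)
    torch.save(model.netG.state_dict(),
               os.path.join(checkpoint_dir, "%s_G.pth" % tag))
    if hasattr(model, "netEMA"):
        torch.save(model.netEMA.state_dict(),
                   os.path.join(checkpoint_dir, "%s_EMA.pth" % tag))
```

Overwriting a single "latest" file bounds disk usage, while the "best" tag preserves the checkpoint with the best validation metric.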
The most important remaining piece is the FID calculation: it is more involved, so a new chapter will be devoted to it.