1: Load pre-trained weights
Loading pre-trained weights is implemented through calls to the paddle.load and set_state_dict interfaces:
# save
paddle.save(net.state_dict(), "old_net.pdparams")
paddle.save(opt.state_dict(), "old_opt.pdopt")
# load
state_dict_net = paddle.load("old_net.pdparams")
state_dict_opt = paddle.load("old_opt.pdopt")
# match
new_net.set_state_dict(state_dict_net)
new_opt.set_state_dict(state_dict_opt)
Notice:
1: If the network structures are not completely consistent, Paddle automatically skips the mismatched layers; parameters whose names do not match are simply not loaded.
2: Fixed weights for fine-tuning
Implemented by setting stop_gradient=True, which is more convenient than in PyTorch.
Example 1:
For freezing a sequence of layers, e.g. stage1 → stage2 → stage3: take the output of stage2 (call it y) and set y.stop_gradient=True. Then stage1 → stage2 as a whole is frozen and will no longer be updated.
Example 2:
cls0-cls9 are the 10 output branches of the network; only the weights of the 7th branch (cls6) are fine-tuned.
3: Verify whether the weight fixing takes effect
Inspect the parameters through named_parameters, which works almost the same as in PyTorch.
Example 1:
for item in net.named_parameters():
    if item[0] == 'cls0.0.bias':
        print("####", item[0], item[1][0])
    if item[0] == 'cls6.0.bias':
        print("!!!!", item[0], item[1][0])