Paddle: loading pre-trained weights and fine-tuning with partially frozen weights

1: Load pre-trained weights

Implemented through the paddle.save / paddle.load and set_state_dict interfaces:

# save
paddle.save(net.state_dict(), "old_net.pdparams")
paddle.save(opt.state_dict(), "old_opt.pdopt")

# load
state_dict_net = paddle.load("old_net.pdparams")
state_dict_opt = paddle.load("old_opt.pdopt")

# apply the loaded state to the new network and optimizer
new_net.set_state_dict(state_dict_net)
new_opt.set_state_dict(state_dict_opt)

Notice:

1: If the network structures are not exactly the same, Paddle automatically skips the entries that do not match and loads the rest.

2: Freeze weights for fine-tuning

Implemented by setting stop_gradient=True on a tensor or parameter, which is more convenient than PyTorch's requires_grad=False approach.

Example 1:

Freezing whole network stages: for a pipeline stage1 → stage2 → stage3, take the output of stage2 (call it y) and set y.stop_gradient=True. Then stage1 → stage2 are frozen as a whole and are no longer updated.

Example 2:

cls0 to cls9 are the network's 10 output branches, and only the weights of the 7th branch (cls6, counting from cls0) are fine-tuned; the other branches are frozen.

3: Verify that weight freezing takes effect

Check via named_parameters, which works much like PyTorch's:

Example 1:

for name, param in net.named_parameters():
    if name == 'cls0.0.bias':
        print("####", name, param[0])  # frozen branch: value should not change across steps
    if name == 'cls6.0.bias':
        print("!!!!", name, param[0])  # fine-tuned branch: value should change

Reference:

1: "Frequently Asked Questions: Parameter Adjustment", PaddlePaddle documentation


Origin blog.csdn.net/lilai619/article/details/128671590