Weight Decay (Hands-on Deep Learning v2, PyTorch)

1. Weight Decay

In the figure, the red points are the true y values. If w is allowed to vary freely, the model learns the blue curve, which overfits the data; if the range of w is restricted, the model learns the smoother green curve.
The point where the yellow circle and the green circle meet is a balance point. If w moves any further toward the origin, the decrease in the yellow-circle term is not enough to make up for the increase in the green-circle term. Overall, the penalty pulls w toward the origin, so the optimum w* becomes smaller and the complexity of the model becomes lower.
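
The balance point described above corresponds to minimizing the loss plus a squared-norm penalty. The following is a sketch in standard notation; the symbols λ (penalty strength) and η (learning rate) are assumed here, since the original figures are not available.

```latex
% Penalized objective: loss plus squared L2 norm of the weights
\min_{\mathbf{w}, b} \; \ell(\mathbf{w}, b) + \frac{\lambda}{2} \lVert \mathbf{w} \rVert^2

% Gradient of the penalized objective with respect to w
\frac{\partial}{\partial \mathbf{w}} \left( \ell(\mathbf{w}, b) + \frac{\lambda}{2} \lVert \mathbf{w} \rVert^2 \right)
  = \frac{\partial \ell(\mathbf{w}, b)}{\partial \mathbf{w}} + \lambda \mathbf{w}

% Gradient descent step on the penalized objective
\mathbf{w}_{t+1} = (1 - \eta \lambda)\, \mathbf{w}_t - \eta \, \frac{\partial \ell(\mathbf{w}_t, b_t)}{\partial \mathbf{w}_t}
```

When ηλ < 1, each step first shrinks w by the factor (1 − ηλ) and then applies the usual gradient step, which is where the name weight decay comes from.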

Typical weight decay values are 0.001, 0.01, and 0.1; try several of them and compare the results.
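
To get a feel for how the weight decay value affects the learned weight, here is a minimal sketch for a one-dimensional linear model without bias, where the penalized least-squares problem has a closed-form solution. The toy data and the helper name `ridge_weight_1d` are made up for illustration and are not from the original post.

```python
def ridge_weight_1d(xs, ys, lambd):
    """Minimizer of (1/n) * sum(0.5 * (w*x - y)**2) + (lambd / 2) * w**2.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + n*lambd).
    """
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * lambd)


# Hypothetical toy data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# lambd = 0 is ordinary least squares; the rest are the typical values above.
lambdas = [0.0, 0.001, 0.01, 0.1]
weights = [ridge_weight_1d(xs, ys, lam) for lam in lambdas]

for lam, w in zip(lambdas, weights):
    print(f"lambda={lam:<6} w*={w:.4f}")
```

The larger the value, the more the optimal weight is pulled toward zero, so the right setting depends on how much shrinkage the data can tolerate.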

2. Implementation from scratch

The simpler the data and the more complex the model, the easier it is to overfit.
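
Since the original code screenshots are not available, the following is only a minimal sketch of a from-scratch implementation, not the post's actual code. It assumes a setup in the spirit of the course: a linear model with many more input features (200) than training examples (20), so that overfitting is easy to provoke. All sizes, hyperparameters, and helper names (`synthetic_data`, `l2_penalty`, `train`) are assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)

# Assumed setup: few training examples, many features, tiny true weights.
n_train, n_test, num_inputs = 20, 100, 200
true_w = torch.ones(num_inputs, 1) * 0.01
true_b = 0.05


def synthetic_data(n):
    """Generate y = X @ true_w + true_b + small Gaussian noise."""
    X = torch.randn(n, num_inputs)
    y = X @ true_w + true_b + 0.01 * torch.randn(n, 1)
    return X, y


train_X, train_y = synthetic_data(n_train)
test_X, test_y = synthetic_data(n_test)


def l2_penalty(w):
    """Half the squared L2 norm of the weights."""
    return torch.sum(w.pow(2)) / 2


def squared_loss(y_hat, y):
    return (y_hat - y).pow(2) / 2


def train(lambd, num_epochs=100, lr=0.003, batch_size=5):
    """Minibatch SGD on mean squared loss + lambd * l2_penalty(w)."""
    w = torch.randn(num_inputs, 1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    for _ in range(num_epochs):
        perm = torch.randperm(n_train)
        for i in range(0, n_train, batch_size):
            idx = perm[i:i + batch_size]
            X, y = train_X[idx], train_y[idx]
            loss = squared_loss(X @ w + b, y).mean() + lambd * l2_penalty(w)
            loss.backward()
            with torch.no_grad():
                # Plain gradient step; the penalty is applied to w only.
                w -= lr * w.grad
                b -= lr * b.grad
                w.grad.zero_()
                b.grad.zero_()
    with torch.no_grad():
        test_loss = squared_loss(test_X @ w + b, test_y).mean().item()
    return w.norm().item(), test_loss


norm_no_decay, test_no_decay = train(lambd=0)
norm_decay, test_decay = train(lambd=3)
print(f"lambd=0: ||w||={norm_no_decay:.3f}, test loss={test_no_decay:.3f}")
print(f"lambd=3: ||w||={norm_decay:.3f}, test loss={test_decay:.3f}")
```

Under these assumptions, the run without the penalty keeps a large weight norm and generalizes poorly, while the penalized run ends with a much smaller norm and a lower test loss.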

3. Concise implementation in PyTorch

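In PyTorch, weight decay can be handed to the optimizer instead of being added to the loss by hand. The following is a minimal sketch using the `weight_decay` option of `torch.optim.SGD`, applied to the weight but not the bias through parameter groups. The data, hyperparameters, and the helper name `train_concise` are assumptions for illustration and are not taken from the original screenshots.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumed toy data: 20 examples with 200 features each.
n_train, num_inputs = 20, 200
X = torch.randn(n_train, num_inputs)
y = X @ (torch.ones(num_inputs, 1) * 0.01) + 0.05 + 0.01 * torch.randn(n_train, 1)


def train_concise(wd, num_epochs=100, lr=0.003, batch_size=5):
    """Train a linear layer with SGD; return the final weight norm."""
    net = nn.Linear(num_inputs, 1)
    nn.init.normal_(net.weight)  # standard normal initialization
    loss_fn = nn.MSELoss()
    # Per-parameter options: decay the weight, leave the bias undecayed.
    optimizer = torch.optim.SGD(
        [
            {"params": [net.weight], "weight_decay": wd},
            {"params": [net.bias]},
        ],
        lr=lr,
    )
    for _ in range(num_epochs):
        perm = torch.randperm(n_train)
        for i in range(0, n_train, batch_size):
            idx = perm[i:i + batch_size]
            optimizer.zero_grad()
            loss_fn(net(X[idx]), y[idx]).backward()
            optimizer.step()
    return net.weight.norm().item()


norm_plain = train_concise(wd=0)
norm_wd = train_concise(wd=3)
print(f"wd=0: ||w||={norm_plain:.3f}")
print(f"wd=3: ||w||={norm_wd:.3f}")
```

With `weight_decay` set, the optimizer adds `wd * w` to the gradient of each decayed parameter at every step, which has the same effect as the explicit squared-norm penalty.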

Reference

https://www.bilibili.com/video/BV1UK4y1o7dy?p=1


Origin blog.csdn.net/zgpeace/article/details/123888384