Gradient Descent
Gradient descent is an iterative method (in contrast to the closed-form least squares solution) for solving the optimization problem \({\theta}^* = \arg\min_{\theta} L({\theta})\), where \({\theta}\) is a parameter vector and the gradient is the vector of partial derivatives of \(L\) with respect to each parameter.
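The iterative update can be sketched as follows. This is a minimal illustration, not code from the text: the quadratic loss, its gradient, and the fixed learning rate are hypothetical choices.

```python
def gradient_descent(grad, theta0, eta=0.1, steps=100):
    """Repeatedly step theta against the gradient of L."""
    theta = theta0
    for _ in range(steps):
        # Update each component: theta_i <- theta_i - eta * dL/dtheta_i
        theta = [t - eta * g for t, g in zip(theta, grad(theta))]
    return theta

# Illustrative loss: L(theta) = (theta_0 - 3)^2 + (theta_1 + 1)^2,
# whose minimum is at (3, -1); grad returns its partial derivatives.
grad = lambda th: [2 * (th[0] - 3), 2 * (th[1] + 1)]
print(gradient_descent(grad, [0.0, 0.0]))  # approaches [3, -1]
```

With a suitable fixed learning rate the iterates converge to the minimizer; the tips below address how to choose and adapt that rate.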
To get better results from gradient descent, the following tips help:
- Adjust the learning rate
At the beginning, use a larger learning rate so the iterations make fast progress; as the parameters approach the optimum, reduce the learning rate. For example, decay it with \(1/t\): \({\eta}^t = {\eta} / \sqrt{t+1}\). In addition, different parameters should be given different learning rates.
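The decay schedule \({\eta}^t = {\eta} / \sqrt{t+1}\) can be sketched as below. The one-dimensional loss and the initial rate are illustrative assumptions, not from the text.

```python
import math

def gd_with_decay(grad, theta, eta0=1.0, steps=200):
    """Gradient descent whose rate decays as eta0 / sqrt(t + 1)."""
    for t in range(steps):
        eta_t = eta0 / math.sqrt(t + 1)  # decayed learning rate at step t
        theta = [p - eta_t * g for p, g in zip(theta, grad(theta))]
    return theta

# Illustrative loss: L(theta) = theta^2, minimized at 0.
grad = lambda th: [2 * th[0]]
print(gd_with_decay(grad, [4.0]))  # approaches [0.0]
```

The large early steps cover ground quickly, while the shrinking rate prevents the later iterates from bouncing around the minimum.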