We have analysed the convergence (and the rate) of gradient descent with a fixed step size, where for an $L$-smooth objective we should choose the step size $\alpha \le 1/L$.
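For reference, a minimal sketch of the fixed-step method (the quadratic test function and its smoothness constant below are illustrative assumptions, not from the text):

```python
import numpy as np

def gradient_descent_fixed(grad, x0, step, num_iters=100):
    """Gradient descent with a constant step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x = x - step * grad(x)
    return x

# Illustrative example: f(x) = x1^2 + 4 x2^2 is L-smooth with L = 8,
# so any fixed step size <= 1/8 is safe.
grad = lambda x: np.array([2.0 * x[0], 8.0 * x[1]])
print(gradient_descent_fixed(grad, [4.0, 1.0], step=1 / 8))  # -> near (0, 0)
```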
Note that a convex function restricted to any line is also convex. A naive idea is to improve the step length so that the restricted function achieves its minimum value. Specifically, since $f$ is convex, the single-variable function $g(\alpha) = f(x_k - \alpha \nabla f(x_k))$ is convex, and we may take $\alpha_k = \operatorname*{argmin}_{\alpha \ge 0} g(\alpha)$. This is called exact line search.
Consider the quadratic $f(x) = \frac{1}{2} x^\top A x - b^\top x$ with $A$ positive definite. Writing $g_k = \nabla f(x_k) = A x_k - b$, the minimization over $\alpha$ has the closed form $\alpha_k = \frac{g_k^\top g_k}{g_k^\top A g_k}$.
When applying the method of exact line search, successive gradient directions are always orthogonal.
Let $g(\alpha) = f(x_k - \alpha \nabla f(x_k))$. At the minimizer $\alpha_k$ we have $0 = g'(\alpha_k) = -\nabla f(x_{k+1})^\top \nabla f(x_k)$, that is, $\nabla f(x_{k+1}) \perp \nabla f(x_k)$.
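The closed form above makes exact line search easy to run on the quadratic; the following sketch (the matrix $A$ and vector $b$ are illustrative assumptions) also verifies the orthogonality of successive gradients numerically:

```python
import numpy as np

def exact_line_search_gd(A, b, x0, num_iters=20):
    """Gradient descent on f(x) = 0.5 x^T A x - b^T x with exact line search."""
    x = np.asarray(x0, dtype=float)
    prev_g = None
    for _ in range(num_iters):
        g = A @ x - b                       # gradient of the quadratic
        alpha = (g @ g) / (g @ (A @ g))     # closed-form exact step
        x = x - alpha * g
        if prev_g is not None:
            assert abs(prev_g @ g) < 1e-8   # successive gradients orthogonal
        prev_g = g
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])      # positive definite
b = np.array([1.0, 1.0])
print(exact_line_search_gd(A, b, [5.0, -3.0]))  # -> A^{-1} b = (0.2, 0.4)
```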
In the method of exact line search, we need to find the minimum point of a single-variable convex function $g$, or equivalently, a zero point of its derivative $g'$.
Now we introduce some methods to find the zero points. Note that $g'$ is increasing since $g$ is convex; hence if $g'(a) < 0 < g'(b)$, a zero lies in $[a, b]$, and the bisection method locates it by evaluating $g'$ at the midpoint and recursing on the half-interval where the sign change persists.
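A minimal bisection sketch under these assumptions (the bracketing interval must be supplied by the caller; the example derivative is illustrative):

```python
def bisect(h, lo, hi, tol=1e-10):
    """Find a zero of an increasing function h, assuming h(lo) < 0 < h(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid          # the zero lies in the right half
        else:
            hi = mid          # the zero lies in the left half
    return 0.5 * (lo + hi)

# Illustrative example: g(a) = (a - 2)^2 has g'(a) = 2(a - 2), zero at a = 2.
print(bisect(lambda a: 2 * (a - 2), 0.0, 10.0))  # -> ~2.0
```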
A better method is the so-called Newton's method. The idea is that given a value of $x_n$, we approximate $f$ near $x_n$ by its tangent line $f(x_n) + f'(x_n)(x - x_n)$, whose zero point $x_n - f(x_n)/f'(x_n)$ serves as a new, better estimate.
Newton's method applies this approximation iteratively. Namely, we choose an arbitrary initial point $x_0$ and iterate $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.
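A direct sketch of the iteration (the starting point, tolerance, and example function are illustrative choices):

```python
def newton(f, fprime, x0, tol=1e-12, max_iters=50):
    """Newton's method: repeatedly jump to the zero of the tangent line."""
    x = x0
    for _ in range(max_iters):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:   # stop once the update is negligible
            break
    return x

# Illustrative example: the positive zero of f(x) = x^2 - 2 is sqrt(2).
print(newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0))  # -> 1.41421...
```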
Given a positive real number $a$, consider computing $1/\sqrt{a}$.
Let $f(x) = \frac{1}{x^2} - a$, whose positive zero point is $1/\sqrt{a}$. Since $f'(x) = -2/x^3$, Newton's iteration simplifies to $x_{n+1} = x_n + \frac{x_n - a x_n^3}{2} = \frac{x_n (3 - a x_n^2)}{2}$, which requires no division at all.
The algorithm is best known for its implementation in 1999 in Quake III Arena, under the name fast inverse square root.
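A Python rendering of the trick, assuming 32-bit IEEE-754 floats: the magic constant 0x5f3759df produces the initial guess, and one step of the division-free iteration above refines it.

```python
import struct

def fast_inv_sqrt(a):
    """Approximate 1/sqrt(a) in the style of Quake III Arena."""
    i = struct.unpack('<I', struct.pack('<f', a))[0]  # float bits as integer
    i = 0x5f3759df - (i >> 1)                         # magic initial guess
    x = struct.unpack('<f', struct.pack('<I', i))[0]  # back to a float
    return x * (1.5 - 0.5 * a * x * x)                # one Newton step

print(fast_inv_sqrt(4.0))   # -> ~0.499, vs. the exact value 0.5
```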
Suppose $f$ is twice continuously differentiable with a zero point $r$ such that $f'(r) \neq 0$.

Let $e_n = x_n - r$ denote the error of the $n$-th iterate. Expanding $f$ around $x_n$ by Taylor's theorem, $0 = f(r) = f(x_n) + f'(x_n)(r - x_n) + \frac{1}{2} f''(\xi_n)(r - x_n)^2$ for some $\xi_n$ between $r$ and $x_n$, and substituting into the iteration gives $e_{n+1} = \frac{f''(\xi_n)}{2 f'(x_n)} e_n^2$.

If we know that $x_0$ is sufficiently close to $r$, say $M |e_0| < 1$ where $M$ bounds $\left|\frac{f''}{2 f'}\right|$ near $r$, then $M |e_{n+1}| \le (M |e_n|)^2$, so the error decays quadratically: the number of correct digits roughly doubles at every step.
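The doubling of correct digits is easy to observe numerically; a small demo on the illustrative function $f(x) = x^2 - 2$ prints errors that are roughly squared at every step:

```python
import math

x, r = 1.0, math.sqrt(2.0)
for n in range(5):
    print(n, abs(x - r))             # ~0.41, 0.086, 2.5e-3, 2.1e-6, 1.6e-12
    x = x - (x * x - 2) / (2 * x)    # Newton step for f(x) = x^2 - 2
```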
In general, it is expensive to compute the step size by exact line search, and we usually do not need the exact minimizer: it suffices to ensure that the value of the objective function decreases sufficiently at each step. So we consider the so-called backtracking line search method.
Armijo’s rule is a well-known and widely applied backtracking rule to update the step size. Given a descent direction $d_k$ (i.e., $\nabla f(x_k)^\top d_k < 0$) and parameters $c, \beta \in (0, 1)$, a candidate step size $\alpha$ is accepted if $f(x_k + \alpha d_k) \le f(x_k) + c\, \alpha\, \nabla f(x_k)^\top d_k$; otherwise it is shrunk to $\beta \alpha$ and the test is repeated.
In particular, if we set $d_k = -\nabla f(x_k)$, the condition reads $f(x_k - \alpha \nabla f(x_k)) \le f(x_k) - c\, \alpha\, \|\nabla f(x_k)\|^2$.
The intuition is that we know $f(x_k + \alpha d_k) \approx f(x_k) + \alpha \nabla f(x_k)^\top d_k$ for small $\alpha$ by the first-order Taylor expansion, so the rule accepts $\alpha$ once the actual decrease is at least a $c$-fraction of the decrease predicted by this linear approximation.
Armijo’s rule thus chooses the largest step size of the form $\beta^m \alpha_0$ ($m = 0, 1, 2, \dots$) satisfying the condition, which is exactly what the backtracking loop produces.
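A sketch of the rule (the parameter names follow the discussion above; the default values `alpha0 = 1`, `c = 0.5`, `beta = 0.5` are common illustrative choices, not prescribed here):

```python
import numpy as np

def backtracking(f, grad_fx, x, d, alpha0=1.0, c=0.5, beta=0.5):
    """Armijo backtracking: shrink alpha until sufficient decrease holds."""
    alpha = alpha0
    slope = grad_fx @ d        # directional derivative; negative for descent d
    while f(x + alpha * d) > f(x) + c * alpha * slope:
        alpha *= beta          # reject the candidate and shrink it
    return alpha

# Illustrative example: one steepest-descent step on f(x) = x1^2 + 4 x2^2.
f = lambda x: x[0] ** 2 + 4 * x[1] ** 2
g = lambda x: np.array([2 * x[0], 8 * x[1]])
x = np.array([4.0, 1.0])
alpha = backtracking(f, g(x), x, -g(x))
print(alpha, x - alpha * g(x))   # accepted step size and the updated point
```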
We first give a lower bound of the step size. Initially set $\alpha = \alpha_0$. If $f$ is $L$-smooth, then $f(x_k - \alpha \nabla f(x_k)) \le f(x_k) - \alpha \|\nabla f(x_k)\|^2 + \frac{L \alpha^2}{2} \|\nabla f(x_k)\|^2$, so the condition above is satisfied whenever $\alpha \le \frac{2(1-c)}{L}$. Hence the backtracking loop terminates with $\alpha_k \ge \min\left\{\alpha_0, \frac{2(1-c)\beta}{L}\right\}$.
Then, applying the lower bound of $\alpha_k$, each step decreases the objective by at least $c\, \alpha_k \|\nabla f(x_k)\|^2 \ge c \min\left\{\alpha_0, \frac{2(1-c)\beta}{L}\right\} \|\nabla f(x_k)\|^2$, and the convergence analysis proceeds just as in the fixed-step case.
Suppose $f$ is convex and $L$-smooth with a minimizer $x^*$. Plugging this per-step decrease into the fixed-step argument shows that gradient descent with Armijo backtracking satisfies $f(x_k) - f(x^*) = O(1/k)$, the same rate as with the fixed step size $1/L$.
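Putting the pieces together, a gradient descent loop driven by the `backtracking` sketch above (reusing `f`, `g`, and `np` from that example) converges without any knowledge of $L$:

```python
def gd_backtracking(f, grad, x0, num_iters=100):
    """Gradient descent where every step size comes from Armijo backtracking."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iters):
        g_x = grad(x)
        alpha = backtracking(f, g_x, x, -g_x)   # adaptive step size
        x = x - alpha * g_x
    return x

print(gd_backtracking(f, g, [4.0, 1.0]))  # -> near the minimizer (0, 0)
```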