We now study general convex optimization problems. First, we consider the easiest case: no constraints. Namely, the optimization problem is $\min_{x \in \mathbb{R}^n} f(x)$, where $f$ is convex.
Recall that the optimality condition for a differentiable convex function in the unconstrained case is $\nabla f(x^*) = 0$.
Suppose
For convenience, we assume that
Analogously to the simplex method, we would like to move from a solution to another solution with a smaller objective value, and repeat.
This inspires the so-called descent method: start from a solution $x^{(0)}$ and iteratively update $x^{(k+1)} = x^{(k)} + t_k d^{(k)}$, where $d^{(k)}$ is a direction along which $f$ decreases and $t_k > 0$ is the step size.
The first question is: when can we stop? Of course, the ideal stopping criterion is $\nabla f(x^{(k)}) = 0$, which certifies optimality by the condition above.
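As an illustration of this framework, here is a minimal sketch of a generic descent loop in Python; the names (`grad_f`, `direction`, `step_size`) and the practical tolerance `eps` (a numerical surrogate for the ideal criterion $\nabla f(x^{(k)}) = 0$) are illustrative assumptions rather than notation from these notes.

```python
import numpy as np

def descent_method(grad_f, direction, step_size, x0, eps=1e-6, max_iter=10_000):
    """Generic descent loop: x_{k+1} = x_k + t_k * d_k.

    grad_f    : callable returning the gradient of f at a point
    direction : callable mapping the current gradient to a descent direction d_k
                (it should satisfy grad_f(x) @ d_k < 0)
    step_size : callable returning the step size t_k > 0
    Stops once the gradient norm falls below eps, a practical stand-in
    for the ideal stopping criterion grad f(x) = 0.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:
            break
        d = direction(g)
        x = x + step_size(x, d) * d
    return x
```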
The next question is, does this algorithm converge to an optimal solution? In fact, we claim that if we assume that
We assume
Let
Rigorously, let
By our assumption
In fact, it is not necessary to define
Suppose
Then for all
For our setting, just select an optimal solution
We now consider a specific descent method, gradient descent, where we select $d^{(k)} = -\nabla f(x^{(k)})$.
An advantage of this choice is that $-\nabla f(x^{(k)})$ is a descent direction whenever $\nabla f(x^{(k)}) \neq 0$, since $\nabla f(x^{(k)})^{\top} d^{(k)} = -\|\nabla f(x^{(k)})\|^2 < 0$.
Applying this choice of direction, we obtain the gradient descent method: starting from $x^{(0)}$, repeatedly update $x^{(k+1)} = x^{(k)} - t_k \nabla f(x^{(k)})$ until the stopping criterion is met.
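The following is a minimal sketch of this iteration with a constant step size `t` (how to choose the step size is discussed next); the tolerance `eps` and the small test problem at the end are illustrative assumptions.

```python
import numpy as np

def gradient_descent(grad_f, x0, t=0.1, eps=1e-6, max_iter=10_000):
    """Gradient descent: x_{k+1} = x_k - t * grad f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:   # stop once the gradient (nearly) vanishes
            break
        x = x - t * g                  # move along d_k = -grad f(x_k)
    return x

# Example: f(x) = ||x||^2 / 2 has gradient x and unique minimizer 0.
print(gradient_descent(lambda x: x, x0=np.array([3.0, -4.0])))
```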
We now consider how to choose the step size $t_k$.
Let's start from an easy example:
Next, consider the multivariate function
Let
Since
Assume
Note that in this proof we do not really need
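Because the worked example itself is not reproduced above, the following sketch uses the illustrative scalar quadratic $f(x) = x^2$ (an assumption, not necessarily the example used in these notes) to show how the step size governs the behavior of the iteration: since $\nabla f(x) = 2x$, gradient descent gives $x^{(k+1)} = (1 - 2t)\,x^{(k)}$, which converges to $0$ exactly when $0 < t < 1$.

```python
# Gradient descent on the illustrative quadratic f(x) = x^2, grad f(x) = 2x:
# the iterate satisfies x_{k+1} = (1 - 2t) x_k, so it converges for 0 < t < 1
# and diverges once t > 1.
def iterate(t, x0=1.0, iters=20):
    x = x0
    for _ in range(iters):
        x = x - t * (2 * x)
    return x

for t in (0.1, 0.5, 0.9, 1.1):
    print(f"t = {t}: x_20 = {iterate(t): .3e}")
```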
However, for general functions, we cannot expect a universal condition for the step size.
Under which assumptions can we choose a constant step size?
We would like to avoid functions similar to
A function $f$ is called $L$-smooth if its gradient is Lipschitz continuous with constant $L$, that is, $\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|$ for all $x$ and $y$.
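As a quick illustration of the definition, for a quadratic $f(x) = \tfrac{1}{2} x^{\top} Q x$ with $Q$ symmetric positive semidefinite, the gradient $Qx$ is Lipschitz with constant $L = \lambda_{\max}(Q)$; the sketch below checks the inequality numerically on an arbitrarily generated $Q$ (the matrix and sample points are illustrative assumptions).

```python
import numpy as np

# f(x) = 0.5 * x^T Q x has gradient Q x, which is L-Lipschitz with
# L = lambda_max(Q). Verify ||grad f(x) - grad f(y)|| <= L * ||x - y||
# on random pairs of points.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q = A.T @ A                          # symmetric positive semidefinite
L = np.linalg.eigvalsh(Q).max()      # smoothness constant of f

for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    assert np.linalg.norm(Q @ x - Q @ y) <= L * np.linalg.norm(x - y) + 1e-9
```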
We usually use
An
Recall that we hope
A function
We use the notation
Suppose
Note that if
An
Suppose
Recall that
Recall that we hope to find a value of the step size that guarantees the objective decreases at every iteration.
For an $L$-smooth function $f$, the constant step size $t_k = 1/L$ achieves this: one can show that $f\big(x^{(k)} - \tfrac{1}{L}\nabla f(x^{(k)})\big) \le f(x^{(k)}) - \tfrac{1}{2L}\|\nabla f(x^{(k)})\|^2$.
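Here is a sketch of this choice on an illustrative least-squares objective $f(x) = \tfrac{1}{2}\|Ax - b\|^2$, which is $L$-smooth with $L = \lambda_{\max}(A^{\top}A)$; the data $A$, $b$ and the tolerance are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

# f(x) = 0.5 * ||Ax - b||^2 is L-smooth with L = lambda_max(A^T A).
L = np.linalg.eigvalsh(A.T @ A).max()

def grad(x):
    return A.T @ (A @ x - b)

x = np.zeros(10)
for _ in range(5000):
    g = grad(x)
    if np.linalg.norm(g) <= 1e-8:
        break
    x = x - g / L                    # constant step size t = 1/L

# Compare with the closed-form least-squares solution.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))    # should be essentially zero
```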