Comparisons Are All You Need for Optimizing Smooth Functions
李彤阳,4 月 13 日 16:00
When optimizing machine learning models, there are various scenarios where gradient computations are challenging or even infeasible. Furthermore, in reinforcement learning (RL), preference-based RL that only compares between options has wide applications, including reinforcement learning with human feedback in large language models. In this paper, we systematically study optimization of a smooth function \(f\colon\R^n\to\R\) only assuming an oracle that compares function values at two points and tells which is larger. When \(f\) is convex, we give two algorithms using \(\tilde{O}(n/\epsilon)\) and \(\tilde{O}(n^{2})\) comparison queries to find an \(\epsilon\)-optimal solution, respectively. When \(f\) is nonconvex, our algorithm uses \(\tilde{O}(n/\epsilon^2)\) comparison queries to find an \(\epsilon\)-approximate stationary point. All these results match the best-known zeroth-order algorithms with function evaluation queries in \(n\) dependence, thus suggest that comparisons are all you need for optimizing smooth functions using derivative-free methods. In addition, we also give an algorithm for escaping saddle points and reaching an \(\epsilon\)-second order stationary point of a nonconvex \(f\), using \(\tilde{O}(n^{1.5}/\epsilon^{2.5})\) comparison queries.