Abstract
In this paper, we generalize Nesterov's well-known accelerated gradient (AG) method, originally designed for convex smooth optimization, to solve nonconvex and possibly stochastic optimization problems. We show that, with a properly chosen stepsize policy, the AG method achieves the best known rate of convergence for solving general nonconvex smooth optimization problems using only first-order information, matching that of the gradient descent method. We then consider an important class of composite optimization problems and show that the AG method can solve them uniformly, i.e., with the same aggressive stepsize policy as in the convex case, even when the problem turns out to be nonconvex. We show that the AG method attains the optimal rate of convergence when the composite problem is convex, and improves on the best known rate when it is nonconvex. Based on the AG method, we also present new nonconvex stochastic approximation methods and show that they improve several existing rates of convergence for nonconvex stochastic optimization. To the best of our knowledge, this is the first time that the convergence of the AG method has been established for nonconvex nonlinear programming.
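For orientation, the following is a minimal sketch of the three-sequence AG template that the paper generalizes, written for a deterministic smooth objective with known gradient Lipschitz constant L. The names (`ag_method`, `grad`) and the concrete stepsizes \(\alpha_k = 2/(k+1)\), \(\beta_k = 1/(2L)\), \(\lambda_k = k\beta_k/2\) are illustrative assumptions for the convex smooth setting, not a verbatim transcription of the stepsize policies analyzed in the paper.

```python
import numpy as np

def ag_method(grad, x0, L, num_iters=100):
    """Three-sequence accelerated gradient template (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    x_ag = x.copy()
    for k in range(1, num_iters + 1):
        alpha = 2.0 / (k + 1)      # illustrative weights; the paper analyzes
        beta = 1.0 / (2.0 * L)     # several such policies, including ones
        lam = 0.5 * k * beta       # that also cover the nonconvex case
        x_md = (1.0 - alpha) * x_ag + alpha * x  # aggregated "middle" point
        g = grad(x_md)             # one gradient evaluation per iteration
        x = x - lam * g            # long step driving the x-sequence
        x_ag = x_md - beta * g     # short, gradient-descent-like step
    return x_ag

# Toy usage: minimize the smooth convex quadratic f(x) = 0.5*||A x - b||^2.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A.T @ A, 2)     # Lipschitz constant of grad_f
print(ag_method(grad_f, np.zeros(2), L, num_iters=200))  # approx. solves A x = b
```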
Notes
A function \(\varPsi (\cdot )\) satisfying (1.2) is denoted by \(\varPsi \in {\mathcal {C}}_{L_\varPsi }^{1,1}(\mathbb {R}^n)\).
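Condition (1.2) itself is not reproduced on this page; by the standard convention for this notation, \(\varPsi \in {\mathcal {C}}_{L_\varPsi }^{1,1}(\mathbb {R}^n)\) means that \(\varPsi\) is differentiable with \(L_\varPsi\)-Lipschitz gradient, i.e. (assuming the usual definition)

\[ \Vert \nabla \varPsi (x) - \nabla \varPsi (y)\Vert \le L_\varPsi \Vert x - y\Vert \quad \text{for all } x, y \in \mathbb {R}^n. \]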
Author information
Authors and Affiliations
Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, 32611, USA
Saeed Ghadimi & Guanghui Lan
Corresponding author
Correspondence to Guanghui Lan.
Additional information
This research was partially supported by NSF Grants CMMI-1000347, CMMI-1254446, DMS-1319050, and ONR Grant N00014-13-1-0036.
About this article
Cite this article
Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156, 59–99 (2016). https://doi.org/10.1007/s10107-015-0871-8